[mpich-devel] odd timings in type create/free (related to handle pool?)

William Gropp wgropp at illinois.edu
Sat Oct 4 18:15:32 CDT 2014


Jeff,

You are creating a different datatype with each call, because the loop passes “i” as the count. If you call the constructor with the same argument (e.g., “10” instead of “i”), the time per call is just about constant.  The time appears to be proportional to the size of the type, and I think it is due to the unconditional flattening of the struct representation, which is a known problem.  For this datatype, the flattened representation is inefficient, and it uses huge amounts of memory for large “i”.  Unfortunately, the current RMA code incorrectly assumes a flattened representation, so fixing this has turned out to be more involved than expected.

Bill

On Oct 3, 2014, at 5:36 PM, Jeff Hammond <jeff.science at gmail.com> wrote:

> I wanted to time how long it took to create a datatype.  Obviously,
> timing a series of calls is the normal way to get reasonable data.
> However, I find that my test shows that the time per call is in some
> way proportional to the number of calls in the series, even when I
> reuse the same handle.  I previously timed on a vector of handles and
> saw the same result.
> 
> I can only assume this is related to how MPICH does handle allocation
> internally.  Can you confirm?  Is there any way to get MPICH to
> garbage collect the internal handle pool so that the time per call
> goes back down again?  An increase from 4 us to 112 us per call is
> pretty substantial if I have a library that is going to use a lot of
> derived datatypes and has no reasonable way to cache them.
> 
> Thanks,
> 
> Jeff
> 
> OUTPUT
> 
> create, commit (free?) 100 Type_contig_x in 0.000393 s (3.926220 us per call)
> create, commit (free?) 1000 Type_contig_x in 0.006105 s (6.105444 us per call)
> create, commit (free?) 10000 Type_contig_x in 0.082496 s (8.249623 us per call)
> create, commit (free?) 25000 Type_contig_x in 0.341852 s (13.674085 us per call)
> create, commit (free?) 50000 Type_contig_x in 1.280882 s (25.617630 us per call)
> create, commit (free?) 100000 Type_contig_x in 4.565911 s (45.659108 us per call)
> create, commit (free?) 250000 Type_contig_x in 27.989672 s (111.958686 us per call)
> 
> 
> SOURCE
> 
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <limits.h>
> #include <math.h>
> #include <mpi.h>
> 
> /* there is a reason for this silliness */
> static volatile int bigmpi_int_max = 3;
> 
> int MPIX_Type_contiguous_x(MPI_Count count, MPI_Datatype oldtype,
> MPI_Datatype * newtype)
> {
>    MPI_Count c = count/bigmpi_int_max;
>    MPI_Count r = count%bigmpi_int_max;
> 
>    MPI_Datatype chunks;
>    MPI_Type_vector(c, bigmpi_int_max, bigmpi_int_max, oldtype, &chunks);
> 
>    MPI_Datatype remainder;
>    MPI_Type_contiguous(r, oldtype, &remainder);
> 
>    MPI_Aint lb /* unused */, extent;
>    MPI_Type_get_extent(oldtype, &lb, &extent);
> 
>    MPI_Aint remdisp          = (MPI_Aint)c*bigmpi_int_max*extent;
>    int blocklengths[2]       = {1,1};
>    MPI_Aint displacements[2] = {0,remdisp};
>    MPI_Datatype types[2]     = {chunks,remainder};
>    MPI_Type_create_struct(2, blocklengths, displacements, types, newtype);
> 
>    MPI_Type_free(&chunks);
>    MPI_Type_free(&remainder);
> 
>    return MPI_SUCCESS;
> }
> 
> int main(int argc, char* argv[])
> {
>    int rank=0, size=1;
>    MPI_Init(&argc, &argv);
>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>    MPI_Comm_size(MPI_COMM_WORLD, &size);
> 
>    int n = (argc>1) ? atoi(argv[1]) : 10000;
>    //MPI_Datatype * dtout = malloc(n*sizeof(MPI_Datatype));
>    MPI_Datatype dtout;
>    double t0 = MPI_Wtime();
>    for (int i=0; i<n; i++) {
>        //MPIX_Type_contiguous_x((MPI_Count)i, MPI_DOUBLE, &(dtout[i]));
>        //MPI_Type_commit(&(dtout[i]));
>        MPIX_Type_contiguous_x((MPI_Count)i, MPI_DOUBLE, &dtout);
>        MPI_Type_commit(&dtout);
>        MPI_Type_free(&dtout);
>    }
>    double t1 = MPI_Wtime();
>    double dt = t1-t0;
>    printf("create, commit (free?) %d Type_contig_x in %lf s (%lf us per call)\n",
>            n, dt, 1.e6*dt/(double)n);
> 
>    //for (int i=0; i<n; i++) {
>    //    MPI_Type_free(&(dtout[i]));
>    //}
>    //free(dtout);
> 
>    MPI_Finalize();
>    return 0;
> }
> 
> 
> -- 
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/


