[mpich-devel] odd timings in type create/free (related to handle pool?)

Jeff Hammond jeff.science at gmail.com
Sun Oct 5 19:50:19 CDT 2014


Hi Bill,

Indeed, you're right.  My benchmark unnecessarily beats on the
flattening code.  The intended use in BigMPI will not, because there
the chunk size is INT_MAX, and one needs quite a bit of DRAM before
more than 10-way flattening is ever required.
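(For scale: a single INT_MAX-sized chunk of MPI_DOUBLE is already about
16 GiB, so more than 10-way flattening implies on the order of 160 GiB
of data described by one datatype.)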

As for the flattening code, why not take a lazy approach and only
flatten datatypes upon their first use by RMA?  MPICH would need to be
able to see them in both representations, but at least you'd avoid
unnecessary flattening in the common case of two-sided communication,
and I doubt that RMA with user-defined datatypes would feel the pain
much anyway.  ARMCI-MPI is perhaps the world's largest consumer of
this feature, and it will avoid this issue except in some highly
unlikely (and perhaps strictly not-by-default) cases.
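
To make this concrete, here is a rough sketch of the lazy scheme I have
in mind; the type and function names are mine for illustration and are
not the actual MPICH internals:

#include <mpi.h>

/* Hypothetical per-datatype cache of the flattened form; the compact
 * (dataloop-style) representation is assumed to live elsewhere. */
typedef struct {
    int       is_flattened;   /* set once the flat form has been built */
    MPI_Aint *flat_disps;     /* flattened displacements, NULL until needed */
    MPI_Aint *flat_blklens;   /* flattened block lengths, NULL until needed */
    MPI_Aint  flat_count;     /* number of flattened (disp, blklen) pairs */
} dtype_rma_cache;

/* Called only from the RMA path, the first time a datatype is used
 * there; two-sided traffic never pays for the flattening. */
static void flatten_if_needed(dtype_rma_cache *dt)
{
    if (!dt->is_flattened) {
        /* build flat_disps/flat_blklens from the compact representation */
        dt->is_flattened = 1;
    }
}

Such a cache would hang off whatever MPICH already keeps per datatype,
and nothing gets built until the RMA path actually asks for it.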

Best,

Jeff

On Sat, Oct 4, 2014 at 4:15 PM, William Gropp <wgropp at illinois.edu> wrote:
> Jeff,
>
> You are creating different datatypes with each call - if you call the constructor with the same argument (e.g., “10” instead of “i”), the time is just about constant.  The time appears to be proportional to the size, and I think it is due to the unconditional flattening of the struct representation, which is a known problem.  For this datatype, the flattened representation is inefficient, and it uses huge amounts of memory for large “i”.  Unfortunately, the current RMA code incorrectly assumes a flattened representation, so fixing this has turned out to be more involved than expected.
>
> Bill
>
> On Oct 3, 2014, at 5:36 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>
>> I wanted to time how long it takes to create a datatype.  Obviously,
>> timing a series of calls is the normal way to get reasonable data.
>> However, my test shows that the time per call grows roughly in
>> proportion to the number of calls in the series, even when I reuse
>> the same handle.  I previously timed with a vector of handles (one
>> per call) and saw the same result.
>>
>> I can only assume this is related to how MPICH does handle allocation
>> internally.  Can you confirm?  Is there any way to get MPICH to
>> garbage collect the internal handle pool so that the time per call
>> goes back down again?  An increase from 4 us to 112 us per call is
>> pretty substantial if I have a library that is going to use a lot of
>> derived datatypes and has no reasonable way to cache them.
>>
>> Thanks,
>>
>> Jeff
>>
>> OUTPUT
>>
>> create, commit (free?) 100 Type_contig_x in 0.000393 s (3.926220 us per call)
>> create, commit (free?) 1000 Type_contig_x in 0.006105 s (6.105444 us per call)
>> create, commit (free?) 10000 Type_contig_x in 0.082496 s (8.249623 us per call)
>> create, commit (free?) 25000 Type_contig_x in 0.341852 s (13.674085 us per call)
>> create, commit (free?) 50000 Type_contig_x in 1.280882 s (25.617630 us per call)
>> create, commit (free?) 100000 Type_contig_x in 4.565911 s (45.659108 us per call)
>> create, commit (free?) 250000 Type_contig_x in 27.989672 s (111.958686 us per call)
>>
>>
>> SOURCE
>>
>> #include <stdint.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <limits.h>
>> #include <math.h>
>> #include <mpi.h>
>>
>> /* there is a reason for this silliness */
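>> /* (this is INT_MAX in real BigMPI; 3 here so that even small counts
>>    produce many chunks) */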
>> static volatile int bigmpi_int_max = 3;
>>
>> int MPIX_Type_contiguous_x(MPI_Count count, MPI_Datatype oldtype,
>> MPI_Datatype * newtype)
>> {
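>>    /* Split count into c full chunks of bigmpi_int_max elements plus a
>>       remainder of r elements, then combine the two pieces in a struct. */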
>>    MPI_Count c = count/bigmpi_int_max;
>>    MPI_Count r = count%bigmpi_int_max;
>>
>>    MPI_Datatype chunks;
>>    MPI_Type_vector(c, bigmpi_int_max, bigmpi_int_max, oldtype, &chunks);
>>
>>    MPI_Datatype remainder;
>>    MPI_Type_contiguous(r, oldtype, &remainder);
>>
>>    MPI_Aint lb /* unused */, extent;
>>    MPI_Type_get_extent(oldtype, &lb, &extent);
>>
>>    MPI_Aint remdisp          = (MPI_Aint)c*bigmpi_int_max*extent;
>>    int blocklengths[2]       = {1,1};
>>    MPI_Aint displacements[2] = {0,remdisp};
>>    MPI_Datatype types[2]     = {chunks,remainder};
>>    MPI_Type_create_struct(2, blocklengths, displacements, types, newtype);
>>
>>    MPI_Type_free(&chunks);
>>    MPI_Type_free(&remainder);
>>
>>    return MPI_SUCCESS;
>> }
>>
>> int main(int argc, char* argv[])
>> {
>>    int rank=0, size=1;
>>    MPI_Init(&argc, &argv);
>>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>    MPI_Comm_size(MPI_COMM_WORLD, &size);
>>
>>    int n = (argc>1) ? atoi(argv[1]) : 10000;
>>    //MPI_Datatype * dtout = malloc(n*sizeof(MPI_Datatype));
>>    MPI_Datatype dtout;
>>    double t0 = MPI_Wtime();
>>    for (int i=0; i<n; i++) {
>>        //MPIX_Type_contiguous_x((MPI_Count)i, MPI_DOUBLE, &(dtout[i]));
>>        //MPI_Type_commit(&(dtout[i]));
>>        MPIX_Type_contiguous_x((MPI_Count)i, MPI_DOUBLE, &dtout);
>>        MPI_Type_commit(&dtout);
>>        MPI_Type_free(&dtout);
>>    }
>>    double t1 = MPI_Wtime();
>>    double dt = t1-t0;
>>    printf("create, commit (free?) %d Type_contig_x in %lf s (%lf us per call)\n",
>>            n, dt, 1.e6*dt/(double)n);
>>
>>    //for (int i=0; i<n; i++) {
>>    //    MPI_Type_free(&(dtout[i]));
>>    //}
>>    //free(dtout);
>>
>>    MPI_Finalize();
>>    return 0;
>> }
>>
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/



-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/

