[mpich-devel] odd timings in type create/free (related to handle pool?)

Jeff Hammond jeff.science at gmail.com
Fri Oct 3 17:36:14 CDT 2014


I wanted to time how long it takes to create a datatype, so I timed a
series of calls and averaged.  However, my test shows that the time
per call grows with the number of calls in the series, even when I
reuse the same handle.  I previously ran the same test with a vector
of handles and saw the same result.

I can only assume this is related to how MPICH does handle allocation
internally.  Can you confirm?  Is there any way to get MPICH to
garbage collect the internal handle pool so that the time per call
goes back down again?  An increase from 4 us to 112 us per call is
pretty substantial if I have a library that is going to use a lot of
derived datatypes and has no reasonable way to cache them.

Thanks,

Jeff

OUTPUT

create, commit (free?) 100 Type_contig_x in 0.000393 s (3.926220 us per call)
create, commit (free?) 1000 Type_contig_x in 0.006105 s (6.105444 us per call)
create, commit (free?) 10000 Type_contig_x in 0.082496 s (8.249623 us per call)
create, commit (free?) 25000 Type_contig_x in 0.341852 s (13.674085 us per call)
create, commit (free?) 50000 Type_contig_x in 1.280882 s (25.617630 us per call)
create, commit (free?) 100000 Type_contig_x in 4.565911 s (45.659108 us per call)
create, commit (free?) 250000 Type_contig_x in 27.989672 s (111.958686 us per call)


SOURCE

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <math.h>
#include <mpi.h>

/* there is a reason for this silliness */
static volatile int bigmpi_int_max = 3;

int MPIX_Type_contiguous_x(MPI_Count count, MPI_Datatype oldtype,
MPI_Datatype * newtype)
{
    MPI_Count c = count/bigmpi_int_max;
    MPI_Count r = count%bigmpi_int_max;

    MPI_Datatype chunks;
    MPI_Type_vector(c, bigmpi_int_max, bigmpi_int_max, oldtype, &chunks);

    MPI_Datatype remainder;
    MPI_Type_contiguous(r, oldtype, &remainder);

    MPI_Aint lb /* unused */, extent;
    MPI_Type_get_extent(oldtype, &lb, &extent);

    MPI_Aint remdisp          = (MPI_Aint)c*bigmpi_int_max*extent;
    int blocklengths[2]       = {1,1};
    MPI_Aint displacements[2] = {0,remdisp};
    MPI_Datatype types[2]     = {chunks,remainder};
    MPI_Type_create_struct(2, blocklengths, displacements, types, newtype);

    MPI_Type_free(&chunks);
    MPI_Type_free(&remainder);

    return MPI_SUCCESS;
}

int main(int argc, char* argv[])
{
    int rank=0, size=1;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int n = (argc>1) ? atoi(argv[1]) : 10000;
    //MPI_Datatype * dtout = malloc(n*sizeof(MPI_Datatype));
    MPI_Datatype dtout;
    double t0 = MPI_Wtime();
    for (int i=0; i<n; i++) {
        //MPIX_Type_contiguous_x((MPI_Count)i, MPI_DOUBLE, &(dtout[i]));
        //MPI_Type_commit(&(dtout[i]));
        MPIX_Type_contiguous_x((MPI_Count)i, MPI_DOUBLE, &dtout);
        MPI_Type_commit(&dtout);
        MPI_Type_free(&dtout);
    }
    double t1 = MPI_Wtime();
    double dt = t1-t0;
    printf("create, commit (free?) %d Type_contig_x in %lf s (%lf us per call)\n",
            n, dt, 1.e6*dt/(double)n);

    //for (int i=0; i<n; i++) {
    //    MPI_Type_free(&(dtout[i]));
    //}
    //free(dtout);

    MPI_Finalize();
    return 0;
}


-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/

