[mpich-discuss] too many communicators

Jeff Hammond jhammond at alcf.anl.gov
Thu Dec 13 12:19:44 CST 2012


It's far from conclusive, but I went through the SuperLU_DIST source
and communicator creation and destruction look symmetric: the grep
below finds seven creation calls (one create, one dup, five split)
and seven MPI_Comm_free calls.

$ for f in create dup split free ; do grep MPI_Comm_$f *.c ; done
superlu_grid.c:    MPI_Comm_create( Bcomm, superlu_grp, &grid->comm );
psymbfact.c:  MPI_Comm_dup ((*symb_comm), &(commLvls[i]));
pdgssvx.c:	    MPI_Comm_split (grid->comm, col, key, &symb_comm );
psymbfact.c:    MPI_Comm_split ((*symb_comm), col, key, &(commLvls[ind]) );
pzgssvx.c:	    MPI_Comm_split (grid->comm, col, key, &symb_comm );
superlu_grid.c:    MPI_Comm_split(grid->comm, myrow, mycol, &(grid->rscp.comm));
superlu_grid.c:    MPI_Comm_split(grid->comm, mycol, myrow, &(grid->cscp.comm));
pdgssvx.c:	  MPI_Comm_free (&symb_comm);
psymbfact.c:  MPI_Comm_free (&(commLvls[i]));
psymbfact.c:    MPI_Comm_free ( &(commLvls[ind]) );
pzgssvx.c:	  MPI_Comm_free (&symb_comm);
superlu_grid.c:	MPI_Comm_free( &grid->rscp.comm );
superlu_grid.c:	MPI_Comm_free( &grid->cscp.comm );
superlu_grid.c:	MPI_Comm_free( &grid->comm );
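
To avoid matching these up by eye, here is a minimal sketch that tallies
the call sites per file. The function names count_calls and
tally_comm_calls are made up for illustration, and it assumes a grep that
supports -o and -E (GNU and BSD grep both do). It only counts call sites
in the source text, including any that sit inside comments, so it cannot
tell how many times each call actually runs.

```shell
#!/bin/sh
# Tally communicator-creating call sites against MPI_Comm_free call sites,
# one line of output per file. Static text count only, not a run-time count.

count_calls() {
    # $1 = extended regex, $2 = file; prints the number of matches (0 if none)
    grep -oE "$1" "$2" | wc -l | tr -d ' '
}

tally_comm_calls() {
    for f in "$@"; do
        # Requiring "(" after the name skips MPI_Comm_create_keyval and
        # MPI_Comm_free_keyval, which a plain substring grep would also match.
        made=$(count_calls 'MPI_Comm_(create|dup|split)[[:space:]]*\(' "$f")
        freed=$(count_calls 'MPI_Comm_free[[:space:]]*\(' "$f")
        if [ "$made" -eq "$freed" ]; then
            status=balanced
        else
            status=MISMATCH
        fi
        printf '%s: created=%s freed=%s %s\n' "$f" "$made" "$freed" "$status"
    done
}
```

After sourcing the functions, running "tally_comm_calls *.c" in the source
directory prints one line per file. A file reported as balanced can still
leak if a creation call sits in a loop and its free does not.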

Jeff

On Thu, Dec 13, 2012 at 9:40 AM, Jim Dinan <dinan at mcs.anl.gov> wrote:
> Hi Jack,
>
> It sounds like your application, or maybe the solver, is leaking
> communicators somewhere.  If you configure with --enable-g=handlealloc (or
> --enable-g=all), MPICH will issue warnings about leaked MPI objects when
> your program exits.
>
>  ~Jim.
>
>
> On 12/12/12 6:47 PM, Dave Goodell wrote:
>>
>> On Dec 13, 2012, at 5:23 AM GMT+09:00, Jack Lee wrote:
>>
>>> What I'd like to determine first is whether the fault is on my side (e.g.
>>> perhaps I'm not calling the clean-up routines properly). Is there a way to
>>> find out how many context IDs are in use at a given point?
>>
>>
>> If your MPICH was configured with "--enable-g=log" (or a superset thereof,
>> such as "=all"), then you can add the following prototype into your
>> application and invoke this function to get a string showing the current
>> bit-vector state on a given process:
>>
>> ----8<----
>> extern char *MPIR_ContextMaskToStr(void);
>> ----8<----
>>
>> See src/mpi/comm/commutil.c for more insight into how this works and what
>> it's actually telling you.  This wiki node also has some useful (and only
>> slightly stale) information:
>>
>> http://wiki.mpich.org/mpich/index.php/Communicators_and_Context_IDs
>>
>> I wish I had a better option for you, but without just rolling up your
>> sleeves and hacking on the MPICH code, this is about as good as you're
>> likely to get.
>>
>> The next best approach would probably be to trap all the communicator
>> creation/destruction calls that you can think of (see the MPI-2.2 standard,
>> chapter 6) via the PMPI_ profiling interface and log the invocations to look
>> for a mismatch.
>>
>> -Dave
>>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond


