[mpich-discuss] too many communicators

Jim Dinan dinan at mcs.anl.gov
Thu Dec 13 12:31:46 CST 2012


How many communicators are created by the loops containing these 
statements?  Could it be more than the limit?

 > psymbfact.c:  MPI_Comm_dup ((*symb_comm), &(commLvls[i]));
 > psymbfact.c:    MPI_Comm_split ((*symb_comm), col, key, &(commLvls[ind]) );

  ~Jim.

On 12/13/12 12:19 PM, Jeff Hammond wrote:
> It's far from conclusive, but I went through the SuperLU_dist source
> and it seems there is proper symmetry between communicator creation
> and destruction.
>
> $ for f in create dup split free ; do grep MPI_Comm_$f *.c ; done
> superlu_grid.c:    MPI_Comm_create( Bcomm, superlu_grp, &grid->comm );
> psymbfact.c:  MPI_Comm_dup ((*symb_comm), &(commLvls[i]));
> pdgssvx.c:	    MPI_Comm_split (grid->comm, col, key, &symb_comm );
> psymbfact.c:    MPI_Comm_split ((*symb_comm), col, key, &(commLvls[ind]) );
> pzgssvx.c:	    MPI_Comm_split (grid->comm, col, key, &symb_comm );
> superlu_grid.c:    MPI_Comm_split(grid->comm, myrow, mycol, &(grid->rscp.comm));
> superlu_grid.c:    MPI_Comm_split(grid->comm, mycol, myrow, &(grid->cscp.comm));
> pdgssvx.c:	  MPI_Comm_free (&symb_comm);
> psymbfact.c:  MPI_Comm_free (&(commLvls[i]));
> psymbfact.c:    MPI_Comm_free ( &(commLvls[ind]) );
> pzgssvx.c:	  MPI_Comm_free (&symb_comm);
> superlu_grid.c:	MPI_Comm_free( &grid->rscp.comm );
> superlu_grid.c:	MPI_Comm_free( &grid->cscp.comm );
> superlu_grid.c:	MPI_Comm_free( &grid->comm );
>
> Jeff
>
> On Thu, Dec 13, 2012 at 9:40 AM, Jim Dinan <dinan at mcs.anl.gov> wrote:
>> Hi Jack,
>>
>> It sounds like your application, or maybe the solver, is leaking
>> communicators somewhere.  If you configure MPICH with
>> --enable-g=handlealloc (or --enable-g=all), MPICH will issue warnings
>> about leaked MPI objects when your program exits.
>>
>>   ~Jim.
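Enabling these checks means rebuilding MPICH itself from a source tree; a typical sequence (the install prefix below is a placeholder) looks like:

```shell
# Reconfigure and rebuild MPICH with handle-allocation tracking.
# --prefix is a placeholder; point it wherever you keep debug builds.
./configure --prefix=$HOME/mpich-debug --enable-g=handlealloc
make && make install
# Relink the application against this build; on exit, MPICH prints a
# warning for every MPI object that was allocated but never freed.
```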
>>
>>
>> On 12/12/12 6:47 PM, Dave Goodell wrote:
>>>
>>> On Dec 13, 2012, at 5:23 AM GMT+09:00, Jack Lee wrote:
>>>
>>>> What I'd like to determine first is whether the fault is on my side (e.g.
>>>> perhaps I'm not calling the clean-up routines properly). Is there a way to
>>>> find out how many context IDs are in use at a given point?
>>>
>>>
>>> If your MPICH was configured with "--enable-g=log" (or a superset thereof,
>>> such as "=all"), then you can add the following prototype into your
>>> application and invoke this function to get a string showing the current
>>> bit-vector state on a given process:
>>>
>>> ----8<----
>>> static char *MPIR_ContextMaskToStr(void);
>>> ----8<----
>>>
>>> See src/mpi/comm/commutil.c for more insight into how this works and what
>>> it's actually telling you.  This wiki page also has some useful (and only
>>> slightly stale) information:
>>>
>>> http://wiki.mpich.org/mpich/index.php/Communicators_and_Context_IDs
>>>
>>> I wish I had a better option for you, but without just rolling up your
>>> sleeves and hacking on the MPICH code, this is about as good as you're
>>> likely to get.
>>>
>>> The next best approach would probably be to trap all the communicator
>>> creation/destruction calls that you can think of (see the MPI-2.2 standard,
>>> chapter 6) via the PMPI_ profiling interface and log the invocations to look
>>> for a mismatch.
>>>
>>> -Dave
>>>
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>