[mpich-discuss] [PATCH v2] test: add attrdeleteget, MPI_Attr_get called from delete_fn

Pavan Balaji balaji at mcs.anl.gov
Sat May 18 16:33:01 CDT 2013


On 05/18/2013 04:19 PM US Central Time, Fab Tillier wrote:
> Pavan Balaji wrote on Sat, 18 May 2013 at 14:02:17
> 
>> In MPICH, here's what we do:
>>
>> 1. Run the user callback.  If it returned an error, return that error to
>> the user.
> 
> So you don't actually free the attribute if the callback succeeds?

If the callback succeeds and the ref-count on the communicator is zero,
we free the attributes.

> What happens if the callback fails, but some preceding attributes'
> callbacks succeeded?  If you wanted to provide some level of fault
> tolerance (since attribute delete callbacks could be recoverable
> errors), what happens on a subsequent call to MPI_COMM_FREE?  Do the
> attributes that had their callback called successfully get a second
> delete call?

Whoops, I realized my previous list had a small typo.  I meant to say
that the ref-count for the attribute keyval is reduced on a delete (that
should not be guarded by the comm ref-count == 0).  Here's the corrected
version:

1. Run the user callback.  If it returned an error, return that error to
the user.

2. Decrement the attribute keyval ref-count.
   - If the ref-count has reached zero, free the keyval.

3. Decrement the communicator ref-count.
   - If the ref-count has reached zero, free the communicator.
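
As a rough illustration of the three steps above, here is a minimal C sketch. It is not MPICH's actual code: the types (sketch_keyval, sketch_comm), the function sketch_comm_free, and the error codes are all hypothetical stand-ins, and a single attribute per communicator is assumed for brevity. It also assumes that a failing callback leaves both ref-counts untouched, which is one reading of step 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not MPICH's internal types. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR = 1 };

typedef struct {
    int ref_count;  /* references held on the keyval */
    int freed;      /* set to 1 once the keyval would be freed */
} sketch_keyval;

typedef int (*sketch_delete_fn)(void *attr_val);

typedef struct {
    int ref_count;              /* references held on the communicator */
    int freed;                  /* set to 1 once the comm would be freed */
    sketch_keyval *keyval;      /* one attached attribute (assumption) */
    void *attr_val;
    sketch_delete_fn delete_fn; /* user delete callback */
} sketch_comm;

/* Example callbacks: one succeeds, one reports an error. */
int delete_ok(void *attr_val)   { (void)attr_val; return SKETCH_SUCCESS; }
int delete_fail(void *attr_val) { (void)attr_val; return SKETCH_ERR; }

int sketch_comm_free(sketch_comm *comm)
{
    if (comm->keyval != NULL) {
        /* 1. Run the user callback; on error, return it to the caller. */
        int rc = comm->delete_fn(comm->attr_val);
        if (rc != SKETCH_SUCCESS)
            return rc;

        /* 2. Decrement the keyval ref-count; free the keyval at zero.
         *    This is not guarded by the communicator's ref-count. */
        assert(comm->keyval->ref_count > 0);
        if (--comm->keyval->ref_count == 0)
            comm->keyval->freed = 1;
        comm->keyval = NULL;  /* the attribute is now detached */
    }

    /* 3. Decrement the communicator ref-count; free the comm at zero. */
    assert(comm->ref_count > 0);
    if (--comm->ref_count == 0)
        comm->freed = 1;

    return SKETCH_SUCCESS;
}
```

In this sketch, calling sketch_comm_free with delete_fail returns the error and changes nothing, while a later call with a succeeding callback drops one keyval reference and then the communicator reference.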

> It seems that once an attribute delete callback returns success, that
> attribute should be deleted (that is, you don't need to delay the
> deletion until the ref count of the communicator reaches zero).

Yes.

>> If the user did an MPI_BCAST, there's no problem since after the
>> callback, the communicator ref-count does not include the BCAST.  On the
>> other hand, if the user did an MPI_IBCAST, then the ref-count will not
>> touch zero, so the communicator is not actually deleted internally.
>>
>> The attributes might not actually get deleted if the ref-count didn't
>> reach zero, but you can't access them anyway, since the comm handle is
>> not valid.  They'll eventually get freed when the ref-count reaches zero.
> 
> Right, so why not just delete the attribute from the context of the
> MPI_COMM_FREE call, rather than delaying until the ref count reaches
> zero?  I didn't think attributes could affect internal MPI
> operations, so thought they could be freed early.

We do keep a ref-count on the attributes.  MPI_COMM_FREE decrements
that ref-count and might free the attribute at that point.
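
To make the MPI_BCAST versus MPI_IBCAST case quoted above concrete, here is a small C sketch of a communicator whose ref-count is also held by a pending nonblocking operation. The names (pending_comm, pending_start, pending_complete, pending_user_free) are hypothetical and only model the counting; they are not MPICH functions.

```c
#include <assert.h>

/* Hypothetical model of a communicator object; not an MPICH type. */
typedef struct {
    int ref_count;     /* 1 for the user's handle, +1 per pending operation */
    int handle_valid;  /* 0 once the user has called free */
    int destroyed;     /* 1 once the object would really be released */
} pending_comm;

/* Drop one reference; destroy the object when none remain. */
static void pending_release(pending_comm *c)
{
    assert(c->ref_count > 0);
    if (--c->ref_count == 0)
        c->destroyed = 1;
}

/* A nonblocking operation (think MPI_IBCAST) takes a reference. */
void pending_start(pending_comm *c)    { c->ref_count++; }

/* Completing the operation gives that reference back. */
void pending_complete(pending_comm *c) { pending_release(c); }

/* The user's free invalidates the handle and drops the user's reference. */
void pending_user_free(pending_comm *c)
{
    c->handle_valid = 0;
    pending_release(c);
}
```

With a blocking operation there is nothing left pending by the time the user frees the communicator, so the count reaches zero right away. With a pending operation the handle becomes invalid at once, but the object is only destroyed when the operation completes.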

 -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji


