[mpich-discuss] [PATCH v2] test: add attrdeleteget, MPI_Attr_get called from delete_fn
Pavan Balaji
balaji at mcs.anl.gov
Sat May 18 16:33:01 CDT 2013
On 05/18/2013 04:19 PM US Central Time, Fab Tillier wrote:
> Pavan Balaji wrote on Sat, 18 May 2013 at 14:02:17
>
>> In MPICH, here's what we do:
>>
>> 1. Run the user callback. If it returned an error, return that error to
>> the user.
>
> So you don't actually free the attribute if the callback succeeds?
If the callback succeeds and the ref-count on the communicator is zero,
we free the attributes.
> What happens if the callback fails, but some preceding attributes'
> callbacks succeeded? If you wanted to provide some level of fault
> tolerance (since attribute delete callbacks could be recoverable
> errors), what happens on a subsequent call to MPI_COMM_FREE? Do the
> attributes that had their callback called successfully get a second
> delete call?
Whoops, I realized my previous list had a small typo. I meant to say
that the ref-count for the attribute keyval is reduced on a delete (that
should not be guarded by the comm ref-count == 0). Here's the corrected
version:
1. Run the user callback. If it returned an error, return that error to
   the user.
2. Decrement the attribute keyval ref-count.
   - If the ref-count has reached zero, free the keyval.
3. Decrement the communicator ref-count.
   - If the ref-count has reached zero, free the communicator.
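
To make the ordering concrete, here is a rough sketch of those three steps as a toy model in plain C. It is not MPICH source: the toy_* types, functions, and error codes are all made up for illustration, "freeing" is modeled by setting a flag, and a communicator carries at most one attribute. Detaching the attribute once its callback succeeds is an assumption of the sketch, based on the "once the callback returns success, the attribute should be deleted" point below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in MPICH. */
enum { TOY_SUCCESS = 0, TOY_ERR_OTHER = 1 };

typedef struct toy_keyval {
    int ref_count;                    /* user keyval handle + attached attributes */
    int freed;                        /* stand-in for actually releasing memory */
    int (*delete_fn)(void *attr_val); /* user delete callback */
} toy_keyval;

typedef struct toy_comm {
    int ref_count;    /* user handle + pending nonblocking operations */
    int freed;        /* stand-in for actually releasing memory */
    toy_keyval *kv;   /* the one attached attribute's keyval, or NULL */
    void *attr_val;
} toy_comm;

/* Example callbacks: one that succeeds, one that reports an error. */
int toy_delete_ok(void *attr_val)   { (void)attr_val; return TOY_SUCCESS; }
int toy_delete_fail(void *attr_val) { (void)attr_val; return TOY_ERR_OTHER; }

/* Models the comm-free path described in the list above. */
int toy_comm_free(toy_comm *c)
{
    assert(c != NULL && c->ref_count > 0);

    if (c->kv != NULL) {
        /* Step 1: run the user callback; on error, return it unchanged. */
        int rc = c->kv->delete_fn(c->attr_val);
        if (rc != TOY_SUCCESS)
            return rc;

        /* Step 2: drop the keyval ref-count, regardless of the
         * communicator's ref-count; free the keyval at zero. */
        if (--c->kv->ref_count == 0)
            c->kv->freed = 1;

        /* Assumption: detach the attribute so that a later call
         * does not run the callback a second time. */
        c->kv = NULL;
    }

    /* Step 3: drop the communicator ref-count; free it at zero. */
    if (--c->ref_count == 0)
        c->freed = 1;
    return TOY_SUCCESS;
}

/* Models a pending nonblocking operation (e.g. an MPI_IBCAST)
 * completing and releasing its reference on the communicator. */
void toy_op_complete(toy_comm *c)
{
    if (--c->ref_count == 0)
        c->freed = 1;
}
```

With a communicator whose ref_count is 2 (user handle plus one pending operation), toy_comm_free() runs the callback and drops the keyval reference right away, but the communicator itself is only marked freed once toy_op_complete() is called. If the callback fails, neither ref-count is touched.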
> It seems that once an attribute delete callback returns success, that
> attribute should be deleted (that is, you don't need to delay the
> deletion until the ref count of the communicator reaches zero).
Yes.
>> If the user did a MPI_BCAST, there's no problem since after the
>> callback, the communicator ref-count does not include the BCAST. On the
>> other hand, if the user did an MPI_IBCAST, then the ref-count will not
>> touch zero, so the communicator is not actually deleted internally.
>>
>> The attributes might not actually get deleted if the ref-count didn't
>> reach zero, but you can't access them anyway, since the comm handle is
>> not valid. They'll eventually get freed when the ref-count reaches zero.
>
> Right, so why not just delete the attribute from the context of the
> MPI_COMM_FREE call, rather than delaying until the ref count reaches
> zero? I didn't think attributes could affect internal MPI
> operations, so thought they could be freed early.
We keep a ref-count on the attributes. MPI_COMM_FREE does decrement that
ref-count, and might free the attribute at that point.
-- Pavan
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji