[mpich-discuss] MCS lock and MPI RMA problem

Balaji, Pavan balaji at anl.gov
Tue Mar 14 16:54:54 CDT 2017


> On Mar 14, 2017, at 4:46 PM, Balaji, Pavan <balaji at anl.gov> wrote:
> Thanks.  That's a bug in your code.  In mcs-lock-fop.c:92, before you do an accumulate to notify the next rank, you need to reset your lmem[nextRank] back to -1.  Otherwise, in the next iteration, you'll think that the value is set even if it is not.  You can either do it using a local store followed by an MPI_WIN_SYNC or using MPI_Put or MPI_Accumulate.  After that fix your program seems to work correctly.

A small clarification: I think you noticed the reset issue in your previous email as well, but setting it before the acquire would be incorrect, for the reason I already pointed out.  Setting it before the release completes, however, would be correct because it's within the epoch and does not conflict with the MODE_NOCHECK hint (you are algorithmically guaranteeing that there's no conflicting access to it).

An additional small possible improvement:

You can move the lock_all to the init function instead of doing it for each lock acquisition.  That way you reduce the number of lock operations.  With the MODE_NOCHECK hint, the lock_all is essentially a no-op with respect to lock acquisition, but it still performs additional memory barriers, which can be avoided.

  -- Pavan


