[mpich-discuss] MCS lock and MPI RMA problem

Halim Amer aamer at anl.gov
Mon Mar 6 17:44:19 CST 2017


My understanding is that this code assumes asynchronous progress.
An example of when the processes hang is as follows:

1) P0 finishes MCSLockAcquire()
2) P1 is busy waiting in MCSLockAcquire() at
do {
       MPI_Win_sync(win);
    } while (lmem[blocked] == 1);
3) P0 executes MCSLockRelease()
4) P0 waits in MPI_Win_lock_all() inside MCSLockRelease()

Hang!
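The two blocking points in steps 2) and 4) can be sketched as follows (a paraphrase of the relevant lines of mcs-lock.c; the arguments to the compare-and-swap are elided, so this is illustrative rather than drop-in code):

```c
/* P1, in MCSLockAcquire(): spins on its exposed window memory.
 * MPI_Win_sync is only a memory barrier between the public and
 * private window copies; it need not make communication progress. */
do {
    MPI_Win_sync(win);
} while (lmem[blocked] == 1);   /* cleared only by P0's CAS below */

/* P0, in MCSLockRelease(): blocks in MPI_Win_lock_all() waiting
 * on P1, which cannot respond while it spins above, so the CAS
 * that would release P1 is never reached. */
MPI_Win_lock_all(0, win);
/* ... MPI_Compare_and_swap(...) that would clear lmem[blocked] ... */
```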

For P1 to get out of the loop, P0 has to get out of MPI_Win_lock_all() 
and execute its Compare_and_swap().

For P0 to get out of MPI_Win_lock_all(), it needs an ACK from P1 that it 
got the lock.

P1 does not make communication progress because MPI_Win_sync is not 
required to do so; it only synchronizes the private and public copies 
of the window.

For this hang to disappear, one can either trigger progress manually by 
using heavy-duty synchronization calls instead of Win_sync (e.g., 
Win_unlock_all + Win_lock_all), or enable asynchronous progress.
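As a sketch, the manual-progress workaround replaces the Win_sync spin in MCSLockAcquire with epoch cycling (illustrative only; variable names as in mcs-lock.c):

```c
/* Closing and reopening the passive-target epoch forces the MPI
 * progress engine to run on every iteration, so the releasing
 * process's Compare_and_swap can complete and clear the flag. */
do {
    MPI_Win_unlock_all(win);    /* heavy-duty: completes pending ops */
    MPI_Win_lock_all(0, win);   /* reopens the access epoch */
} while (lmem[blocked] == 1);
```

This is much more expensive per iteration than Win_sync, but each pass makes progress.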

To enable asynchronous progress in MPICH, set the 
MPIR_CVAR_ASYNC_PROGRESS env var to 1.
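For example (the binary name here is hypothetical):

```shell
# Enable MPICH's asynchronous progress thread for this run.
# Note that this spawns an extra thread per MPI process.
MPIR_CVAR_ASYNC_PROGRESS=1 mpiexec -n 4 ./mcs-lock-test
```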

Halim
www.mcs.anl.gov/~aamer

On 3/6/17 1:11 PM, Ask Jakobsen wrote:
>  I am testing on x86_64 platform.
>
> I have tried to build both the mpich and the mcs lock code with -O0 to
> avoid aggressive optimization. After your suggestion I have also tried to
> make a volatile int *pblocked point to lmem[blocked] in the MCSLockAcquire
> function and a volatile int *pnextrank point to lmem[nextRank] in
> MCSLockRelease, but it does not appear to make a difference.
>
> At the suggestion of Richard Warren I have also tried building the code using
> openmpi-2.0.2 without any luck (however, it appears to acquire the lock a
> couple of extra times before failing), which I find troubling.
>
> I think I will give up using local load/stores and will see if I can figure
> out how to rewrite it using MPI calls like MPI_Fetch_and_op as you suggest.
> Thanks for your help.
>
> On Mon, Mar 6, 2017 at 7:20 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>
>> What processor architecture are you testing?
>>
>> Maybe set lmem to volatile or read it with MPI_Fetch_and_op rather than a
>> load.  MPI_Win_sync cannot prevent the compiler from caching *lmem in a
>> register.
>>
>> Jeff
>>
>> On Sat, Mar 4, 2017 at 12:30 AM, Ask Jakobsen <afj at qeye-labs.com> wrote:
>>
>>> Hi,
>>>
>>> I have downloaded the source code for the MCS lock from the excellent
>>> book "Using Advanced MPI" from
>>> http://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples-advmpi/rma2/mcs-lock.c
>>>
>>> I have made a very simple piece of test code for testing the MCS lock but
>>> it works at random and often never escapes the busy loops in the acquire
>>> and release functions (see attached source code). The code appears
>>> semantically correct to my eyes.
>>>
>>> #include <stdio.h>
>>> #include <mpi.h>
>>> #include "mcs-lock.h"
>>>
>>> int main(int argc, char *argv[])
>>> {
>>>   MPI_Win win;
>>>   MPI_Init( &argc, &argv );
>>>
>>>   MCSLockInit(MPI_COMM_WORLD, &win);
>>>
>>>   int rank, size;
>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>   MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>
>>>   printf("rank: %d, size: %d\n", rank, size);
>>>
>>>
>>>   MCSLockAcquire(win);
>>>   printf("rank %d acquired lock\n", rank);   fflush(stdout);
>>>   MCSLockRelease(win);
>>>
>>>
>>>   MPI_Win_free(&win);
>>>   MPI_Finalize();
>>>   return 0;
>>> }
>>>
>>>
>>> I have tested on several hardware platforms and mpich-3.2 and mpich-3.3a2
>>> but with no luck.
>>>
>>> It appears that the MPI_Win_sync calls are not "refreshing" the local data,
>>> or I have a bug I can't spot.
>>>
>>> A simple unfair lock like
>>> http://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples-advmpi/rma2/ga_mutex1.c
>>> works perfectly.
>>>
>>> Best regards, Ask Jakobsen
>>>
>>>
>>
>>
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/
>>
>
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

