[mpich-discuss] MCS lock and MPI RMA problem
Halim Amer
aamer at anl.gov
Tue Mar 7 10:29:44 CST 2017
In the Release protocol, try removing this test:
if (lmem[nextRank] == -1) {
If-Block;
}
but keep the If-Block.
The hang occurs because the process releasing the MCS lock fails to
detect that another process is being enqueued, or has already been
enqueued, in the MCS queue.
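
For illustration, the release path with that test removed would look
roughly as follows. This is only a sketch of the suggestion, using the
variable and displacement names from the book's mcs-lock.c (lmem,
nextRank, blocked, lockTail, myrank, nullrank, zero, curtail); it is not
a verified fix:

  MPI_Win_lock_all(0, win);
  /* If-Block, now executed unconditionally */
  MPI_Compare_and_swap(&nullrank, &myrank, &curtail, MPI_INT,
                       0, lockTail, win);
  MPI_Win_flush(0, win);           /* make sure curtail is valid */
  if (curtail == myrank) {
      /* We were the only process in the queue; nothing to hand over */
      MPI_Win_unlock_all(win);
      return;
  }
  /* Another process is being, or has already been, enqueued: wait
     until it has published its rank in our nextRank slot */
  do {
      MPI_Win_sync(win);
  } while (lmem[nextRank] == -1);
  /* Wake the successor by atomically clearing its blocked flag */
  MPI_Accumulate(&zero, 1, MPI_INT, lmem[nextRank], blocked,
                 1, MPI_INT, MPI_REPLACE, win);
  MPI_Win_unlock_all(win);
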
Halim
www.mcs.anl.gov/~aamer
On 3/7/17 6:43 AM, Ask Jakobsen wrote:
> Thanks, Halim. I have now enabled asynchronous progress in MPICH (I can't
> find anything similar in openmpi), and now all ranks acquire the lock and
> the program finishes as expected. However, if I put a while(1) loop around
> the acquire-release code in main.c, it fails again at random and goes into
> an infinite loop. The simple unfair lock does not have this problem.
>
> On Tue, Mar 7, 2017 at 12:44 AM, Halim Amer <aamer at anl.gov> wrote:
>
>> My understanding is that this code assumes asynchronous progress.
>> An example of when the processes hang is as follows:
>>
>> 1) P0 Finishes MCSLockAcquire()
>> 2) P1 is busy waiting in MCSLockAcquire() at
>> do {
>> MPI_Win_sync(win);
>> } while (lmem[blocked] == 1);
>> 3) P0 executes MCSLockRelease()
>> 4) P0 waits on MPI_Win_lock_all() inside MCSLockRelease()
>>
>> Hang!
>>
>> For P1 to get out of the loop, P0 has to get out of MPI_Win_lock_all() and
>> execute its Compare_and_swap().
>>
>> For P0 to get out of MPI_Win_lock_all(), it needs an ACK from P1 that it
>> got the lock.
>>
>> P1 does not make communication progress because MPI_Win_sync is not
>> required to do so; it only synchronizes the private and public copies of
>> the window memory.
>>
>> For this hang to disappear, one can either trigger progress manually by
>> using heavy-duty synchronization calls instead of Win_sync (e.g.,
>> Win_unlock_all + Win_lock_all), or enable asynchronous progress.
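>>
>> A rough sketch of the first option, applied to the busy-wait loop in
>> MCSLockAcquire (variable names as in mcs-lock.c); the unlock/lock pair
>> forces the implementation to make progress on every iteration, at the
>> cost of much heavier synchronization:
>>
>> do {
>>     MPI_Win_unlock_all(win);   /* complete the epoch and drive progress */
>>     MPI_Win_lock_all(0, win);  /* re-open the passive-target epoch */
>> } while (lmem[blocked] == 1);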
>>
>> To enable asynchronous progress in MPICH, set the MPIR_CVAR_ASYNC_PROGRESS
>> env var to 1.
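>>
>> For example (the executable name is only a placeholder, and note that the
>> extra progress threads consume CPU cycles of their own):
>>
>> MPIR_CVAR_ASYNC_PROGRESS=1 mpiexec -n 4 ./mcs_test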
>>
>> Halim
>> www.mcs.anl.gov/~aamer
>>
>>
>> On 3/6/17 1:11 PM, Ask Jakobsen wrote:
>>
>>> I am testing on an x86_64 platform.
>>>
>>> I have tried to build both mpich and the MCS lock code with -O0 to avoid
>>> aggressive optimization. After your suggestion I have also tried using a
>>> volatile int *pblocked pointing to lmem[blocked] in MCSLockAcquire and a
>>> volatile int *pnextrank pointing to lmem[nextRank] in MCSLockRelease, but
>>> it does not appear to make a difference.
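>>>
>>> (Roughly, the change in the acquire-side wait loop looked like this:)
>>>
>>> volatile int *pblocked = &lmem[blocked];  /* force a real load each time */
>>> do {
>>>     MPI_Win_sync(win);
>>> } while (*pblocked == 1);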
>>>
>>> On a suggestion from Richard Warren I have also tried building the code
>>> with openmpi-2.0.2, without any luck (although it appears to acquire the
>>> lock a couple of extra times before failing), which I find troubling.
>>>
>>> I think I will give up on local loads/stores and will see if I can figure
>>> out how to rewrite it using MPI calls like MPI_Fetch_and_op, as you
>>> suggest. Thanks for your help.
>>>
>>> On Mon, Mar 6, 2017 at 7:20 PM, Jeff Hammond <jeff.science at gmail.com>
>>> wrote:
>>>
>>>> What processor architecture are you testing?
>>>>
>>>> Maybe set lmem to volatile or read it with MPI_Fetch_and_op rather than a
>>>> load. MPI_Win_sync cannot prevent the compiler from caching *lmem in a
>>>> register.
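>>>>
>>>> Roughly, an atomic read of the local flag would look like this (myrank
>>>> and the blocked displacement are assumed to be the ones from mcs-lock.c):
>>>>
>>>> int dummy = 0, flag;
>>>> do {
>>>>     /* Read lmem[blocked] on our own rank through MPI instead of a plain
>>>>        load, so the value cannot be cached in a register */
>>>>     MPI_Fetch_and_op(&dummy, &flag, MPI_INT, myrank, blocked,
>>>>                      MPI_NO_OP, win);
>>>>     MPI_Win_flush(myrank, win);
>>>> } while (flag == 1);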
>>>>
>>>> Jeff
>>>>
>>>> On Sat, Mar 4, 2017 at 12:30 AM, Ask Jakobsen <afj at qeye-labs.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have downloaded the source code for the MCS lock from the excellent
>>>>> book "Using Advanced MPI" from
>>>>> http://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples-advmpi/rma2/mcs-lock.c
>>>>>
>>>>> I have made a very simple piece of test code for testing the MCS lock,
>>>>> but it only works intermittently and often never escapes the busy loops
>>>>> in the acquire and release functions (see attached source code). The
>>>>> code appears semantically correct to my eyes.
>>>>>
>>>>> #include <stdio.h>
>>>>> #include <mpi.h>
>>>>> #include "mcs-lock.h"
>>>>>
>>>>> int main(int argc, char *argv[])
>>>>> {
>>>>>   MPI_Win win;
>>>>>   MPI_Init( &argc, &argv );
>>>>>
>>>>>   MCSLockInit(MPI_COMM_WORLD, &win);
>>>>>
>>>>>   int rank, size;
>>>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>>   MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>>
>>>>>   printf("rank: %d, size: %d\n", rank, size);
>>>>>
>>>>>   MCSLockAcquire(win);
>>>>>   printf("rank %d acquired lock\n", rank); fflush(stdout);
>>>>>   MCSLockRelease(win);
>>>>>
>>>>>   MPI_Win_free(&win);
>>>>>   MPI_Finalize();
>>>>>   return 0;
>>>>> }
>>>>>
>>>>>
>>>>> I have tested on several hardware platforms and with mpich-3.2 and
>>>>> mpich-3.3a2, but with no luck.
>>>>>
>>>>> It appears that the MPI_Win_sync calls are not "refreshing" the local
>>>>> data, or I have a bug I can't spot.
>>>>>
>>>>> A simple unfair lock like
>>>>> http://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples-advmpi/rma2/ga_mutex1.c
>>>>> works perfectly.
>>>>>
>>>>> Best regards, Ask Jakobsen
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Jeff Hammond
>>>> jeff.science at gmail.com
>>>> http://jeffhammond.github.io/
>>>>
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss