[mpich-discuss] RMA and multithreading (MPI_THREAD_MULTIPLE)

Jeff Hammond jeff.science at gmail.com
Thu Jun 6 15:18:29 CDT 2019


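Your threads are trying to lock a window that is already locked, or unlock a
window that is not locked.  That's because you are using one window with
multiple threads: the lock state (the access epoch) belongs to the process
and the window, not to the individual thread.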

If you want to use multiple threads like this, do one of two things:
1) Use a window per thread.
2) Lock and unlock the window only once (immediately after construction and
immediately before destruction, respectively), and synchronize inside the
threads with MPI_Win_flush or MPI_Win_flush_all.

Jeff

On Thu, Jun 6, 2019 at 12:30 PM Alexey Paznikov via discuss <
discuss at mpich.org> wrote:

> Hi,
>
> I have a multithreaded MPI program (MPI_THREAD_MULTIPLE mode) with RMA
> calls. This is a simplified example:
>
> #include <stdio.h>
> #include <mpi.h>
> #include <pthread.h>
>
> MPI_Win win;
> pthread_mutex_t lock;
>
> void *thread(void *arg)
> {
>     MPI_Win_lock_all(0, win);
>     MPI_Win_unlock_all(win);
>     return NULL;
> }
>
> int main(int argc, char *argv[])
> {
>     int provided; /* MPI_Init_thread requires a valid output pointer */
>     MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>     pthread_mutex_init(&lock, NULL);
>
>     int *buf = NULL;
>     const int bufsize = 1;
>     MPI_Alloc_mem(bufsize, MPI_INFO_NULL, &buf);
>     MPI_Win_create(buf, 1, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>
>     pthread_t tid;
>     pthread_create(&tid, NULL, thread, NULL);
>
>     MPI_Win_lock_all(0, win);
>     MPI_Win_unlock_all(win);
>
>     pthread_join(tid, NULL);
>
>     MPI_Win_free(&win);
>
>     MPI_Finalize();
>
>     return 0;
> }
>
> If I run this program, it crashes with an error message:
>
> Fatal error in PMPI_Win_lock_all: Wrong synchronization of RMA calls ,
> error stack:
> PMPI_Win_lock_all(149).: MPI_Win_lock_all(assert=0, win=0xa0000000) failed
> MPID_Win_lock_all(1522): Wrong synchronization of RMA calls
>
> If I replace MPI_Win_lock_all with MPI_Win_lock the problem remains:
>
> Fatal error in PMPI_Win_lock: Wrong synchronization of RMA calls , error
> stack:
> PMPI_Win_lock(157).: MPI_Win_lock(lock_type=234, rank=0, assert=0,
> win=0xa0000000) failed
> MPID_Win_lock(1163): Wrong synchronization of RMA calls
>
> (this message is repeated many times)
>
> If I protect RMA operations with a mutex, the problem disappears.
>
> I am using MPICH 3.3. A similar problem also occurs with MVAPICH2 2.3.1.
>
> Are the RMA operations not thread safe at the moment? Could you tell me
> how to deal with this problem?


-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/