[mpich-discuss] Dataloop error message

Halim Amer aamer at anl.gov
Tue Mar 7 13:35:25 CST 2017


Can you attach a simple reproducer?

Halim
www.mcs.anl.gov/~aamer

On 3/7/17 1:31 PM, Palmer, Bruce J wrote:
> Hi,
>
> I'm trying to track down a possible race condition in a test program that uses MPI RMA from MPICH 3.2. The program repeats a series of put/get/accumulate operations to different processors. When I run on 1 node, 4 processors, everything is fine, but when I move to 2 nodes, 4 processors, I start getting failures. The error messages I'm seeing are
>
> Assertion failed in file src/mpid/common/datatype/dataloop/dataloop.c at line 265: 0
>
> and
>
> Assertion failed in file src/mpid/common/datatype/dataloop/dataloop.c at line 157: dataloop->loop_params.cm_t.dataloop
>
> Does anyone have a handle on what these routines do and what kind of behavior generates these errors? The test program allocates memory and uses it to create a window, followed immediately by a call to MPI_Win_lock_all to start a passive synchronization epoch. I've been using request-based RMA calls (Rput, Rget, Raccumulate), each followed immediately by a call to MPI_Wait for that individual RMA operation. Any suggestions about what these errors are telling me? If I start putting in print statements to narrow down the location of the error, the code runs to completion.
>
> Bruce Palmer
>
>
>
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>