[mpich-discuss] Dataloop error message

Palmer, Bruce J Bruce.Palmer at pnnl.gov
Tue Mar 7 13:31:26 CST 2017


Hi,

I'm trying to track down a possible race condition in a test program that uses MPI RMA with MPICH 3.2. The program repeats a series of put/get/accumulate operations to different processors. When I run on 1 node with 4 processors everything is fine, but when I move to 2 nodes with 4 processors I start getting failures. The error messages I'm seeing are

Assertion failed in file src/mpid/common/datatype/dataloop/dataloop.c at line 265: 0

and

Assertion failed in file src/mpid/common/datatype/dataloop/dataloop.c at line 157: dataloop->loop_params.cm_t.dataloop

Does anyone have a handle on what these routines do and what kind of behavior generates these errors? The test program allocates memory and uses it to create a window, followed immediately by a call to MPI_Win_lock_all to start a passive-target synchronization epoch. I've been using request-based RMA calls (Rput, Rget, Raccumulate), each followed immediately by a call to MPI_Wait for that individual RMA operation. Any suggestions about what these errors are telling me? If I put in print statements to narrow down the location of the error, the code runs to completion.

Bruce Palmer