[mpich-devel] Incorrect semantics in test/mpi/rma/reqops.c
Nathan Hjelm
hjelmn at lanl.gov
Tue Feb 25 17:03:12 CST 2014
Hello, I am finishing MPI-3 RMA support in Open MPI and I am using the
MPICH test suite to test the new functionality before release. I think
I found a bug in one of your tests. The reqops test does the following
in one of its loops:
    for (i = 0; i < ITER; i++) {
        MPI_Request req;
        int val = -1, exp = -1;
        /* Processes form a ring.  Process 0 starts first, then passes a token
         * to the right.  Each process, in turn, performs third-party
         * communication via process 0's window. */
        if (rank > 0) {
            MPI_Recv(NULL, 0, MPI_BYTE, rank-1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
        MPI_Rget(&val, 1, MPI_INT, 0, 0, 1, MPI_INT, window, &req);
        assert(req != MPI_REQUEST_NULL);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        MPI_Rput(&rank, 1, MPI_INT, 0, 0, 1, MPI_INT, window, &req);
        assert(req != MPI_REQUEST_NULL);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        exp = (rank + nproc-1) % nproc;
        if (val != exp) {
            printf("%d - Got %d, expected %d\n", rank, val, exp);
            errors++;
        }
        if (rank < nproc-1) {
            MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }
The problem is that no call is made to ensure the RMA operation has
completed at the target, as required by MPI-3, page 432, lines 13-17:
"The completion of an MPI_RPUT operation (i.e., after the corresponding
test or wait) indicates that the sender is now free to update the
locations in the origin buffer. It does not indicate that the data is
available at the target window. If remote completion is required,
MPI_WIN_FLUSH, MPI_WIN_FLUSH_ALL, MPI_WIN_UNLOCK, or MPI_WIN_UNLOCK_ALL
can be used."
Not ensuring remote completion may cause processes further down the ring
to read a value written by a process other than the previous process in
the ring.
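To make the failure mode concrete, here is a small stand-alone sketch in plain C. It is not MPI code and not part of the test; every name in it (model_window, model_rput_wait, model_flush, model_rget_wait, model_ring_errors) is made up for illustration. It assumes the worst case the standard allows, where a put that has only completed locally never lands at the target until a flush, and it assumes the window starts out holding nproc-1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not MPI: one int of window memory at the target,
 * plus at most one put that is complete locally but not yet remotely. */
typedef struct {
    int  window_value;   /* what a get at the target currently returns */
    int  pending_value;  /* payload of a put that is still in flight */
    bool has_pending;    /* true while that put has not landed */
} model_window;

/* Models MPI_Rput + MPI_Wait: the origin buffer is reusable afterwards,
 * but the value has not necessarily reached the target window. */
void model_rput_wait(model_window *w, int value)
{
    w->pending_value = value;
    w->has_pending = true;
}

/* Models MPI_Win_flush: forces any in-flight put to land at the target. */
void model_flush(model_window *w)
{
    if (w->has_pending) {
        w->window_value = w->pending_value;
        w->has_pending = false;
    }
}

/* Models MPI_Rget + MPI_Wait: returns what is visible at the target now. */
int model_rget_wait(const model_window *w)
{
    return w->window_value;
}

/* Runs one pass around a ring of nproc ranks, in token order, and returns
 * how many ranks read something other than their predecessor's rank. */
int model_ring_errors(int nproc, bool flush_before_send)
{
    assert(nproc > 0);
    /* Assumption: the window initially holds nproc-1, so rank 0 passes. */
    model_window w = { nproc - 1, 0, false };
    int errors = 0;

    for (int rank = 0; rank < nproc; rank++) {
        int val = model_rget_wait(&w);
        int exp = (rank + nproc - 1) % nproc;

        model_rput_wait(&w, rank);
        if (val != exp) {
            errors++;
        }
        if (flush_before_send) {
            model_flush(&w);   /* remote completion before the token moves */
        }
        /* The token would be sent to rank+1 at this point. */
    }
    return errors;
}
```

In this model, model_ring_errors(4, false) counts three mismatches (every rank after 0 still sees the initial value), while model_ring_errors(4, true) counts none. A real implementation may deliver the put earlier, which is why the unpatched test can pass on some MPI libraries.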
The fix is to call MPI_Win_flush(0, window) before the MPI_Send in each
of the three affected loops. I can confirm that this fixes the issue with
Open MPI. Please see the attached patch.
-Nathan Hjelm
HPC-5, LANL
-------------- next part --------------
diff --git a/test/mpi/rma/reqops.c b/test/mpi/rma/reqops.c
index ef2636f..f4509ca 100644
--- a/test/mpi/rma/reqops.c
+++ b/test/mpi/rma/reqops.c
@@ -86,6 +86,7 @@ int main( int argc, char *argv[] )
             errors++;
         }
+        MPI_Win_flush(0, window);
         if (rank < nproc-1) {
             MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
         }
@@ -125,6 +126,7 @@ int main( int argc, char *argv[] )
             errors++;
         }
+        MPI_Win_flush(0, window);
         if (rank < nproc-1) {
             MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
         }
@@ -164,6 +166,7 @@ int main( int argc, char *argv[] )
             errors++;
         }
+        MPI_Win_flush(0, window);
         if (rank < nproc-1) {
             MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
         }