[mpich-devel] Incorrect semantics in test/mpi/rma/reqops.c

Jim Dinan james.dinan at gmail.com
Tue Feb 25 20:20:10 CST 2014


Yep, that's a bug.  Patch looks like the right fix to me.

 ~Jim.


On Tue, Feb 25, 2014 at 6:03 PM, Nathan Hjelm <hjelmn at lanl.gov> wrote:

> Hello, I am finishing MPI-3 RMA support in Open MPI and I am using the
> MPICH test suite to test the new functionality before release. I think
> I found a bug in one of your tests. The reqops test does the following
> in one of its loops:
>
>     for (i = 0; i < ITER; i++) {
>         MPI_Request req;
>         int val = -1, exp = -1;
>
>         /* Processes form a ring.  Process 0 starts first, then passes a
>          * token to the right.  Each process, in turn, performs third-party
>          * communication via process 0's window. */
>         if (rank > 0) {
>             MPI_Recv(NULL, 0, MPI_BYTE, rank-1, 0, MPI_COMM_WORLD,
>                      MPI_STATUS_IGNORE);
>         }
>
>         MPI_Rget(&val, 1, MPI_INT, 0, 0, 1, MPI_INT, window, &req);
>         assert(req != MPI_REQUEST_NULL);
>         MPI_Wait(&req, MPI_STATUS_IGNORE);
>
>         MPI_Rput(&rank, 1, MPI_INT, 0, 0, 1, MPI_INT, window, &req);
>         assert(req != MPI_REQUEST_NULL);
>         MPI_Wait(&req, MPI_STATUS_IGNORE);
>
>         exp = (rank + nproc-1) % nproc;
>
>         if (val != exp) {
>             printf("%d - Got %d, expected %d\n", rank, val, exp);
>             errors++;
>         }
>
>         if (rank < nproc-1) {
>             MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
>         }
>
>         MPI_Barrier(MPI_COMM_WORLD);
>     }
>
>
> The problem is that no call is made to ensure the RMA operation is
> complete at the target, as required by MPI-3 (p. 432, lines 13-17):
>
> "The completion of an MPI_RPUT operation (i.e., after the corresponding
> test or wait) indicates that the sender is now free to update the
> locations in the origin buffer. It does not indicate that the data is
> available at the target window. If remote completion is required,
> MPI_WIN_FLUSH, MPI_WIN_FLUSH_ALL, MPI_WIN_UNLOCK, or MPI_WIN_UNLOCK_ALL
> can be used."
>
>
> Not ensuring remote completion may cause processes further down the ring
> to read a value written by a process other than the previous process in
> the ring.
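>
> To make the distinction concrete, here is a minimal sketch (mine, not
> part of the test) of what each call guarantees within the passive-target
> epoch the test already holds on the window:
>
>     MPI_Rput(&rank, 1, MPI_INT, 0, 0, 1, MPI_INT, window, &req);
>     MPI_Wait(&req, MPI_STATUS_IGNORE);  /* local completion: origin buffer
>                                          * may be reused */
>     MPI_Win_flush(0, window);           /* remote completion: value is now
>                                          * visible at rank 0 */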
>
> The fix is to add MPI_Win_flush(0, window) before the MPI_Send. I can
> confirm that this fixes the issue with Open MPI. Please see the attached
> patch.
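>
> For reference, here is a sketch of where the flush would go in the loop
> above (the attached patch may differ in its details):
>
>         /* Complete the MPI_Rput at the target before handing the token
>          * to the next process in the ring. */
>         MPI_Win_flush(0, window);
>
>         if (rank < nproc-1) {
>             MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
>         }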
>
>
> -Nathan Hjelm
> HPC-5, LANL
>
>