[mpich-devel] Incorrect semantics in test/mpi/rma/reqops.c

Jeff Hammond jeff.science at gmail.com
Tue Feb 25 20:40:37 CST 2014


Hi Nathan,

I'd strongly encourage you to run ARMCI-MPI's test suite when you're
done, if for no other reason than that I will do so the day you announce
Open MPI supports MPI-3 RMA and blast you with bug reports if anything
fails :-D

See http://wiki.mpich.org/armci-mpi/index.php/Main_Page for details.
Note that you need to explicitly check out the mpi3rma branch to touch
those features.  Feel free to annoy the heck out of me if there's
anything you don't like about ARMCI-MPI.  I certainly deserve intense
scrutiny if I'm going to threaten you with bug reports :-D

In the event that you're doing MPI-3 RMA in Open MPI sans datatype
support (i.e., the same as Open MPI's MPI-2 RMA), ARMCI-MPI can handle
this.  The relevant env vars to disable the datatype code path are
documented in an obviously-named file (probably README, but I am too
lazy to confirm it with the repo).

Finally, if you really want to watch the world burn, try running
NWChem with ARMCI-MPI using your MPI-3 RMA implementation.
http://wiki.mpich.org/armci-mpi/index.php/NWChem has build
instructions for Intel SDK (contact me for other cases if necessary).

Feel free to harass me at the Forum if necessary.

Best,

Jeff

On Tue, Feb 25, 2014 at 5:03 PM, Nathan Hjelm <hjelmn at lanl.gov> wrote:
> Hello, I am finishing MPI-3 RMA support in Open MPI and I am using the
> MPICH test suite to test the new functionality before release. I think
> I found a bug in one of your tests. The reqops test does the following
> in one of its loops:
>
>     for (i = 0; i < ITER; i++) {
>         MPI_Request req;
>         int val = -1, exp = -1;
>
>         /* Processes form a ring.  Process 0 starts first, then passes a token
>          * to the right.  Each process, in turn, performs third-party
>          * communication via process 0's window. */
>         if (rank > 0) {
>             MPI_Recv(NULL, 0, MPI_BYTE, rank-1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>         }
>
>         MPI_Rget(&val, 1, MPI_INT, 0, 0, 1, MPI_INT, window, &req);
>         assert(req != MPI_REQUEST_NULL);
>         MPI_Wait(&req, MPI_STATUS_IGNORE);
>
>         MPI_Rput(&rank, 1, MPI_INT, 0, 0, 1, MPI_INT, window, &req);
>         assert(req != MPI_REQUEST_NULL);
>         MPI_Wait(&req, MPI_STATUS_IGNORE);
>
>         exp = (rank + nproc-1) % nproc;
>
>         if (val != exp) {
>             printf("%d - Got %d, expected %d\n", rank, val, exp);
>             errors++;
>         }
>
>         if (rank < nproc-1) {
>             MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
>         }
>
>         MPI_Barrier(MPI_COMM_WORLD);
>     }
>
>
> The problem is that no call is made to ensure that the RMA operation is
> complete at the target, as required by MPI-3 (p. 432, lines 13-17):
>
> "The completion of an MPI_RPUT operation (i.e., after the corresponding
> test or wait) indicates that the sender is now free to update the
> locations in the origin buffer. It does not indicate that the data is
> available at the target window. If remote completion is required,
> MPI_WIN_FLUSH, MPI_WIN_FLUSH_ALL, MPI_WIN_UNLOCK, or MPI_WIN_UNLOCK_ALL
> can be used."
>
>
> Not ensuring remote completion may cause processes further down the ring
> to read a value written by a process other than the previous process in
> the ring.
>
> The fix is to add MPI_Win_flush(0, window) before the MPI_Send. I can
> confirm that this fixes the issue with Open MPI. Please see the attached
> patch.
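>
> For illustration only, a sketch of the end of the loop body with that
> change (the attached patch is the actual fix; this assumes the
> passive-target epoch on the window is opened earlier in the test):
>
>         /* MPI_Wait only guarantees local completion of the Rput;
>          * flush the target (rank 0) so the new value is visible there
>          * before passing the token to the next process in the ring. */
>         MPI_Win_flush(0, window);
>
>         if (rank < nproc-1) {
>             MPI_Send(NULL, 0, MPI_BYTE, rank+1, 0, MPI_COMM_WORLD);
>         }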
>
>
> -Nathan Hjelm
> HPC-5, LANL
>



-- 
Jeff Hammond
jeff.science at gmail.com

