[mpich-commits] [mpich] MPICH primary repository branch, master, updated. v3.2b1-77-g73e3211

Service Account noreply at mpich.org
Fri Apr 17 12:28:31 CDT 2015


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "MPICH primary repository".

The branch, master has been updated
       via  73e3211228c525b08221fbfcb161f88878f8cd24 (commit)
      from  4309ba574293e7bdf2ec03224e4afa948a1e6fd9 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://git.mpich.org/mpich.git/commitdiff/73e3211228c525b08221fbfcb161f88878f8cd24

commit 73e3211228c525b08221fbfcb161f88878f8cd24
Author: Ken Raffenetti <raffenet at mcs.anl.gov>
Date:   Thu Apr 16 14:10:40 2015 -0500

    portals4: fix anysource_matched
    
    This fix, along with a pending patch to the Portals4 reference implementation,
    should make anysource_matched a more reliable operation for multithreaded apps. We
    were seeing a race condition where an ME would unlink successfully, but an
    event matching it would still arrive in the queue. CH3 can now reliably search
    the netmod queue for matched MPI_ANY_SOURCE requests.
    
    The reason that we no longer assert that an MPI_ANY_SOURCE request was removed
    from the CH3 queue is that FDP (find and dequeue posted) operations will remove
    the request from the queue, if it is known to be already matched by the netmod,
    even if it has not yet completed.
    
    Fixes #2199
    
    Signed-off-by: Antonio J. Pena <apenya at mcs.anl.gov>

diff --git a/src/mpid/ch3/channels/nemesis/netmod/portals4/ptl_recv.c b/src/mpid/ch3/channels/nemesis/netmod/portals4/ptl_recv.c
index ec6d90a..f732751 100644
--- a/src/mpid/ch3/channels/nemesis/netmod/portals4/ptl_recv.c
+++ b/src/mpid/ch3/channels/nemesis/netmod/portals4/ptl_recv.c
@@ -22,7 +22,10 @@ static void dequeue_req(const ptl_event_t *e)
     REQ_PTL(rreq)->put_me = PTL_INVALID_HANDLE;
 
     found = MPIDI_CH3U_Recvq_DP(rreq);
-    MPIU_Assert(found);
+    /* an MPI_ANY_SOURCE request may have been previously removed from the
+       CH3 queue by an FDP (find and dequeue posted) operation */
+    if (rreq->dev.match.parts.rank != MPI_ANY_SOURCE)
+        MPIU_Assert(found);
 
     rreq->status.MPI_ERROR = MPI_SUCCESS;
     rreq->status.MPI_SOURCE = NPTL_MATCH_GET_RANK(e->match_bits);
@@ -597,15 +600,12 @@ static int cancel_recv(MPID_Request *rreq, int *cancelled)
     /* An invalid handle indicates the operation has been completed
        and the matching list entry unlinked. At that point, the operation
        cannot be cancelled. */
-    if (REQ_PTL(rreq)->put_me != PTL_INVALID_HANDLE) {
-        ptl_err = PtlMEUnlink(REQ_PTL(rreq)->put_me);
-        if (ptl_err == PTL_OK)
-            *cancelled = TRUE;
-        /* FIXME: if we properly invalidate matching list entry handles, we should be
-           able to ensure an unlink operation results in either PTL_OK or PTL_IN_USE.
-           Anything else would be an error. For now, though, we assume anything but PTL_OK
-           is uncancelable and return. */
-    }
+    if (REQ_PTL(rreq)->put_me == PTL_INVALID_HANDLE)
+        goto fn_exit;
+
+    ptl_err = PtlMEUnlink(REQ_PTL(rreq)->put_me);
+    if (ptl_err == PTL_OK)
+        *cancelled = TRUE;
 
  fn_exit:
     MPIDI_FUNC_EXIT(MPID_STATE_CANCEL_RECV);
@@ -635,7 +635,7 @@ int MPID_nem_ptl_anysource_matched(MPID_Request *rreq)
 
  fn_exit:
     MPIDI_FUNC_EXIT(MPID_STATE_MPID_NEM_PTL_ANYSOURCE_MATCHED);
-    return MPI_SUCCESS;
+    return !cancelled;
  fn_fail:
     goto fn_exit;
 }
diff --git a/test/mpi/threads/pt2pt/testlist b/test/mpi/threads/pt2pt/testlist
index 08e25a4..b6ab43e 100644
--- a/test/mpi/threads/pt2pt/testlist
+++ b/test/mpi/threads/pt2pt/testlist
@@ -1,6 +1,6 @@
 threads 2 timeLimit=600
 threaded_sr 2
-alltoall 4 xfail=ticket2199
+alltoall 4
 sendselfth 1
 multisend 2
 multisend2 5

-----------------------------------------------------------------------

Summary of changes:
 .../channels/nemesis/netmod/portals4/ptl_recv.c    |   22 ++++++++++----------
 test/mpi/threads/pt2pt/testlist                    |    2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)
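
The control flow this commit introduces can be sketched in isolation as follows.
This is only an illustrative model, not MPICH or Portals4 code: every name
prefixed with sketch_ or SKETCH_ is hypothetical, the unlink stand-in simply
reports whether an incoming message already claimed the matching entry, and the
reading of the return value (nonzero when the netmod-side receive could not be
cancelled) is an assumption drawn from the "return !cancelled" line in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PTL_OK / PTL_IN_USE / PTL_INVALID_HANDLE /
 * MPI_ANY_SOURCE. These are NOT the real Portals4 or MPI values. */
enum { SKETCH_OK = 0, SKETCH_IN_USE = 1 };
#define SKETCH_INVALID_HANDLE (-1)
#define SKETCH_ANY_SOURCE     (-2)

typedef struct {
    int  me_handle;     /* SKETCH_INVALID_HANDLE once the entry is gone */
    int  source_rank;   /* SKETCH_ANY_SOURCE for a wildcard receive */
    bool in_ch3_queue;  /* still on the upper-layer posted queue? */
} sketch_req;

/* Stand-in for PtlMEUnlink: fails when an event already claimed the entry. */
static int sketch_unlink(const sketch_req *req, bool me_in_use)
{
    (void)req;
    return me_in_use ? SKETCH_IN_USE : SKETCH_OK;
}

/* Mirrors the reworked cancel_recv: exit early on an invalid handle,
 * and report "cancelled" only when the unlink succeeds. */
static bool sketch_cancel_recv(const sketch_req *req, bool me_in_use)
{
    if (req->me_handle == SKETCH_INVALID_HANDLE)
        return false;
    return sketch_unlink(req, me_in_use) == SKETCH_OK;
}

/* Mirrors the new return statement: nonzero when the cancel did not succeed. */
static int sketch_anysource_matched(const sketch_req *req, bool me_in_use)
{
    bool cancelled = sketch_cancel_recv(req, me_in_use);
    return !cancelled;
}

/* Mirrors dequeue_req: a wildcard request may already have been removed
 * from the upper-layer queue, so only non-wildcard requests are asserted. */
static bool sketch_dequeue(sketch_req *req)
{
    bool found = req->in_ch3_queue;
    req->in_ch3_queue = false;
    if (req->source_rank != SKETCH_ANY_SOURCE)
        assert(found);
    return found;
}
```

In this model, a wildcard request whose entry is still free yields 0 from
sketch_anysource_matched, while one whose entry is in use or already
invalidated yields 1; sketch_dequeue tolerates a missing wildcard request but
still asserts for a request with a specific source rank.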


hooks/post-receive
-- 
MPICH primary repository

