[mpich-commits] [mpich] MPICH primary repository branch, master, updated. v3.2b3-277-g9acdb05

Service Account noreply at mpich.org
Wed Jul 22 12:46:31 CDT 2015


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "MPICH primary repository".

The branch, master has been updated
       via  9acdb05448288f3bc26559edec91b5904746b8b1 (commit)
       via  c267eb4cb975a9c6d2ad4ecd4dfe002870cff394 (commit)
       via  03e2d6431764a7b25e4971b4040a3a15d424889b (commit)
      from  970a6f4d051372b5af02cff890f41c9539c627e3 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://git.mpich.org/mpich.git/commitdiff/9acdb05448288f3bc26559edec91b5904746b8b1

commit 9acdb05448288f3bc26559edec91b5904746b8b1
Author: Lena Oden <loden at anl.gov>
Date:   Wed Jul 22 12:31:23 2015 -0500

    Remove xfail for test/mpi/comm/comm_idup_comm
    
    Remove xfail because it was fixed with [03e2d643].
    
    No reviewer

diff --git a/test/mpi/comm/testlist b/test/mpi/comm/testlist
index 562a193..31140cc 100644
--- a/test/mpi/comm/testlist
+++ b/test/mpi/comm/testlist
@@ -32,7 +32,7 @@ comm_idup_overlap 2 mpiversion=3.0
 comm_idup_iallreduce 6 mpiversion=3.0
 comm_idup_nb 6 mpiversion=3.0
 comm_idup_isend 6 mpiversion=3.0
-comm_idup_comm 6 mpiversion=3.0 xfail=ticket2286
+comm_idup_comm 6 mpiversion=3.0
 dup_with_info 2 mpiversion=3.0
 dup_with_info 4 mpiversion=3.0
 dup_with_info 9 mpiversion=3.0

http://git.mpich.org/mpich.git/commitdiff/c267eb4cb975a9c6d2ad4ecd4dfe002870cff394

commit c267eb4cb975a9c6d2ad4ecd4dfe002870cff394
Author: Lena Oden <loden at anl.gov>
Date:   Tue Jul 21 17:31:36 2015 -0500

    Add initialization for own_mask.
    
    The variable own_mask was not initialized and could hold an arbitrary
    value in some cases. As a result, two threads could each believe they
    owned the mask at the same time.
    
    Refs #2283
    
    Signed-off-by: Huiwei Lu <huiweilu at mcs.anl.gov>
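
To illustrate the hazard described in the message above, here is a minimal sketch, not the MPICH code. The struct `gcn_state_sketch` and the helper `gcn_state_new` are hypothetical names; only the field names `own_mask`, `own_eager_mask`, `first_iter` and `seqnum` are taken from the diffs in this email. It assumes the state object comes from `malloc`, which returns uninitialized memory, so every field that is later read has to be assigned explicitly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-operation state; field names mirror
 * the ones visible in the diff, the struct itself is an assumption. */
struct gcn_state_sketch {
    int own_mask;        /* nonzero while this operation holds the mask */
    int own_eager_mask;  /* nonzero while it holds the eager segment */
    int first_iter;      /* 1 until the first iteration has run */
    int seqnum;          /* ordering ticket for this operation */
};

/* Allocate a state object and set every field before anyone reads it.
 * malloc() does not zero memory, so skipping own_mask here would leave
 * it with whatever value was previously stored at that address. */
struct gcn_state_sketch *gcn_state_new(int seqnum)
{
    assert(seqnum >= 0);

    struct gcn_state_sketch *st = malloc(sizeof *st);
    if (st == NULL)
        return NULL;

    st->own_mask = 0;
    st->own_eager_mask = 0;
    st->first_iter = 1;
    st->seqnum = seqnum;
    return st;
}
```

Using `calloc` instead would zero the whole object, but explicit assignments keep the intended starting value of each field visible at the point of creation.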

diff --git a/src/mpi/comm/commutil.c b/src/mpi/comm/commutil.c
index 242bcca..978e936 100644
--- a/src/mpi/comm/commutil.c
+++ b/src/mpi/comm/commutil.c
@@ -1612,7 +1612,7 @@ static int sched_get_cid_nonblock(MPID_Comm *comm_ptr, MPIR_Context_id_t *ctx0,
      * idup_curr_seqnum gives each duplication operation a priority */
      st->comm_ptr->idup_count++;
      st->seqnum = st->comm_ptr->idup_curr_seqnum++;
-
+     st->own_mask = 0;
     if (eager_nelem < 0) {
         /* Ensure that at least one word of deadlock-free context IDs is
            always set aside for the base protocol */

http://git.mpich.org/mpich.git/commitdiff/03e2d6431764a7b25e4971b4040a3a15d424889b

commit 03e2d6431764a7b25e4971b4040a3a15d424889b
Author: Lena Oden <loden at anl.gov>
Date:   Mon Jul 20 14:37:52 2015 -0500

    Fixes test/mpi/comm/comm_idup_comm
    
    This patch fixes mixing MPI_Comm_idup with blocking
    communicator-creation functions when they all use the same parent
    communicator. These duplication operations must be ordered
    consistently. In [2b219dfe4ca8], a counter was added to the parent
    communicator to correctly order multiple MPI_Comm_idup calls. We need
    to do the same between MPI_Comm_idup and MPI_Comm_dup. Otherwise,
    test/mpi/comm/comm_idup_comm can deadlock because MPI_Comm_dup
    may break the order of those MPI_Comm_idup calls.
    
    Refs #2286
    
    Signed-off-by: Huiwei Lu <huiweilu at mcs.anl.gov>
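
The ordering idea in the message above can be sketched as a simple ticket scheme. This is an illustration under stated assumptions, not the code in commutil.c: the struct `parent_sketch` and the helpers `take_ticket`, `may_proceed` and `finish_turn` are hypothetical, and the two counters only loosely correspond to `idup_curr_seqnum` and `idup_next_seqnum` in the diff. Each duplication, blocking or nonblocking, takes a ticket from the parent when it starts and may claim the mask only when its ticket is the one currently being served.

```c
#include <assert.h>

/* Hypothetical stand-in for the ordering counters kept on the parent
 * communicator. */
struct parent_sketch {
    int curr_seqnum;  /* next ticket to hand out */
    int next_seqnum;  /* ticket that is allowed to proceed now */
};

/* Called once when a duplication operation starts on this parent. */
int take_ticket(struct parent_sketch *p)
{
    return p->curr_seqnum++;
}

/* An operation may try to claim the mask only when it is its turn. */
int may_proceed(const struct parent_sketch *p, int ticket)
{
    return ticket == p->next_seqnum;
}

/* Called when the operation being served has obtained its context ID. */
void finish_turn(struct parent_sketch *p)
{
    /* Only an operation that holds an outstanding ticket can finish. */
    assert(p->next_seqnum < p->curr_seqnum);
    p->next_seqnum++;
}
```

Because every process issues its duplication calls on a given parent in the same order, giving out tickets in call order means all processes serve the same operation at the same time, which is what prevents a blocking duplication from overtaking a pending nonblocking one.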

diff --git a/src/mpi/comm/commutil.c b/src/mpi/comm/commutil.c
index 0cf32e5..242bcca 100644
--- a/src/mpi/comm/commutil.c
+++ b/src/mpi/comm/commutil.c
@@ -1104,6 +1104,7 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
     int own_eager_mask = 0;
     mpir_errflag_t errflag = MPIR_ERR_NONE;
     int first_iter = 1;
+    int seqnum;
     MPID_MPI_STATE_DECL(MPID_STATE_MPIR_GET_CONTEXTID);
 
     MPID_MPI_FUNC_ENTER(MPID_STATE_MPIR_GET_CONTEXTID);
@@ -1149,6 +1150,9 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
         else if (first_iter) {
             memset(local_mask, 0, MPIR_MAX_CONTEXT_MASK * sizeof(int));
             own_eager_mask = 0;
+            if(comm_ptr->idup_count)
+                 seqnum = comm_ptr->idup_curr_seqnum++;
+
 
             /* Attempt to reserve the eager mask segment */
             if (!eager_in_use && eager_nelem > 0) {
@@ -1170,7 +1174,8 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
                 lowestTag       = tag;
             }
 
-            if (mask_in_use || ! (comm_ptr->context_id == lowestContextId && tag == lowestTag)) {
+            if (mask_in_use || ! (comm_ptr->context_id == lowestContextId && tag == lowestTag) ||
+               (comm_ptr->idup_count && seqnum != comm_ptr->idup_next_seqnum))  {
                 memset(local_mask, 0, MPIR_MAX_CONTEXT_MASK * sizeof(int));
                 own_mask = 0;
                 MPIU_DBG_MSG_D(COMM, VERBOSE, "In in-use, set lowestContextId to %d", lowestContextId);
@@ -1259,6 +1264,7 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
                     lowestTag       = -1;
                     /* Else leave it alone; there is another thread waiting */
                 }
+                comm_ptr->idup_curr_seqnum++;
             }
             else {
                 /* else we did not find a context id. Give up the mask in case
@@ -1490,11 +1496,6 @@ static int sched_cb_gcn_copy_mask(MPID_Comm *comm, int tag, void *state)
         }
         st->first_iter = 0;
 
-        /* idup_count > 1 means there are multiple communicators duplicating
-         * from the current communicator at the same time. And
-         * idup_curr_seqnum gives each duplication operation a priority */
-        st->comm_ptr->idup_count++;
-        st->seqnum = st->comm_ptr->idup_curr_seqnum++;
     } else {
         if (st->comm_ptr->context_id < lowestContextId) {
             lowestContextId = st->comm_ptr->context_id;
@@ -1606,6 +1607,12 @@ static int sched_get_cid_nonblock(MPID_Comm *comm_ptr, MPIR_Context_id_t *ctx0,
     *(st->ctx0) = 0;
     st->own_eager_mask = 0;
     st->first_iter = 1;
+    /* idup_count > 1 means there are multiple communicators duplicating
+     * from the current communicator at the same time. And
+     * idup_curr_seqnum gives each duplication operation a priority */
+     st->comm_ptr->idup_count++;
+     st->seqnum = st->comm_ptr->idup_curr_seqnum++;
+
     if (eager_nelem < 0) {
         /* Ensure that at least one word of deadlock-free context IDs is
            always set aside for the base protocol */

-----------------------------------------------------------------------

Summary of changes:
 src/mpi/comm/commutil.c |   19 +++++++++++++------
 test/mpi/comm/testlist  |    2 +-
 2 files changed, 14 insertions(+), 7 deletions(-)


hooks/post-receive
-- 
MPICH primary repository
