[mpich-commits] [mpich] MPICH primary repository branch, master, updated. v3.2b3-277-g9acdb05
Service Account
noreply at mpich.org
Wed Jul 22 12:46:31 CDT 2015
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "MPICH primary repository".
The branch, master has been updated
via 9acdb05448288f3bc26559edec91b5904746b8b1 (commit)
via c267eb4cb975a9c6d2ad4ecd4dfe002870cff394 (commit)
via 03e2d6431764a7b25e4971b4040a3a15d424889b (commit)
from 970a6f4d051372b5af02cff890f41c9539c627e3 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.mpich.org/mpich.git/commitdiff/9acdb05448288f3bc26559edec91b5904746b8b1
commit 9acdb05448288f3bc26559edec91b5904746b8b1
Author: Lena Oden <loden at anl.gov>
Date: Wed Jul 22 12:31:23 2015 -0500
Remove xfail for test/mpi/comm/comm_idup_comm
Remove the xfail because the test was fixed by [03e2d643].
No reviewer
diff --git a/test/mpi/comm/testlist b/test/mpi/comm/testlist
index 562a193..31140cc 100644
--- a/test/mpi/comm/testlist
+++ b/test/mpi/comm/testlist
@@ -32,7 +32,7 @@ comm_idup_overlap 2 mpiversion=3.0
comm_idup_iallreduce 6 mpiversion=3.0
comm_idup_nb 6 mpiversion=3.0
comm_idup_isend 6 mpiversion=3.0
-comm_idup_comm 6 mpiversion=3.0 xfail=ticket2286
+comm_idup_comm 6 mpiversion=3.0
dup_with_info 2 mpiversion=3.0
dup_with_info 4 mpiversion=3.0
dup_with_info 9 mpiversion=3.0
http://git.mpich.org/mpich.git/commitdiff/c267eb4cb975a9c6d2ad4ecd4dfe002870cff394
commit c267eb4cb975a9c6d2ad4ecd4dfe002870cff394
Author: Lena Oden <loden at anl.gov>
Date: Tue Jul 21 17:31:36 2015 -0500
Add initialization for own_mask.
The variable own_mask was not initialized and could hold a wrong value
in some cases. As a result, two threads could each believe that they
own the same mask at the same time.
Refs #2283
Signed-off-by: Huiwei Lu <huiweilu at mcs.anl.gov>
diff --git a/src/mpi/comm/commutil.c b/src/mpi/comm/commutil.c
index 242bcca..978e936 100644
--- a/src/mpi/comm/commutil.c
+++ b/src/mpi/comm/commutil.c
@@ -1612,7 +1612,7 @@ static int sched_get_cid_nonblock(MPID_Comm *comm_ptr, MPIR_Context_id_t *ctx0,
* idup_curr_seqnum gives each duplication operation a priority */
st->comm_ptr->idup_count++;
st->seqnum = st->comm_ptr->idup_curr_seqnum++;
-
+ st->own_mask = 0;
if (eager_nelem < 0) {
/* Ensure that at least one word of deadlock-free context IDs is
always set aside for the base protocol */
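The hazard described in the commit message above can be illustrated with a small sketch. It is not the MPICH code: the struct `gcn_state_sketch`, the function `gcn_state_new`, and the use of `malloc` for the state object are assumptions made only for illustration. The sketch shows why an ownership flag in freshly allocated state needs an explicit assignment before anything reads it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the per-operation state.
 * This is NOT the MPICH structure; only the field names echo the diff. */
struct gcn_state_sketch {
    int first_iter;
    int own_eager_mask;
    int own_mask;
};

/* Allocate a state object and assign every ownership flag explicitly.
 * malloc() returns memory with indeterminate contents, so a flag that is
 * never assigned could read as nonzero ("I own the mask"). */
static struct gcn_state_sketch *gcn_state_new(void)
{
    struct gcn_state_sketch *st = malloc(sizeof *st);
    if (st == NULL)
        return NULL;

    /* Simulate stale heap contents so the hazard is visible in a test. */
    memset(st, 0xff, sizeof *st);

    st->first_iter = 1;
    st->own_eager_mask = 0;
    st->own_mask = 0;   /* the kind of assignment this commit adds */
    return st;
}
```

If the `own_mask` assignment were dropped from this sketch, the field would keep the stale nonzero pattern, and two independent state objects could both claim ownership.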
http://git.mpich.org/mpich.git/commitdiff/03e2d6431764a7b25e4971b4040a3a15d424889b
commit 03e2d6431764a7b25e4971b4040a3a15d424889b
Author: Lena Oden <loden at anl.gov>
Date: Mon Jul 20 14:37:52 2015 -0500
Fixes test/mpi/comm/comm_idup_comm
This patch fixes the mixed use of MPI_Comm_idup and blocking
communicator-creation functions when they all use the same parent
communicator. These duplication operations need to be ordered
correctly. In [2b219dfe4ca8], a counter was added to the parent
communicator to order multiple MPI_Comm_idup calls. The same ordering
is needed between MPI_Comm_idup and MPI_Comm_dup. Otherwise,
test/mpi/comm/comm_idup_comm can deadlock because MPI_Comm_dup
may break the order of those MPI_Comm_idup calls.
Refs #2286
Signed-off-by: Huiwei Lu <huiweilu at mcs.anl.gov>
diff --git a/src/mpi/comm/commutil.c b/src/mpi/comm/commutil.c
index 0cf32e5..242bcca 100644
--- a/src/mpi/comm/commutil.c
+++ b/src/mpi/comm/commutil.c
@@ -1104,6 +1104,7 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
int own_eager_mask = 0;
mpir_errflag_t errflag = MPIR_ERR_NONE;
int first_iter = 1;
+ int seqnum;
MPID_MPI_STATE_DECL(MPID_STATE_MPIR_GET_CONTEXTID);
MPID_MPI_FUNC_ENTER(MPID_STATE_MPIR_GET_CONTEXTID);
@@ -1149,6 +1150,9 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
else if (first_iter) {
memset(local_mask, 0, MPIR_MAX_CONTEXT_MASK * sizeof(int));
own_eager_mask = 0;
+ if(comm_ptr->idup_count)
+ seqnum = comm_ptr->idup_curr_seqnum++;
+
/* Attempt to reserve the eager mask segment */
if (!eager_in_use && eager_nelem > 0) {
@@ -1170,7 +1174,8 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
lowestTag = tag;
}
- if (mask_in_use || ! (comm_ptr->context_id == lowestContextId && tag == lowestTag)) {
+ if (mask_in_use || ! (comm_ptr->context_id == lowestContextId && tag == lowestTag) ||
+ (comm_ptr->idup_count && seqnum != comm_ptr->idup_next_seqnum)) {
memset(local_mask, 0, MPIR_MAX_CONTEXT_MASK * sizeof(int));
own_mask = 0;
MPIU_DBG_MSG_D(COMM, VERBOSE, "In in-use, set lowestContextId to %d", lowestContextId);
@@ -1259,6 +1264,7 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
lowestTag = -1;
/* Else leave it alone; there is another thread waiting */
}
+ comm_ptr->idup_curr_seqnum++;
}
else {
/* else we did not find a context id. Give up the mask in case
@@ -1490,11 +1496,6 @@ static int sched_cb_gcn_copy_mask(MPID_Comm *comm, int tag, void *state)
}
st->first_iter = 0;
- /* idup_count > 1 means there are multiple communicators duplicating
- * from the current communicator at the same time. And
- * idup_curr_seqnum gives each duplication operation a priority */
- st->comm_ptr->idup_count++;
- st->seqnum = st->comm_ptr->idup_curr_seqnum++;
} else {
if (st->comm_ptr->context_id < lowestContextId) {
lowestContextId = st->comm_ptr->context_id;
@@ -1606,6 +1607,12 @@ static int sched_get_cid_nonblock(MPID_Comm *comm_ptr, MPIR_Context_id_t *ctx0,
*(st->ctx0) = 0;
st->own_eager_mask = 0;
st->first_iter = 1;
+ /* idup_count > 1 means there are multiple communicators duplicating
+ * from the current communicator at the same time. And
+ * idup_curr_seqnum gives each duplication operation a priority */
+ st->comm_ptr->idup_count++;
+ st->seqnum = st->comm_ptr->idup_curr_seqnum++;
+
if (eager_nelem < 0) {
/* Ensure that at least one word of deadlock-free context IDs is
always set aside for the base protocol */
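The ordering idea in the commit message for 03e2d643 can be read as a ticket scheme, sketched below under stated assumptions. The field names `idup_count`, `idup_curr_seqnum`, and `idup_next_seqnum` appear in the diff; the struct `parent_sketch` and the helpers `idup_begin`, `dup_begin`, `must_wait`, and `op_done` are hypothetical. In particular, how MPICH advances `idup_next_seqnum` is not shown in this diff, so `op_done` is an assumption and not a description of the real code.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for the counters on the parent communicator. */
struct parent_sketch {
    int idup_count;       /* nonblocking duplications in flight */
    int idup_curr_seqnum; /* next ticket to hand out */
    int idup_next_seqnum; /* ticket that may allocate a context id next */
};

/* A nonblocking duplication registers itself and always takes a ticket. */
static int idup_begin(struct parent_sketch *p)
{
    p->idup_count++;
    return p->idup_curr_seqnum++;
}

/* A blocking duplication takes a ticket only while idups are in flight.
 * Returns -1 when no ordering is needed. */
static int dup_begin(struct parent_sketch *p)
{
    return p->idup_count ? p->idup_curr_seqnum++ : -1;
}

/* Mirrors the condition added to the blocking path in the diff:
 * wait while idups are in flight and it is not this ticket's turn. */
static int must_wait(const struct parent_sketch *p, int seqnum)
{
    return p->idup_count && seqnum != p->idup_next_seqnum;
}

/* Assumption: finishing a ticketed operation lets the next ticket proceed. */
static void op_done(struct parent_sketch *p, int seqnum, int is_idup)
{
    if (is_idup)
        p->idup_count--;
    if (seqnum >= 0)
        p->idup_next_seqnum++;
}
```

With an idup, a dup, and another idup started in that order on the same parent, the sketch lets them allocate in ticket order, so the blocking dup cannot overtake the first idup.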
-----------------------------------------------------------------------
Summary of changes:
src/mpi/comm/commutil.c | 19 +++++++++++++------
test/mpi/comm/testlist | 2 +-
2 files changed, 14 insertions(+), 7 deletions(-)
hooks/post-receive
--
MPICH primary repository