[mpich-commits] [mpich] MPICH primary repository branch, master, updated. v3.1rc2-13-ga3e8305
mysql vizuser
noreply at mpich.org
Wed Nov 27 14:26:59 CST 2013
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "MPICH primary repository".
The branch, master has been updated
via a3e830570a6e41dc9d49e2139fa33fef604afc97 (commit)
via 86adc1b1f98641139c7e0e5b2387d27db0cd88a9 (commit)
from 0dcb61440ebcec65bc460c22c986b8d26e52a7ce (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.mpich.org/mpich.git/commitdiff/a3e830570a6e41dc9d49e2139fa33fef604afc97
commit a3e830570a6e41dc9d49e2139fa33fef604afc97
Author: James Dinan <james.dinan at intel.com>
Date: Wed Nov 20 12:53:22 2013 -0700
Update comm_create to use sparse ctx id allocation
Update MPI_Comm_create to use sparse, rather than dense, context_id
allocation. This fixes incorrect context ID exhaustion errors caused by
including processes that are not in the group of the new communicator in
the allocation operation. This bug is exercised by
test/mpi/cerrors/comm/too_many_comms3.c.
Signed-off-by: Ken Raffenetti <raffenet at mcs.anl.gov>
diff --git a/src/mpi/comm/comm_create.c b/src/mpi/comm/comm_create.c
index abf01ea..375f5a3 100644
--- a/src/mpi/comm/comm_create.c
+++ b/src/mpi/comm/comm_create.c
@@ -233,8 +233,8 @@ int MPIR_Comm_create_intra(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
member of the group */
/* In the multi-threaded case, MPIR_Get_contextid assumes that the
calling routine already holds the single criticial section */
- /* TODO should be converted to use MPIR_Get_contextid_sparse instead */
- mpi_errno = MPIR_Get_contextid( comm_ptr, &new_context_id );
+ mpi_errno = MPIR_Get_contextid_sparse( comm_ptr, &new_context_id,
+ group_ptr->rank == MPI_UNDEFINED );
if (mpi_errno) MPIU_ERR_POP(mpi_errno);
MPIU_Assert(new_context_id != 0);
@@ -278,7 +278,6 @@ int MPIR_Comm_create_intra(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
}
else {
/* This process is not in the group */
- MPIR_Free_contextid( new_context_id );
new_context_id = 0;
}
@@ -294,8 +293,9 @@ fn_fail:
MPIR_Comm_release(*newcomm_ptr, 0/*isDisconnect*/);
new_context_id = 0; /* MPIR_Comm_release frees the new ctx id */
}
- if (new_context_id != 0)
+ if (new_context_id != 0 && group_ptr->rank != MPI_UNDEFINED) {
MPIR_Free_contextid(new_context_id);
+ }
/* --END ERROR HANDLING-- */
goto fn_exit;
}
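The effect of the sparse allocation described in the commit message above can be illustrated with a small single-process simulation. This is a minimal sketch and not MPICH code: the function `sim_alloc_contextid`, the 32-bit mask width, and the array-based "processes" are all hypothetical, and the assumption that a non-member contributes an all-free mask is a simplification of what the `ignore_id` argument achieves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical simulation, not the MPICH implementation.
 * masks[i]  : process i's bitmask of free context IDs (bit set = free).
 * ignore[i] : nonzero if process i is NOT in the group of the new communicator.
 * sparse    : nonzero to let non-members opt out of the agreement.
 * Returns the index of the lowest bit that is free on every contributing
 * process, or -1 if no such bit exists.
 */
int sim_alloc_contextid(const uint32_t *masks, const int *ignore,
                        size_t nprocs, int sparse)
{
    assert(masks != NULL && ignore != NULL);

    uint32_t common = UINT32_MAX;
    for (size_t i = 0; i < nprocs; i++) {
        /* Assumption: in sparse mode a non-member contributes an all-free
           mask, so its own (possibly exhausted) ID space cannot veto an ID. */
        uint32_t contrib = (sparse && ignore[i]) ? UINT32_MAX : masks[i];
        common &= contrib;   /* stands in for a bitwise-AND reduction */
    }

    if (common == 0)
        return -1;           /* no ID is free on all contributing processes */

    int id = 0;
    while (((common >> id) & 1u) == 0)
        id++;
    return id;
}
```

With a non-member whose mask is fully used, the dense variant reports failure even though the actual members still share a free ID, which is the spurious exhaustion error the commit message describes.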
http://git.mpich.org/mpich.git/commitdiff/86adc1b1f98641139c7e0e5b2387d27db0cd88a9
commit 86adc1b1f98641139c7e0e5b2387d27db0cd88a9
Author: James Dinan <james.dinan at intel.com>
Date: Wed Nov 20 12:42:50 2013 -0700
Improve context ID exhaustion error reporting
Adds a check to determine if context ID allocation failed because of
exhaustion or fragmentation and improves error reporting.
Signed-off-by: Ken Raffenetti <raffenet at mcs.anl.gov>
diff --git a/src/mpi/comm/commutil.c b/src/mpi/comm/commutil.c
index 26facb7..8ca467e 100644
--- a/src/mpi/comm/commutil.c
+++ b/src/mpi/comm/commutil.c
@@ -1130,6 +1130,8 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
/* --BEGIN ERROR HANDLING-- */
int nfree = 0;
int ntotal = 0;
+ int minfree;
+
if (own_mask) {
MPIU_THREAD_CS_ENTER(CONTEXTID,);
mask_in_use = 0;
@@ -1141,9 +1143,29 @@ int MPIR_Get_contextid_sparse_group(MPID_Comm *comm_ptr, MPID_Group *group_ptr,
}
MPIR_ContextMaskStats(&nfree, &ntotal);
- MPIU_ERR_SETANDJUMP3(mpi_errno, MPI_ERR_OTHER,
- "**toomanycommfrag", "**toomanycommfrag %d %d %d",
- nfree, ntotal, ignore_id);
+ if (ignore_id)
+ minfree = INT_MAX;
+ else
+ minfree = nfree;
+
+ if (group_ptr != NULL) {
+ int coll_tag = tag | MPIR_Process.tagged_coll_mask; /* Shift tag into the tagged coll space */
+ mpi_errno = MPIR_Allreduce_group(MPI_IN_PLACE, &minfree, 1, MPI_INT, MPI_MIN,
+ comm_ptr, group_ptr, coll_tag, &errflag);
+ } else {
+ mpi_errno = MPIR_Allreduce_impl(MPI_IN_PLACE, &minfree, 1, MPI_INT,
+ MPI_MIN, comm_ptr, &errflag);
+ }
+
+ if (minfree > 0) {
+ MPIU_ERR_SETANDJUMP3(mpi_errno, MPI_ERR_OTHER,
+ "**toomanycommfrag", "**toomanycommfrag %d %d %d",
+ nfree, ntotal, ignore_id);
+ } else {
+ MPIU_ERR_SETANDJUMP3(mpi_errno, MPI_ERR_OTHER,
+ "**toomanycomm", "**toomanycomm %d %d %d",
+ nfree, ntotal, ignore_id);
+ }
/* --END ERROR HANDLING-- */
}
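The exhaustion-versus-fragmentation decision in the hunk above can be sketched without MPI as a minimum over per-process free counts. This is only an illustration under stated assumptions: `sim_classify_failure` and the `ctx_failure` enum are hypothetical names, and the loop stands in for the `MPI_MIN` allreduce shown in the diff.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical result type for this sketch only. */
enum ctx_failure { CTX_FRAGMENTED, CTX_EXHAUSTED };

/*
 * Hypothetical simulation, not the MPICH implementation.
 * nfree[i]     : number of free context IDs on process i.
 * ignore_id[i] : nonzero if process i does not take part in the new
 *                communicator; it then contributes INT_MAX so that it
 *                never lowers the minimum.
 */
enum ctx_failure sim_classify_failure(const int *nfree, const int *ignore_id,
                                      size_t nprocs)
{
    assert(nfree != NULL && ignore_id != NULL);

    int minfree = INT_MAX;
    for (size_t i = 0; i < nprocs; i++) {
        int contrib = ignore_id[i] ? INT_MAX : nfree[i];
        if (contrib < minfree)
            minfree = contrib;   /* stands in for the MPI_MIN reduction */
    }

    /* Every participant still has free IDs, but none in common:
       fragmentation. Otherwise at least one participant has none left:
       exhaustion. */
    return (minfree > 0) ? CTX_FRAGMENTED : CTX_EXHAUSTED;
}
```

A process that ignores the result does not influence the classification, so a non-member with zero free IDs yields a fragmentation report, while a member with zero free IDs yields an exhaustion report.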
-----------------------------------------------------------------------
Summary of changes:
src/mpi/comm/comm_create.c | 8 ++++----
src/mpi/comm/commutil.c | 28 +++++++++++++++++++++++++---
2 files changed, 29 insertions(+), 7 deletions(-)
hooks/post-receive
--
MPICH primary repository