[mpich-commits] [mpich] MPICH primary repository branch, master, updated. v3.2a2-115-g93e816c
Service Account
noreply at mpich.org
Thu Jan 22 11:31:20 CST 2015
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "MPICH primary repository".
The branch, master has been updated
via 93e816cc1fbd88109c55f9acf9b5ed6efc2a12d2 (commit)
from a3dd5f401f3d038fd26eb6c93a7497566228213f (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.mpich.org/mpich.git/commitdiff/93e816cc1fbd88109c55f9acf9b5ed6efc2a12d2
commit 93e816cc1fbd88109c55f9acf9b5ed6efc2a12d2
Author: Huiwei Lu <huiweilu at mcs.anl.gov>
Date: Thu Jan 22 10:02:15 2015 -0600
FT: Fixes ref counts in shrink and agree
When a process fails, the fault tolerance (FT) code takes a different
path from the normal one when handling MPI object reference counts. Some
reference counts were not set properly on the FT path, so when MPICH is
configured with --enable-g=all, some ft tests report leaked context IDs
and dirty COMM, GROUP, and REQUEST objects on exit.
This patch fixes ft/shrink and ft/agree under "--enable-g=all".
Stack-allocated request, communicator, and group objects are now freed
on the FT path.
Signed-off-by: Wesley Bland <wbland at anl.gov>
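As an illustration of the kind of leak described in the log message, the following is a minimal sketch, not MPICH code. It assumes a hypothetical reference-counted type toy_request and a counter toy_live_requests standing in for the leak report that --enable-g=all produces. The error path releases whichever requests were already created, in the same spirit as the fn_fail change to MPIC_Sendrecv in this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted request object.
 * This is NOT MPICH's MPID_Request; all names here are illustrative. */
typedef struct toy_request {
    int ref_count;
} toy_request;

/* Number of requests created but not yet freed (a toy leak checker). */
int toy_live_requests = 0;

toy_request *toy_request_create(void)
{
    toy_request *req = malloc(sizeof(*req));
    if (req == NULL)
        return NULL;
    req->ref_count = 1;          /* the creator holds the first reference */
    toy_live_requests++;
    return req;
}

void toy_request_release(toy_request *req)
{
    assert(req->ref_count > 0);
    if (--req->ref_count == 0) { /* last reference dropped: free the object */
        free(req);
        toy_live_requests--;
    }
}

/* Returns 0 on success and nonzero on failure. The simulate_failure flag
 * is an assumption of this sketch; it forces the error path that a
 * process failure would trigger. */
int toy_sendrecv(int simulate_failure)
{
    int err = 0;
    toy_request *send_req = NULL;
    toy_request *recv_req = NULL;

    recv_req = toy_request_create();
    if (recv_req == NULL) { err = 1; goto fn_fail; }
    send_req = toy_request_create();
    if (send_req == NULL) { err = 1; goto fn_fail; }

    if (simulate_failure) { err = 2; goto fn_fail; }

    /* Normal path: both requests completed, drop our references. */
    toy_request_release(send_req);
    toy_request_release(recv_req);

fn_exit:
    return err;
fn_fail:
    /* Error path: release whatever was created before the failure. */
    if (send_req)
        toy_request_release(send_req);
    if (recv_req)
        toy_request_release(recv_req);
    goto fn_exit;
}
```

Calling toy_sendrecv(1) and then checking that toy_live_requests is 0 confirms that the error path leaves nothing behind. Without the two releases under fn_fail, the counter would stay at 2.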
diff --git a/src/mpi/coll/helper_fns.c b/src/mpi/coll/helper_fns.c
index 588092f..d0c2e95 100644
--- a/src/mpi/coll/helper_fns.c
+++ b/src/mpi/coll/helper_fns.c
@@ -499,6 +499,10 @@ int MPIC_Sendrecv(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
MPIDI_FUNC_EXIT(MPID_STATE_MPIC_SENDRECV);
return mpi_errno;
fn_fail:
+ if (send_req_ptr)
+ MPID_Request_release(send_req_ptr);
+ if (recv_req_ptr)
+ MPID_Request_release(recv_req_ptr);
goto fn_exit;
}
diff --git a/src/mpi/comm/comm_shrink.c b/src/mpi/comm/comm_shrink.c
index 631edb5..0466db5 100644
--- a/src/mpi/comm/comm_shrink.c
+++ b/src/mpi/comm/comm_shrink.c
@@ -81,7 +81,16 @@ int MPIR_Comm_shrink(MPID_Comm *comm_ptr, MPID_Comm **newcomm_ptr)
new_group_ptr, MPIR_SHRINK_TAG, &errflag);
MPIR_Group_release(new_group_ptr);
- if (errflag) MPIU_Object_set_ref(new_group_ptr, 0);
+ if (errflag) {
+ if (*newcomm_ptr != NULL && MPIU_Object_get_ref(*newcomm_ptr) > 0) {
+ MPIU_Object_set_ref(*newcomm_ptr, 1);
+ MPIR_Comm_release(*newcomm_ptr, 0);
+ }
+ if (MPIU_Object_get_ref(new_group_ptr) > 0) {
+ MPIU_Object_set_ref(new_group_ptr, 1);
+ MPIR_Group_release(new_group_ptr);
+ }
+ }
} while (errflag && ++attempts < 5);
if (errflag && attempts >= 5) goto fn_fail;
diff --git a/src/mpid/ch3/src/mpid_comm_get_all_failed_procs.c b/src/mpid/ch3/src/mpid_comm_get_all_failed_procs.c
index a991435..465d24f 100644
--- a/src/mpid/ch3/src/mpid_comm_get_all_failed_procs.c
+++ b/src/mpid/ch3/src/mpid_comm_get_all_failed_procs.c
@@ -107,6 +107,8 @@ int MPID_Comm_get_all_failed_procs(MPID_Comm *comm_ptr, MPID_Group **failed_grou
bitarray = group_to_bitarray(local_fail, comm_ptr);
bitarray_size = (comm_ptr->local_size / 8) + (comm_ptr->local_size % 8 ? 1 : 0);
remote_bitarray = MPIU_Malloc(sizeof(uint32_t) * bitarray_size);
+ if (local_fail != MPID_Group_empty)
+ MPIR_Group_release(local_fail);
/* For now, this will be implemented as a star with rank 0 serving as
* the source */
-----------------------------------------------------------------------
Summary of changes:
src/mpi/coll/helper_fns.c | 4 ++++
src/mpi/comm/comm_shrink.c | 11 ++++++++++-
src/mpid/ch3/src/mpid_comm_get_all_failed_procs.c | 2 ++
3 files changed, 16 insertions(+), 1 deletions(-)
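The comm_shrink.c hunk above resets a live object's reference count to 1 and then releases it, so the object is freed on the retry path however many references were outstanding. A minimal sketch of that idea follows. It uses a hypothetical toy_group type and toy_groups_alive counter rather than MPICH's MPID_Group, and it assumes that no other holder will touch the object afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted group; NOT MPICH's MPID_Group. */
typedef struct toy_group {
    int ref_count;
} toy_group;

/* Number of groups created but not yet freed. */
int toy_groups_alive = 0;

toy_group *toy_group_create(void)
{
    toy_group *g = malloc(sizeof(*g));
    if (g == NULL)
        return NULL;
    g->ref_count = 1;
    toy_groups_alive++;
    return g;
}

void toy_group_add_ref(toy_group *g)
{
    g->ref_count++;
}

void toy_group_release(toy_group *g)
{
    assert(g->ref_count > 0);
    if (--g->ref_count == 0) {
        free(g);
        toy_groups_alive--;
    }
}

/* Error-path cleanup: collapse all remaining references into one and
 * drop it, so the object is destroyed immediately. The caller must be
 * sure that nothing else still uses the object. */
void toy_group_force_release(toy_group *g)
{
    if (g != NULL && g->ref_count > 0) {
        g->ref_count = 1;
        toy_group_release(g);
    }
}
```

A group created once and given two extra references with toy_group_add_ref is destroyed by a single toy_group_force_release call, which brings toy_groups_alive back to 0. Forcing the count this way is only appropriate on an abandon-and-retry path like the one in the diff. In ordinary code each holder should release its own reference.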
hooks/post-receive
--
MPICH primary repository