[mpich-commits] [mpich] MPICH primary repository branch, master, updated. v3.0.4-440-g254aa2c
mysql vizuser
noreply at mpich.org
Tue Aug 6 17:45:44 CDT 2013
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "MPICH primary repository".
The branch, master has been updated
via 254aa2cdaba145bf9c7f6a42665cae32d1b31685 (commit)
from 4824c7620ed2da59c1bbb9411b414725462972d0 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.mpich.org/mpich.git/commitdiff/254aa2cdaba145bf9c7f6a42665cae32d1b31685
commit 254aa2cdaba145bf9c7f6a42665cae32d1b31685
Author: Michael Blocksome <blocksom at us.ibm.com>
Date: Tue Aug 6 11:32:15 2013 -0500
Clean up pamid MPID_Abort() logic
Removed the processing of the (undocumented) environment variable
'PAMID_CORE_ON_ABORT' which was being checked to determine if the user
does *not* want the process to core dump. On Blue Gene/Q the core dump
was accomplished by calling 'abort()' which sends SIGABRT to all
processes and all processes would then write a core file. This is not
scalable.
Instead, MPID_Abort() will invoke 'exit(1)' which will terminate all
processes in the job. This behavior is identical for both the POE and
the Blue Gene/Q control systems.
On Blue Gene/Q the user may replicate the previous core dump behavior by
using the environment variables 'BG_COREDUMPONERROR=1' or
'BG_COREDUMPONEXIT=1'.
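Finally, the 'DYNAMIC_TASKING' #ifdef is moved up so it is checked first.
'MPIDI_NO_ASSERT' and 'DYNAMIC_TASKING' are typically both defined for PE,
so the earlier 'MPIDI_NO_ASSERT' branch called 'exit(1)' before the dynamic
tasking code could be reached. It appears that the dynamic tasking code was
never being invoked.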
(ibm) CPS 99YURA
Signed-off-by: Bob Cernohous <bobc at us.ibm.com>
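The effect of the '#ifdef' reordering described in the last paragraph of the
log message can be sketched in isolation. This is a simplified illustration,
not the pamid source: the enum and the functions old_order() and new_order()
are hypothetical names, the real calls are replaced by return values, the
removed environment variable check is left out, and both macros are defined
by hand to mimic a typical PE build.

```c
/* Assumption: mimic a PE build, where both macros are typically defined. */
#define MPIDI_NO_ASSERT 1
#define DYNAMIC_TASKING 1

/* Hypothetical labels for the three ways MPID_Abort() could finish. */
enum abort_path { PATH_EXIT, PATH_PMI2_ABORT, PATH_ABORT };

/* Order before the commit: MPIDI_NO_ASSERT is tested first, so the
 * DYNAMIC_TASKING branch is unreachable when both macros are defined. */
static enum abort_path old_order(void)
{
#ifdef MPIDI_NO_ASSERT
    return PATH_EXIT;          /* stands in for exit(1) */
#endif
#ifdef DYNAMIC_TASKING
    return PATH_PMI2_ABORT;    /* stands in for PMI2_Abort(1, error_msg) */
#endif
    return PATH_ABORT;         /* stands in for abort() */
}

/* Order after the commit: DYNAMIC_TASKING is tested first and exit(1)
 * is the fallback in the #else branch. */
static enum abort_path new_order(void)
{
#ifdef DYNAMIC_TASKING
    return PATH_PMI2_ABORT;    /* stands in for PMI2_Abort(1, error_msg) */
#else
    return PATH_EXIT;          /* stands in for exit(1) */
#endif
}
```

With both macros defined, old_order() yields PATH_EXIT and new_order() yields
PATH_PMI2_ABORT, which matches the observation that the dynamic tasking code
was never being invoked.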
diff --git a/src/mpid/pamid/src/misc/mpid_abort.c b/src/mpid/pamid/src/misc/mpid_abort.c
index ef8af06..dc618af 100644
--- a/src/mpid/pamid/src/misc/mpid_abort.c
+++ b/src/mpid/pamid/src/misc/mpid_abort.c
@@ -27,8 +27,8 @@
*
* \param[in] comm The communicator associated with the failure (can be null).
* \param[in] mpi_errno The MPI error associated with the failure (can be zero).
- * \param[in] exit_code The requested exit code, however BG features imply that exit(1) will always be used.
- * \param[in] error_msg The message to display (may be NULL_
+ * \param[in] exit_code The requested exit code.
+ * \param[in] error_msg The message to display (may be NULL)
*
* This is the majority of the call to MPID_Abort(). The only
* difference is that it does not call exit. That allows it to be
@@ -74,26 +74,28 @@ void MPIDI_Abort_core(MPID_Comm * comm, int mpi_errno, int exit_code, const char
* \brief The central parts of the MPID_Abort call
* \param[in] comm The communicator associated with the failure (can be null).
* \param[in] mpi_errno The MPI error associated with the failure (can be zero).
- * \param[in] exit_code The requested exit code, however BG features imply that exit(1) will always be used.
- * \param[in] error_msg The message to display (may be NULL_
- * \returns MPI_ERR_INTERN
+ * \param[in] exit_code The requested exit code.
+ * \param[in] error_msg The message to display (may be NULL)
+ * \return MPI_ERR_INTERN
*
* This function MUST NEVER return.
*/
int MPID_Abort(MPID_Comm * comm, int mpi_errno, int exit_code, const char *error_msg)
{
- char* env = getenv("PAMID_CORE_ON_ABORT");
MPIDI_Abort_core(comm, mpi_errno, exit_code, error_msg);
-#ifdef MPIDI_NO_ASSERT
- exit(1);
-#endif
- if (env != NULL)
- if ( (strncasecmp("no", env, 2)==0) || (strncasecmp("exit", env, 4)==0) || (strncmp("0", env, 1)==0) )
- exit(1);
-
#ifdef DYNAMIC_TASKING
return PMI2_Abort(1,error_msg);
+#else
+ /* The POE and BGQ control systems both catch the exit value for additional
+ * processing. If a process exits with '1' then all processes in the job
+ * are terminated. The requested error code is lost in this process however
+ * this is acceptable, but not desirable, behavior according to the MPI
+ * standard.
+ *
+ * On BGQ, the user may force the process (rank) that exited with '1' to core
+ * dump by setting the environment variable 'BG_COREDUMPONERROR=1'.
+ */
+ exit(1);
#endif
- abort();
}
-----------------------------------------------------------------------
Summary of changes:
src/mpid/pamid/src/misc/mpid_abort.c | 30 ++++++++++++++++--------------
1 files changed, 16 insertions(+), 14 deletions(-)
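For reference, the check on 'PAMID_CORE_ON_ABORT' that this commit removes can
be read as a small predicate. The following is a minimal sketch, assuming a
POSIX system that provides strncasecmp() in <strings.h>; the function name
core_dump_suppressed() is hypothetical and does not exist in MPICH. The three
comparisons are copied from the removed lines, so they match on prefixes only.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp() (POSIX) */

/* Hypothetical helper: returns 1 if the value of PAMID_CORE_ON_ABORT would
 * have made the old MPID_Abort() call exit(1) instead of abort(), else 0.
 * An unset variable (NULL) did not suppress the core dump. */
static int core_dump_suppressed(const char *env)
{
    if (env == NULL)
        return 0;

    return strncasecmp("no", env, 2) == 0      /* "no", "NO", "None", ... */
        || strncasecmp("exit", env, 4) == 0    /* "exit", "EXIT", ...     */
        || strncmp("0", env, 1) == 0;          /* any value starting "0"  */
}
```

Because only a prefix is compared, a value such as 'None' or '0x1' would also
have suppressed the core dump under the old logic.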
hooks/post-receive
--
MPICH primary repository