[mpich-discuss] turning off MPI abort messages

Jim Dinan james.dinan at gmail.com
Fri Feb 21 13:28:33 CST 2014


Barry,

Do you need a portable solution that works across different MPI
implementations, or does a solution for making just MPICH silent address
your need?  If the latter, you could probably convince someone on the MPICH
team to add an environment variable for MPICH and a command-line "quiet"
flag for hydra/mpiexec.

 ~Jim.


On Fri, Feb 21, 2014 at 2:13 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
>   Understood. But I would like to eliminate both sets of error messages
> and still provide a useful "return code". Perhaps compile-time options
> for the library?
>
>    Barry
>
> On Feb 21, 2014, at 12:40 PM, Jim Dinan <james.dinan at gmail.com> wrote:
>
> > A little more detail -- you're actually getting messages from two
> > sources: (1) the MPICH library ("application called MPI_Abort...") and
> > (2) the job launcher ("BAD TERMINATION...").  You can eliminate the
> > messages from the job launcher by providing an error code of 0 in
> > MPI_Abort.
> >
> >  ~Jim.
> >
> >
> >
> >
> > > On Fri, Feb 21, 2014 at 1:19 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
> > > >> Just configure MPICH such that snprintf isn't discovered by configure
> > > >> and you won't see these messages.
> > > >>
> > > >> The other solution is to fix PETSc so that people can't crash it so
> > > >> easily ;-)
> > >
> > > >    Here we go again. It is not CRASHING; it has detected an error
> > > > condition and is trying to terminate appropriately and cleanly. The
> > > > reason it needs to use MPI_Abort() is that detecting error conditions
> > > > is often not a uniformly collective thing.
> > >
> > > >     Printing a suitable error message and ending is not crashing. But
> > > > with all the badly formatted "error messages" that MPICH prints at the
> > > > end, which I cannot control, it looks like it is crashing.
> >
> > You're returning a non-zero exit code, which I consider crashing.  I
> > apologize if this definition disagrees with yours.  If this is just
> > gentle cleanup, why not exit with code=0 as Jim suggested already?
> >
> > Jeff
> >
> > > >> On Thu, Feb 20, 2014 at 3:19 PM, Jim Dinan <james.dinan at gmail.com> wrote:
> > >>> If you can find a way to call MPI_Finalize instead, you will portably
> > >>> eliminate these messages.
> > >>>
> > > >>> A lesser solution would be to provide an error code of 0 (or
> > > >>> MPI_SUCCESS) to MPI_Abort, e.g. MPI_Abort(MPI_COMM_WORLD, MPI_SUCCESS).
> > > >>> This would eliminate the error message that you are getting from the
> > > >>> job launcher. MPICH could be modified to be quiet about the abort when
> > > >>> the application aborts with an error code of MPI_SUCCESS.
> > >>>
> > >>> ~Jim.
> > >>>
> > >>>
> > > >>> On Thu, Feb 20, 2014 at 12:33 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > >>>>
> > >>>>
> > > >>>>   Is there any way to turn off MPICH (and others) printing messages
> > > >>>> about MPI_Abort?  We have already prepared and presented useful error
> > > >>>> messages to the user about the situation and would like to avoid having
> > > >>>> these additional messages printed (that often make the situation look
> > > >>>> worse than it is).
> > >>>>
> > >>>>    Thanks
> > >>>>
> > >>>>   Barry
> > >>>>
> > >>>> application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0
> > >>>> [cli_0]: aborting job:
> > >>>> application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0
> > >>>>
> > >>>>
> > >>>>
> ==================================================================mailto:
> discuss at mpich.org=================
> > >>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > >>>> =   EXIT CODE: 56
> > >>>> =   CLEANING UP REMAINING PROCESSES
> > >>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > >>>>
> > >>>>
> ===================================================================================
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> _______________________________________________
> > >>>> discuss mailing list     discuss at mpich.org
> > >>>> To manage subscription options or unsubscribe:
> > >>>> https://lists.mpich.org/mailman/listinfo/discuss
> > >>>
> > >>>
> > >>>
> > >>> _______________________________________________
> > >>> discuss mailing list     discuss at mpich.org
> > >>> To manage subscription options or unsubscribe:
> > >>> https://lists.mpich.org/mailman/listinfo/discuss
> > >>
> > >>
> > >>
> > >> --
> > >> Jeff Hammond
> > >> jeff.science at gmail.com