[mpich-discuss] turning off MPI abort messages

Jed Brown jed at jedbrown.org
Fri Feb 21 23:35:10 CST 2014


Jeff Hammond <jeff.science at gmail.com> writes:

> Did you look at my patch or the demonstration yet? I posted all the details this afternoon. 

Yeah, I wrote the message on the plane before I could read it.

> I tried very hard to support verbosity suppression in a reasonable way at runtime. 
>
> Do you really want an MPIX call that is equivalent to setenv("<see
> patch>")? Is the extra code worth it? (These are serious questions.)

My understanding of these variables is that they are processed in
MPI_Init.  PETSc may not have access to MPI_Init, so we're too late to
influence the environment variable.  But if the user gets a return code
and calls our error-handler (as we encourage them to do if they have
nothing better to do with it), we'd like to be able to exit cleanly.  A
global setting isn't as good because if the user encounters an error
condition and "crashes", it probably makes sense for MPICH to print the
information.
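
To illustrate the timing problem, here is a minimal sketch in plain C (no MPI), assuming the runtime reads its setting once at initialization and caches it. The names runtime_init, runtime_is_quiet, library_sets_variable_late and DEMO_QUIET_ABORT are hypothetical stand-ins, not MPICH identifiers.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Cached copy of the setting; filled in once by runtime_init(). */
static int quiet_cached = 0;

/* Hypothetical stand-in for the part of MPI_Init that reads the environment. */
void runtime_init(void)
{
    const char *v = getenv("DEMO_QUIET_ABORT");
    quiet_cached = (v != NULL && strcmp(v, "1") == 0);
}

/* What the runtime consults later, when an abort happens. */
int runtime_is_quiet(void)
{
    return quiet_cached;
}

/* All a library can do if it is initialized after the runtime. */
void library_sets_variable_late(void)
{
    int rc = setenv("DEMO_QUIET_ABORT", "1", 1);
    assert(rc == 0);
    (void)rc;  /* avoid an unused-variable warning when NDEBUG is defined */
}
```

If runtime_init() runs first and library_sets_variable_late() runs afterwards, runtime_is_quiet() still reports the value cached at init time, which is the situation a library is in when the user has already called MPI_Init.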


The reason this issue came up is that we have a sizable fraction of
support email in which the exact question is answered in the error
message we print when they call our error handler.  But a lot of users
don't read the error message and worse, they don't copy the whole thing
into the email.  This happens almost every day and requires an extra
round-trip on the list to resolve.  Our thinking was that if we can
make the error message visually cleaner, they may be more likely to read
it and answer their own question or copy the message into an email in
which case we can ask them to read the relevant lines of the error
message.  (A large fraction of these issues are because the user has
configured something incompatible, usually via run-time options.  It is
very analogous to asking for a file that doesn't exist.  A
straightforward change of run-time options will get them running
correctly.)

The MPICH Abort messages add a lot of visual clutter that we hypothesize
makes people less likely to read our messages (telling them how to fix
the problem), or leads them to believe the output has no useful
information.

I think something along the lines of MPIX_Abort_quietly(comm,err) would
be the right granularity since it allows our error handler to suppress
the clutter without influencing the behavior observed in other libraries
or the application.
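
The granularity argument can be sketched without MPI. In the toy model below, the quiet flag is an argument of each abort call rather than process-wide state, so a library's error handler can suppress the banner while an abort from application code still prints it. demo_abort_banner and both callers are hypothetical; MPIX_Abort_quietly is only a proposal in this thread, and this is not its implementation.

```c
#include <stdio.h>

/* Writes the banner a runtime might print on abort into buf.
 * Returns the number of characters stored; 0 when quiet is set. */
int demo_abort_banner(char *buf, size_t n, int errcode, int quiet)
{
    if (n == 0)
        return 0;
    buf[0] = '\0';
    if (quiet)
        return 0;
    int len = snprintf(buf, n, "application called abort, errorcode %d\n", errcode);
    if (len < 0)
        return 0;
    /* snprintf reports the untruncated length; clamp to what fits in buf. */
    return (size_t)len < n ? len : (int)(n - 1);
}

/* A library handler that has already printed its own message asks for quiet. */
int library_error_handler(char *buf, size_t n, int errcode)
{
    return demo_abort_banner(buf, n, errcode, 1);
}

/* An abort from application code keeps the default, verbose behavior. */
int application_crash(char *buf, size_t n, int errcode)
{
    return demo_abort_banner(buf, n, errcode, 0);
}
```

Because the flag travels with the call, the library's choice does not change what other libraries or the application see when they abort.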