[mpich-discuss] return() or exit() from int main() (NOT MPICH SPECIFIC)

Michael L. Stokes Michael.Stokes at uah.edu
Fri Apr 10 12:06:17 CDT 2015


Michael,

Good question.  Since this is the MPICH forum, I ran some test cases on
a Cray and a 64-bit PC.

First, the Cray, using the cray-mpich/6.3.1 module.  The code simply
initializes MPI and has each process return its rank as the exit
status.  The code (exit_test.C) follows:

#include <iostream>
#include <cstdlib>
#include <mpi.h>

int main(void) {
     int rank;
     MPI_Init(0, 0);                 // passing null argc/argv is legal since MPI-2
     MPI_Comm_rank( MPI_COMM_WORLD, &rank);
     if( rank == 0 ) {               // only rank 0 prints the job size
         int size;
         MPI_Comm_size( MPI_COMM_WORLD, &size);
         std::cout << "Rank:" << rank << ", size:" << size << std::endl;
     }
     MPI_Finalize();
     return rank;                    // each process exits with its own rank
}

I compiled with

CC exit_test.C -o exit_test

and executed it from an interactive PBS session with

aprun -n 7 ./exit_test

I was surprised by the results.  I ran it probably a dozen times.  Here
are a few runs ...

xxx at batch2:~/src/MPI/tmp> aprun -n 7 ./exit_test
Rank:0, size:7
Application 13573540 exit codes: 6
Application 13573540 resources: utime ~0s, stime ~1s, Rss ~39164, 
inblocks ~10234, outblocks ~23470
xxx at batch2:~/src/MPI/tmp> echo $?
6
xxx at batch2:~/src/MPI/tmp> aprun -n 7 ./exit_test
Rank:0, size:7
Application 13573552 exit codes: 4
Application 13573552 resources: utime ~0s, stime ~1s, Rss ~39164, 
inblocks ~10234, outblocks ~23470
xxx at batch2:~/src/MPI/tmp> echo $?
4
xxx at batch2:~/src/MPI/tmp> aprun -n 7 ./exit_test
Rank:0, size:7
Application 13573564 exit codes: 1
Application 13573564 resources: utime ~0s, stime ~1s, Rss ~39172, 
inblocks ~10234, outblocks ~23470
xxx at batch2:~/src/MPI/tmp> echo $?
1
xxx at batch2:~/src/MPI/tmp> aprun -n 7 ./exit_test
Rank:0, size:7
Application 13573589 exit codes: 5
Application 13573589 resources: utime ~0s, stime ~1s, Rss ~39188, 
inblocks ~10234, outblocks ~23470
xxx at batch2:~/src/MPI/tmp> echo $?
5

I say surprised because the mpiexec man page at
http://www.mpich.org/static/docs/v3.1/www1/mpiexec.html
reports the following ...


    Return Status

mpiexec returns the maximum of the exit status values of all of the 
processes created by mpiexec.
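
Taken literally, that means mpiexec reaps every process and reports the
largest exit status it saw.  As a toy illustration of that policy
(plain POSIX fork()/wait(), no MPI, entirely my own sketch and not
MPICH's actual code):

#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <algorithm>

int main(void) {
     const int nprocs = 4;            // stand-in for the MPI ranks
     for (int i = 0; i < nprocs; ++i) {
         if (fork() == 0)
             _exit(i);                // child i exits with status i, like exit_test
     }
     int status, worst = 0;
     while (wait(&status) > 0)        // reap every child
         if (WIFEXITED(status))
             worst = std::max(worst, WEXITSTATUS(status));
     std::printf("max exit status: %d\n", worst);   // prints 3 here
     return worst;                    // report the maximum, per the man page
}

With the aggregation done that way, running 7 ranks of exit_test should
always yield 6, which is not what the Cray produced.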


But the fine Cray folk have probably mucked with the process launcher,
so the results on the XC30 might not agree with the MPICH man page.

I also ran this test on a PC running Ubuntu 12.04 with MPICH 3.1.2
built from source.  Here are a few of those results.

Compile line: mpicxx exit_test.C -o exit_test

Note that I am varying the process count this time.

xxx at Cue ~/tmp/MPI $ mpiexec -np 9 ./exit_test
Rank:0, size:9

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 19062 RUNNING AT Cue
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
xxx at Cue ~/tmp/MPI $ echo $?
15
xxx at Cue ~/tmp/MPI $ mpiexec -np 8 ./exit_test
Rank:0, size:8

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 19130 RUNNING AT Cue
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
xxx at Cue ~/tmp/MPI $ echo $?
7
xxx at Cue ~/tmp/MPI $ mpiexec -np 7 ./exit_test
Rank:0, size:7

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 19142 RUNNING AT Cue
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
xxx at Cue ~/tmp/MPI $ echo $?
7
xxx at Cue ~/tmp/MPI $ mpiexec -np 3 ./exit_test
Rank:0, size:3

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 19158 RUNNING AT Cue
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
xxx at Cue ~/tmp/MPI $ echo $?
3

So unlike the test on the Cray, the results on the PC are consistent
from run to run, but they do not agree with the man page either.

Michael, to address your question: it is certainly implementation
dependent.  There are any number of ways to report the exit status.
The MPICH docs say to take the maximum value; another approach is to
use the exit status of the root process.
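
If you want behavior that does not depend on the launcher at all, one
option (my own sketch, assuming nothing beyond standard MPI, not an
MPICH recipe) is to have the ranks agree on a single status before
exiting, e.g. by reducing with MPI_MAX:

#include <mpi.h>
#include <cstdlib>

int main(int argc, char **argv) {
     MPI_Init(&argc, &argv);
     int rank;
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);

     // Pretend odd ranks failed; in real code this would be your error flag.
     int local_status = (rank % 2) ? EXIT_FAILURE : EXIT_SUCCESS;

     // Every rank learns the worst status seen anywhere in the job.
     int global_status;
     MPI_Allreduce(&local_status, &global_status, 1, MPI_INT, MPI_MAX,
                   MPI_COMM_WORLD);

     MPI_Finalize();
     return global_status;            // all ranks exit identically
}

Since every process returns the same value, whatever policy the
launcher applies (maximum, root process, or something else) gives the
same answer.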

As a side note, MPICH treats a non-zero return value as a bad
termination.  If I run the latter case with -np 1, I get

xxx at Cue ~/tmp/MPI $ mpiexec -np 1 ./exit_test
Rank:0, size:1
xxx at Cue ~/tmp/MPI $ echo $?
0

which is what I would expect. --Mike

On 04/09/2015 02:27 PM, Michael Raymond wrote:
>   Hi. I'm the lead developer of SGI MPI.
>
>   Do other MPIs do something different here? As you might have 
> thousands of ranks, I'm wondering how you'd decide which exit code to 
> return?
>
> On 04/09/2015 11:36 AM, Michael L. Stokes wrote:
>> This question is not MPICH specific, but I'm sure the expertise is here
>> to answer this question.
>>
>> While running tests on spirit.afrl.hpc.mil (SGI ICE X) using the
>> MPT/2.11 stack, I noticed that mpirun returns 0 to the shell regardless
>> of the exit value ( <stdlib.h> exit(int) ), or the return value
>> (return(int)) from the main.
>>
>> Would this behavior be regarded as an error?  What are the issues?
>>
>> --Mike
>
