[mpich-discuss] Random Aborts in An MPI Program

mark dimitsas.markos at gmail.com
Sat Jul 5 15:34:07 CDT 2014


Hello to all, and to whoever is from the United States, /Happy 4th of 
July/!


I am getting collective aborts while executing a program. More 
specifically, the error is:
rank 7 in job 38  Calliope_50667   caused collective abort of all ranks
  exit status of rank 7: killed by signal 11
[cli_2]: aborting job: Fatal error in MPI_Allgather: Error message texts are not available
[cli_4]: aborting job: Fatal error in MPI_Allgather: Error message texts are not available

But the /MPI_Allgather/ call is written correctly; I have checked it 
multiple times. And what does "/Error message texts are not 
available/" mean?

MPI_Allgather(&docs, 1, MPI_INT, texts_vectors, 1, MPI_INT, MPI_COMM_WORLD);

Here /docs/ is an int variable holding a single value on each node, and 
/texts_vectors/ is an int array with one element per node.

I compile the program using /mpicc -g -o prog prog.c -lm/ and execute 
it using /mpiexec -n number_of_nodes `pwd`/prog/.
I am using MPICH2 1.0.6 on a Linux cluster machine that I use for 
my bachelor thesis. I know it's an older version, but the 
machine belongs to an institute, so I don't have permission to 
upgrade it.