[mpich-discuss] Random Aborts in An MPI Program
balaji at anl.gov
Sat Jul 5 18:48:42 CDT 2014
Sorry, 1.0.6 is too old for us to help you here. It might be a bug that was resolved a long time ago.
You can install MPICH in your home directory; you don’t need root permissions to do so. Please see the README for instructions.
On Jul 5, 2014, at 3:34 PM, mark <dimitsas.markos at gmail.com> wrote:
> Hello to all, and to whoever is from the United States, Happy 4th of July!
> I am getting collective aborts while executing a program. Specifically, the error is:
> rank 7 in job 38 Calliope_50667 caused collective abort of all ranks
>   exit status of rank 7: killed by signal 11
> [cli_2]: aborting job: Fatal error in MPI_Allgather: Error message texts are not available
> [cli_4]: aborting job: Fatal error in MPI_Allgather: Error message texts are not available
> But the MPI_Allgather call is written correctly; I have checked it multiple times. And what does "Error message texts are not available" mean?
> MPI_Allgather(&docs, 1, MPI_INT, texts_vectors, 1, MPI_INT, MPI_COMM_WORLD);
> Here docs is an int variable holding a single value on each node, and texts_vectors is an int array whose size equals the number of nodes.
> I compile the program using mpicc -g -o prog prog.c -lm and execute it using mpiexec -n number_of_nodes `pwd`/prog
> I am using MPICH2 1.0.6 on a Linux cluster machine that I use for my bachelor thesis. I know it's an older version, but the machine belongs to an institute, so I don't have permission to upgrade it.