[mpich-discuss] Random Aborts in An MPI Program
Balaji, Pavan
balaji at anl.gov
Sat Jul 5 18:48:42 CDT 2014
Mark,
Sorry, 1.0.6 is too old for us to help you here. It might be a bug that was resolved a long time ago.
You can install MPICH in your home directory; you don’t need root permissions to do so. Please see the README for instructions.
Regards,
— Pavan
On Jul 5, 2014, at 3:34 PM, mark <dimitsas.markos at gmail.com> wrote:
> Hello to all, and to anyone from the United States, happy 4th of July!
>
>
> I am getting collective aborts while executing a program. More specifically, the error is:
> rank 7 in job 38 Calliope_50667 caused collective abort of all ranks
>   exit status of rank 7: killed by signal 11
> [cli_2]: aborting job:
> Fatal error in MPI_Allgather: Error message texts are not available
> [cli_4]: aborting job:
> Fatal error in MPI_Allgather: Error message texts are not available
>
> But the MPI_Allgather call is written correctly, as far as I can tell (I have checked it multiple times). And what does "Error message texts are not available" mean?
>
> MPI_Allgather(&docs, 1, MPI_INT, texts_vectors, 1, MPI_INT, MPI_COMM_WORLD);
>
> Here docs is an int variable holding a single value on each node, and texts_vectors is an int array with one element per node.
>
> I compile the program using mpicc -g -o prog prog.c -lm and execute it using mpiexec -n number_of_nodes `pwd`/prog
> I am using MPICH2 1.0.6 on a Linux cluster that I use for my bachelor's thesis. I know it's an older version, but the machine belongs to an institute, so I don't have permission to upgrade it.
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss