[mpich-discuss] Possible bug MPICH2 1.0.6
mark
dimitsas.markos at gmail.com
Wed Jul 2 15:04:28 CDT 2014
On 02/07/2014 11:02, mark wrote:
> Hello.
> In a program I am writing, I have an array whose size is equal to the
> number of nodes in my cluster.
> I created this array to keep track of the objects that each node
> controls. When I index it with the actual rank of a node
> (*array[node_rank]=objects;*), the program fails with an error and
> stops executing:
>
> *rank 0 in job 10 Calliope_49755 caused collective abort of all ranks*
> *exit status of rank 0: killed by signal 11*
>
> However, until now I have freely used the variable /id/ or /node_rank/
> to select a specific action for a specific node and never had
> problems.
>
> If, instead of using the node's rank as the index into the array, I use
> a constant integer, e.g. 5, the program runs fine, but even then, in
> 1 out of 7 executions, it fails with an error and stops. The operation
> that keeps producing the error is a simple subtraction
> (*array[id]--;*).
>
> Any ideas?
>
>
> PS. The array is one-dimensional and is allocated dynamically, like this:
> *int *array = malloc(processes * sizeof(int));*
I forgot to mention how I compiled and executed the program:
mpich -o prog prog.c -lm
mpiexec -n nodes `pwd`/prog
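
Signal 11 is a segmentation fault, which normally points to an invalid memory access in the program. The message does not include enough code to say what triggers it here, so the following is only a minimal sketch of how the two array operations mentioned above could be guarded. The helper names alloc_counters and checked_decrement are hypothetical and do not come from the original program; in an MPI program, processes would presumably come from MPI_Comm_size and the index from MPI_Comm_rank, which is also an assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one zero-initialised counter per process.
 * Returns NULL if the count is not positive or the allocation fails. */
int *alloc_counters(int processes)
{
    if (processes <= 0)
        return NULL;
    return calloc((size_t)processes, sizeof(int));
}

/* Hypothetical helper: decrement counters[index] only when the index is
 * inside [0, processes). Returns 0 on success and -1 otherwise, so an
 * out-of-range index is reported instead of corrupting memory. */
int checked_decrement(int *counters, int processes, int index)
{
    if (counters == NULL || index < 0 || index >= processes) {
        fprintf(stderr, "index %d out of range [0, %d)\n", index, processes);
        return -1;
    }
    counters[index]--;
    return 0;
}
```

With such a check in place, a constant index like 5 is rejected whenever the job is started with fewer than 6 processes, and a failed allocation is caught before the array is first used.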