[mpich-discuss] Possible bug MPICH2 1.0.6
dimitsas.markos at gmail.com
Wed Jul 2 15:02:17 CDT 2014
In a program I am writing, I have an array whose size is equal to the
number of nodes in my cluster.
I created this array to keep track of the objects that each node has under
its control. When I index it with the actual ranks of the nodes
( *array[node_rank]=objects;*), the program fails with an error and
stops executing:
*rank 0 in job 10 Calliope_49755 caused collective abort of all ranks*
*exit status of rank 0: killed by signal 11*
However, until now I have freely used the variable /id/ or /node_rank/ to
select a specific action for a specific node and never had problems.
If, instead of using the node's rank as the index into the array, I use a
constant integer such as 5, the program runs fine, but even then about 1 out
of 7 executions returns an error and stops. The operation that keeps
triggering the error is a simple decrement (*array[id]--;*).
PS. The array is one-dimensional and is allocated dynamically, like this:
*int *array = malloc(processes * sizeof(int));*
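
For reference, here is a minimal sketch of the pattern described above, written with bounds checks so that a bad index is reported instead of crashing the process. Signal 11 is SIGSEGV, which usually points to an invalid memory access, so checking the index and the pointer before each access is one way to narrow the problem down. The helper names make_counts and checked_decrement are hypothetical. The sketch assumes that processes is the value obtained from MPI_Comm_size and id the value from MPI_Comm_rank. The MPI calls themselves are left out so that it compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one counter per process and set each one
 * to `initial`. `processes` is assumed to be the value that
 * MPI_Comm_size() returned. Returns NULL if the size is not positive or
 * if malloc() fails. */
int *make_counts(int processes, int initial)
{
    if (processes <= 0)
        return NULL;

    int *counts = malloc((size_t)processes * sizeof *counts);
    if (counts == NULL)
        return NULL;

    /* malloc() does not initialise memory, so set every slot explicitly. */
    for (int i = 0; i < processes; i++)
        counts[i] = initial;

    return counts;
}

/* Hypothetical helper: decrement counts[id] only when id is a valid index.
 * `id` is assumed to be the value that MPI_Comm_rank() returned.
 * Returns 0 on success and -1 if the access would be out of bounds. */
int checked_decrement(int *counts, int processes, int id)
{
    if (counts == NULL || id < 0 || id >= processes) {
        fprintf(stderr, "bad index %d (array length %d)\n", id, processes);
        return -1;
    }
    counts[id]--;
    return 0;
}
```

If checked_decrement ever returns -1, then id or processes does not hold the expected value at that point in the program, or the array pointer was never set. If it never fails and the crash still occurs, the invalid access is probably somewhere else in the code.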