<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 02/07/2014 11:02 PM, mark
wrote:<br>
</div>
<blockquote cite="mid:53B46549.7080904@gmail.com" type="cite">
Hello.<br>
In a program I am writing, I have an array whose size is equal to
the number of nodes in my cluster. <br>
I made this array to keep track of the objects that each node
controls. When I index it with the actual rank of a node
( <b>array[node_rank]=objects;</b>), the program prints an error
and stops executing:<br>
<br>
<b>rank 0 in job 10 Calliope_49755 caused collective abort of
all ranks</b><b><br>
</b><b> exit status of rank 0: killed by signal 11 </b><br>
<br>
However, until now I have freely used the variable <i>id</i> or <i>node_rank</i>
to select a specific action for a specific node and
never had problems.<br>
<br>
If, instead of using the node's rank as the array index, I
use a constant integer, e.g. 5, the program usually runs fine, but even
then roughly 1 out of 7 executions stops with an error. The statement
that keeps triggering the error is a simple subtraction
(<b>array[id]--;</b>).<br>
<br>
Any ideas?<br>
<br>
<br>
PS. The array is one-dimensional and is allocated dynamically, like
this: <b>int *array = malloc(processes * sizeof(int));</b><br>
</blockquote>
I forgot to mention how I compiled and executed the program:<br>
mpich -o prog prog.c -lm<br>
mpiexec -n nodes `pwd`/prog<br>
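<br>
For reference, here is a minimal sketch of the pattern from my first message (allocate one <b>int</b> per process, then decrement the entry for a given rank), with a bounds check added. Signal 11 is a segmentation fault, so an index outside the allocated range is one possible cause, but that is only a guess. The helper names <i>make_counters</i> and <i>decrement_counter</i> are hypothetical and not from my program; in the real code <i>processes</i> and <i>id</i> would presumably be the values filled in by MPI_Comm_size and MPI_Comm_rank.<br>

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one zero-initialised counter per process.
 * Returns NULL if the count is not positive or the allocation fails. */
int *make_counters(int processes)
{
    if (processes <= 0)
        return NULL;
    return calloc((size_t)processes, sizeof(int));
}

/* Hypothetical helper: decrement counters[id] only when id is a valid
 * index, i.e. 0 <= id < processes. Returns 0 on success, -1 otherwise. */
int decrement_counter(int *counters, int processes, int id)
{
    if (counters == NULL || id < 0 || id >= processes) {
        fprintf(stderr, "index %d is outside [0, %d)\n", id, processes);
        return -1;
    }
    counters[id]--;
    return 0;
}
```

If the check ever reports a bad index, then <i>id</i> or <i>processes</i> does not hold the value I expect at that point (for example, it is read before it has been set).<br>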
</body>
</html>