[mpich-discuss] Help troubleshooting a session
mark
dimitsas.markos at gmail.com
Mon Jun 30 06:56:36 CDT 2014
On 30/06/2014 01:09 PM, mark wrote:
> Hello.
> I wrote a program in MPI. For some combinations of data size and number
> of nodes it runs fine, but for others it fails and returns errors
> (e.g. rank 0 in job 29 Calliope_50394 caused collective abort
> of all ranks exit status of rank 0: killed by signal 9). As I see it,
> the way the data is divided among the nodes is what makes the
> execution fail: with 16 nodes and, for example, a 320-line array as
> the data set, it executes fine, but with a 50-line array it fails.
> Is there a way to troubleshoot the code and find where it fails?
> Some examples would also be great.
>
>
> PS. I have written other MPI programs that worked. The only
> difference between them and this one, which keeps failing, is that
> this one uses loops like this:
>
> for(i=id*(n/p); i< (id+1)*n/p; i++){.....
>
> to split the data among the nodes, but again, for some combinations
> of data size and number of nodes it worked...
Sorry for double posting.