[mpich-discuss] Help troubleshooting a session

mark dimitsas.markos at gmail.com
Mon Jun 30 06:56:36 CDT 2014


On 30/06/2014 01:09 PM, mark wrote:
> Hello.
> I wrote an MPI program that runs fine for some combinations of input 
> size and number of nodes, but for others it fails with errors 
> (e.g. rank 0 in job 29 Calliope_50394 caused collective abort 
> of all ranks exit status of rank 0: killed by signal 9). As I see it, 
> the way the data is divided among the nodes is what makes the run 
> fail: with 16 nodes and, for example, a 320-line array as input, it 
> executes fine, but with a 50-line array it fails. Is there a way to 
> troubleshoot the code and find where it fails? Some examples would 
> also be great.
>
>
> PS. I have written other MPI programs that worked. The only 
> difference between them and this one, which keeps failing, is that 
> here I use loops like this:
>
> for(i=id*(n/p); i< (id+1)*n/p; i++){.....
>
> to split the data among the nodes, but again, for some combinations 
> of data size and number of nodes it works...
Sorry for double posting.
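
One thing that may be worth checking in the quoted loop: the lower bound divides first (id*(n/p)) while the upper bound multiplies first ((id+1)*n/p), so the two only agree when p divides n evenly (320/16 does, 50/16 does not). Below is a minimal sketch in plain C, with no MPI calls, that counts how many array indices are not visited exactly once under a given pair of bound formulas. The names post_low, post_high, block_low, block_high and miscovered are hypothetical helpers made up for this illustration, and whether the mismatch has anything to do with the "killed by signal 9" abort is only a guess.

```c
#include <assert.h>
#include <stdlib.h>

/* Bounds exactly as in the quoted loop: the lower bound divides first,
   the upper bound multiplies first. */
static long post_low(long id, long p, long n)  { return id * (n / p); }
static long post_high(long id, long p, long n) { return (id + 1) * n / p; }

/* Hypothetical consistent variant: both bounds use the same formula,
   so rank id stops exactly where rank id+1 starts. */
static long block_low(long id, long p, long n)  { return id * n / p; }
static long block_high(long id, long p, long n) { return (id + 1) * n / p; }

typedef long (*bound_fn)(long id, long p, long n);

/* Simulates p ranks, each looping over [low(id), high(id)), and returns
   how many problems were seen: out-of-range indices plus indices in
   0..n-1 that were not visited exactly once. Returns -1 if the
   bookkeeping array cannot be allocated. */
static long miscovered(bound_fn low, bound_fn high, long p, long n) {
    assert(p > 0 && n >= 0);
    int *hits = calloc((size_t)n + 1, sizeof *hits);
    if (hits == NULL) return -1;
    long bad = 0;
    for (long id = 0; id < p; id++) {
        for (long i = low(id, p, n); i < high(id, p, n); i++) {
            if (i >= 0 && i < n) hits[i]++;
            else bad++;              /* index outside the array */
        }
    }
    for (long i = 0; i < n; i++) {
        if (hits[i] != 1) bad++;     /* skipped or visited twice */
    }
    free(hits);
    return bad;
}
```

Calling miscovered(post_low, post_high, 16, 320) and miscovered(post_low, post_high, 16, 50) compares the two cases from the post. In the real program, id and p would come from MPI_Comm_rank and MPI_Comm_size, and printing each rank's low and high bound before the loop is a cheap first troubleshooting step.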


