[mpich-discuss] Help troubleshooting a session

"Antonio J. Peña" apenya at mcs.anl.gov
Mon Jun 30 10:42:42 CDT 2014


Hi Mark,

That's not enough information for us to determine whether your code is 
doing something inappropriate or you're hitting a bug in MPICH. We usually 
ask reporters to submit the smallest piece of code with which they can 
reproduce the problem. From your description, though, I'd say there could 
be something wrong in your code, so I'd suggest first asking for 
assistance somewhere like Stack Overflow to rule this out before you file 
a possible bug report with us.

Thanks,
Antonio


On 06/30/2014 06:56 AM, mark wrote:
> On 30/06/2014 01:09 PM, mark wrote:
>> Hello.
>> I wrote a program in MPI, and for some combinations of data and number
>> of nodes it runs OK, but other times it fails to run and returns
>> errors (e.g. rank 0 in job 29 Calliope_50394 caused collective abort
>> of all ranks exit status of rank 0: killed by signal 9). As I see it,
>> for some reason the division of the data among the nodes makes the
>> execution fail: if I use 16 nodes with, for example, a 320-line
>> array as the data set, it executes fine, but if I use a 50-line
>> array the execution fails. Is there a way to troubleshoot the code
>> and find where it fails? Some examples would also be great.
>>
>>
>> PS. I have written other MPI programs that worked; the only
>> difference between them and this one, which keeps failing, is that in
>> this one I use loops like this:
>>
>> for(i=id*(n/p); i< (id+1)*n/p; i++){.....
>>
>> to parse the data accordingly, but again for some combinations of data
>> and number of nodes it worked...
> Sorry for double posting.
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss


-- 
Antonio J. Peña
Postdoctoral Appointee
Mathematics and Computer Science Division
Argonne National Laboratory
9700 South Cass Avenue, Bldg. 240, Of. 3148
Argonne, IL 60439-4847
apenya at mcs.anl.gov
www.mcs.anl.gov/~apenya



