[mpich-discuss] process failing...
Ron Palmer
ron.palmer at pgcgroup.com.au
Sat May 24 02:00:35 CDT 2014
Antonio, Rajeev and others,
Thanks for your replies and comments on possible causes of the error
messages and the failure. I have passed them on to the programmers of the
underlying application. I must admit I do not understand what unexpected
messages are (I am but a mere user). Could you perhaps give examples of
typical causes? For example, the cluster it runs on consists of 3 dual
Xeon computers with varying CPU clock ratings. Could these error
messages be due to the processes getting out of sync, expecting results
but not getting them from the slower computer? I have restarted the
process but excluded the slowest computer (2.27 GHz; the other two run at
2.87 and 3.2 GHz), as I was running out of ideas.
For your information, this runs well on smaller problems (few computations).
Thanks,
Ron
On 24/05/2014 3:10 AM, Rajeev Thakur wrote:
> Yes. The message below says some process has received 261,895 messages
> for which no matching receives have been posted yet.
>
>
>
> Rajeev
>
>
>> It looks like at least one of your processes is receiving too
>> many unexpected messages, causing it to run out of
>> memory. Unexpected messages are those that do not match a posted receive
>> on the receiver side. You may want to ask the application developers
>> to review the algorithm and look for any possible bug.
>>
>> Antonio
>
>
>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
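
To illustrate what "unexpected messages" means in the replies quoted above, here is a toy Python model. It is only a sketch and not MPICH's actual implementation: the class ToyReceiver, the function simulate, and the window parameter are hypothetical names made up for this example. It assumes a sender that produces messages faster than the receiver posts matching receives, which is one way such a queue could grow (for instance when one node is slower than the others).

```python
class ToyReceiver:
    """Toy model of one rank's matching queues (hypothetical, not MPICH code)."""

    def __init__(self):
        self.posted = []      # receives posted but not yet matched: (source, tag)
        self.unexpected = []  # messages that arrived before a matching receive

    def post_recv(self, source, tag):
        # A new receive first checks messages that have already arrived.
        for i, (src, t, payload) in enumerate(self.unexpected):
            if src == source and t == tag:
                del self.unexpected[i]
                return payload
        # Nothing has arrived yet: remember the receive for later matching.
        self.posted.append((source, tag))
        return None

    def deliver(self, source, tag, payload):
        # An arriving message first checks the posted receives.
        for i, (src, t) in enumerate(self.posted):
            if src == source and t == tag:
                del self.posted[i]
                return True
        # No receive is waiting: the message is "unexpected" and must be
        # buffered in the receiver's memory until a receive is posted.
        self.unexpected.append((source, tag, payload))
        return False


def simulate(steps, sends_per_step, recvs_per_step, window=None):
    """Return (peak, final) length of the unexpected queue.

    window is an assumed, simplified form of flow control: the sender
    stops sending while that many messages are still unmatched.
    """
    rx = ToyReceiver()
    peak = 0
    for step in range(steps):
        for _ in range(sends_per_step):
            if window is not None and len(rx.unexpected) >= window:
                break  # sender waits instead of flooding the receiver
            rx.deliver(source=0, tag=0, payload=step)
            peak = max(peak, len(rx.unexpected))
        for _ in range(recvs_per_step):
            rx.post_recv(source=0, tag=0)
    return peak, len(rx.unexpected)


if __name__ == "__main__":
    print("no flow control:  ", simulate(1000, 3, 1))
    print("with window of 4: ", simulate(1000, 3, 1, window=4))
```

In this model the unbounded run accumulates about two buffered messages per step, while the windowed run stays at or below the window size. In a real MPI program a comparable effect might be obtained with synchronous sends (MPI_Ssend) or explicit acknowledgments, but whether that fits the application in question is for its developers to judge.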