[mpich-discuss] Mpich + Siesta erro

Julio Henrique juliohenrique at msn.com
Fri Dec 6 19:17:05 CST 2013


OK, Jeff. Thank you.
Julio.

CC: discuss at mpich.org
From: jeff.science at gmail.com
Date: Fri, 6 Dec 2013 19:09:36 -0600
To: discuss at mpich.org
Subject: Re: [mpich-discuss] Mpich + Siesta erro

I bet you (1) Siesta doesn't check MPI return codes and (2) Siesta has no way to handle node failure. I bet it can't even handle malloc returning NULL. 
If a node fails more than once a month, the hardware is bad and you should buy new stuff. 
Jeff

On Dec 6, 2013, at 7:04 PM, Julio Henrique <juliohenrique at msn.com> wrote:

Okay, Pavan. I'll try that and report back with the result.
Thanks.
Julio.

> From: balaji at mcs.anl.gov
> Date: Fri, 6 Dec 2013 19:01:08 -0600
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Mpich + Siesta erro
> 
> Hi Julio,
> 
> There are two steps needed for this:
> 
> 1. You need to tell your MPI application to return errors instead of aborting.
> 
> 2. Tell the process manager to not clean up your remaining processes when one of the processes dies.
> 
> Details on both of these steps are listed in the "Fault Tolerance" section of the MPICH README. Please try it out and let us know how it goes.
> 
>   — Pavan
> 
> On Dec 6, 2013, at 6:54 PM, Julio Henrique <juliohenrique at msn.com> wrote:
> 
> > 
> > I am using mpich-3.0.4 on a cluster with 7 nodes, running the latest version of Siesta. My problem is that when one node goes down, Siesta and MPICH stop running and give an error.
> > How can I make Siesta and MPICH keep running when a node goes down?
> > Thanks.
> > Julio.
> >  
> >  
> >  
> > _______________________________________________
> > discuss mailing list     discuss at mpich.org
> > To manage subscription options or unsubscribe:
> > https://lists.mpich.org/mailman/listinfo/discuss
> 
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
> 