[mpich-discuss] Mpich + Siesta error

Pavan Balaji balaji at mcs.anl.gov
Fri Dec 6 19:01:08 CST 2013


Hi Julio,

There are two steps needed for this:

1. You need to tell your MPI application to return errors instead of aborting.

2. Tell the process manager to not clean up your remaining processes when one of the processes dies.

Details on both of these steps are given in the "Fault Tolerance" section of the MPICH README.  Please try it out and let us know how it goes.

  — Pavan

On Dec 6, 2013, at 6:54 PM, Julio Henrique <juliohenrique at msn.com> wrote:

> 
> I am using mpich-3.0.4 on a cluster with 7 nodes, running the latest version of Siesta. My problem is that when one node goes down, Siesta and MPICH stop running and give an error.
> How can I get Siesta and MPICH to continue running when a node goes down?
> Thanks.
> Julio.

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji



