[mpich-discuss] Fault tolerance after MPI_Comm_connect/accept
Pavan Balaji
balaji at mcs.anl.gov
Tue Mar 5 10:43:06 CST 2013
On 03/05/2013 10:34 AM US Central Time, Jim Dinan wrote:
> You should ignore my comments on MPICH FT. My info is clearly
> out-of-date. It sounds like what you're looking for should be fully
> supported. :)
Well, almost :-).
Some things could not be done cleanly while staying within MPI-3. For
example, a wildcard receive (MPI_ANY_SOURCE) will always return an error
if any process in the communicator is dead. The MPI Forum is working on
fixing this by allowing the user to "opt in" to wildcard receives that
tolerate dead processes.
I'm a few months behind on the MPI-3.1 fault-tolerance proposal, but I
believe this opt-in is still part of it.
-- Pavan
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji