[mpich-discuss] Specifying a timeout for MPI_Comm_Accept

Matthieu Dorier matthieu.dorier at irisa.fr
Mon Sep 8 03:17:42 CDT 2014


Hi Pavan,

With all the work on in situ analysis/visualization, non-blocking versions of MPI_Comm_accept/connect could become very useful in the future to easily couple simulations and visualization codes.

Here is an example: the VisIt software supports in situ visualization, but right now the connection between VisIt and the simulation is made through a plain socket. The simulation periodically calls VisItDetectInput to check (in a non-blocking manner) whether VisIt has connected to it.

I recently had to use VisIt with a master-worker type simulation, that is, one where the master calls MPI_Recv(MPI_ANY_SOURCE...) to wait for workers to finish pieces of work. When using VisIt, this simple MPI_Recv becomes a non-blocking receive inside an active loop that alternates between MPI_Testany and VisItDetectInput. If VisIt could leverage non-blocking accept/connect, this active loop could be replaced with a single MPI_Waitany on a set of MPI_Requests, one of them being the VisIt connection (the result of a hypothetical MPI_Comm_iaccept) and the others being workers finishing some work.

I'm not a VisIt developer, though. This is just a thought.

Matthieu Dorier 
PhD student at ENS Rennes 
http://people.irisa.fr/Matthieu.Dorier 
----- Original Message -----
> From: "Pavan Balaji" <balaji at anl.gov>
> To: discuss at mpich.org
> Sent: Sunday, September 7, 2014 00:53:58
> Subject: Re: [mpich-discuss] Specifying a timeout for MPI_Comm_Accept
> 
> Hirak,
> 
> The function definitions for MPI_Comm_accept, etc., are a part of the MPI
> standard.  We cannot change them in MPICH without changing them in the MPI
> standard first.  Changes to the MPI standard go through the MPI Forum, and
> through a formal proposal and voting process before they get in.
> 
> FWIW, both a timeout model and a nonblocking connect/accept have been
> proposed in the past, but they were both voted down.  The nonblocking
> connect/accept proposal was originally done by Josh Hursey
> (http://www.cs.uwlax.edu/~jjhursey/), and I’m planning to revive the ticket,
> asking more broadly for nonblocking variants of many other operations
> as well.  It’s unclear if/when this will get in, but we can try.
> 
> The timeout proposal was put together by Jeff Squyres @ Cisco and Fab Tillier
> @ Microsoft (they are both on this list).  I personally thought it was a
> very elegant proposal, but it was voted down because there was no use case
> for it at the time, particularly given that there was no standardized fault
> model in MPI.  Once the Fault Tolerance working group gets its proposal in,
> there might be room to revisit this.  But you’ll need to talk to the above
> mentioned guys to see if they are planning to revive it.
> 
> Hope that helps.
> 
>   — Pavan
> 
> On Sep 6, 2014, at 12:35 PM, Roy, Hirak <Hirak_Roy at mentor.com> wrote:
> 
> > Hi,
> > 
> > The thread at the end of my email, shows that there is no way we can
> > specify a timeout in MPI_Comm_accept/connect.
> > Since the thread is pretty old (2007), I would like to know if there is any
> > development related to this or not?
> > 
> > If we still can not specify a timeout, is there any provision of
> > non-blocking accept/connect ?
> > 
> > Thanks,
> > Hirak
> > 
> > 
> > https://lists.mcs.anl.gov/mailman/htdig/mpich-discuss/2007-April/002159.html
> > _______________________________________________
> > discuss mailing list     discuss at mpich.org
> > To manage subscription options or unsubscribe:
> > https://lists.mpich.org/mailman/listinfo/discuss
> 
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
> 


