[mpich-discuss] Getting an error in MPI_Comm_accept - ideas?

Alexander Rast alex.rast.technical at gmail.com
Thu Oct 25 18:02:54 CDT 2018


It has taken a while to condense this down to something small enough,
because the whole thing is embedded within a large and reasonably complex
software infrastructure, parts of which were not trivial to strip away.

I have tested a few further cases, including using an
MPI_Improbe/MPI_Imrecv combination in the MPISpinner(), without success.
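For reference, the nonblocking matched-probe pattern I tried looks roughly
like this (a simplified sketch, not the actual MPISpinner() code; the
dispatch step is elided):

```cpp
#include <mpi.h>
#include <vector>

// Sketch of one pass of an MPI_Improbe/MPI_Imrecv polling loop.
// Function name and byte-oriented handling are illustrative only.
void spin_once(MPI_Comm comm)
{
    int flag = 0;
    MPI_Message msg;
    MPI_Status status;
    // Nonblocking probe: matches (and dequeues) a pending message, if any.
    MPI_Improbe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm, &flag, &msg, &status);
    if (!flag) return;                 // nothing pending on this pass

    int count;
    MPI_Get_count(&status, MPI_BYTE, &count);
    std::vector<char> buf(count);      // size the buffer to the actual message

    MPI_Request req;
    // Receive the matched message; no other thread can intercept it.
    MPI_Imrecv(buf.data(), count, MPI_BYTE, &msg, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    // ... dispatch buf to the appropriate handler here ...
}
```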
The code I am sending reliably crashes on the final MPI_Comm_connect,
failing within MPI_Comm_accept with an error message essentially the same
as the one in my previous message. It seems clear the bug is related to
some sort of message race: depending upon the exact sequence of events
(within the actual application there are numerous possible scenarios
before the final MPI_Comm_connect), the application may exit cleanly, give
an error similar to the above, or give a variety of other, different
errors. It's timing/state dependent in some non-obvious way.
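For context, the shutdown idiom in question (from "Using Advanced MPI",
section 6.5) has the accept thread block in MPI_Comm_accept, and at exit
the universe connects to its own port to release it. A stripped-down,
single-process sketch of the idiom follows - this is NOT the attached
code: error handling is omitted, and it uses MPI_COMM_SELF where the real
application uses MPI_COMM_WORLD. It assumes an MPI library providing
MPI_THREAD_MULTIPLE:

```cpp
#include <mpi.h>
#include <pthread.h>
#include <cstdio>

static char port[MPI_MAX_PORT_NAME];  // opened below with MPI_Open_port

void* accept_thread(void*)
{
    MPI_Comm newcomm;
    // Blocks until *some* universe connects -- possibly our own at shutdown.
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &newcomm);
    MPI_Comm_disconnect(&newcomm);
    return nullptr;
}

int main(int argc, char** argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        printf("warning: MPI_THREAD_MULTIPLE not available\n");

    MPI_Open_port(MPI_INFO_NULL, port);
    pthread_t tid;
    pthread_create(&tid, nullptr, accept_thread, nullptr);

    // ... application runs; an external universe may never connect ...

    // Shutdown: unblock the pending accept by connecting to ourselves.
    MPI_Comm dcomm;
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &dcomm);
    MPI_Comm_disconnect(&dcomm);

    pthread_join(tid, nullptr);
    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}
```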

I'm running MPICH 3.2.1 with, in this case, a fairly standard set of
options:

--build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include
--mandir=${prefix}/share/man --infodir=${prefix}/share/info
--sysconfdir=/etc --localstatedir=/var --disable-silent-rules
--libdir=${prefix}/lib/x86_64-linux-gnu
--libexecdir=${prefix}/lib/x86_64-linux-gnu --disable-maintainer-mode
--disable-dependency-tracking --enable-shared --prefix=/usr
--enable-fortran=all --disable-rpath --disable-wrapper-rpath
--sysconfdir=/etc/mpich --libdir=/usr/lib/x86_64-linux-gnu
--includedir=/usr/include/mpich --docdir=/usr/share/doc/mpich
--with-hwloc-prefix=system --enable-checkpointing
--with-hydra-ckpointlib=blcr CPPFLAGS= CFLAGS= CXXFLAGS= FFLAGS= FCFLAGS=

I also have a version built with the performance-monitoring features
enabled that I use from time to time; it displays similar behaviour,
although the exact output changes (presumably because differences in the
performance-monitoring code change the execution timing or sequence).
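As an aside on the attached code: the Accept thread's spin-wait on Tcomm
(`while (*(parent->Tcomm.load(...)) != MPI_COMM_NULL);`) could equally be a
condition-variable handoff, which blocks without burning a core. A minimal
self-contained sketch of that alternative (the class and member names here
are illustrative, not the attached code):

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the Tcomm handshake: the accept thread waits,
// the main thread's Connect() signals when the new comm is established.
struct CommSlot {
    std::mutex m;
    std::condition_variable cv;
    bool connected = false;   // set true once Connect() has finished

    // Called by the main thread when the connection is complete.
    void notify_connected() {
        { std::lock_guard<std::mutex> lk(m); connected = true; }
        cv.notify_all();
    }
    // Called by the accept thread; blocks without spinning, and the
    // predicate form is safe even if the notify happens first.
    void wait_connected() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this]{ return connected; });
    }
};
```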

On Thu, Sep 27, 2018 at 2:39 PM Amer, Abdelhalim <aamer at anl.gov> wrote:

> Hi,
>
> Can you pack the smallest test example into a single file so that we can
> compile and run it? Also please give us more information about the MPICH
> version you are using and how it was built (run the `mpichversion`
> binary to get this information). It would be best to upgrade to the
> latest version (3.2.1) before reporting back. All of this is to help us
> reproduce your problem, or simply to solve it by upgrading to a newer
> version of MPICH.
>
> Halim
> www.mcs.anl.gov/~aamer
>
> On 9/21/18 10:39 AM, Alexander Rast wrote:
> > All,
> >
> > I'm running an MPI_Comm_accept in a separate thread whose purpose is to
> > allow connections from other (hitherto unknown) MPI universes. A
> > well-known issue with such configurations is that, because
> > MPI_Comm_accept is blocking, a connection MUST be made in order for the
> > application to exit, even if no other universe ever attempted to
> > connect. In this situation (see Gropp, et al. "Using Advanced MPI",
> > section 6.5) the 'customary' solution seems to be to have the local
> > universe connect to itself and then shut down.
> >
> > However, I'm getting the following error on exit:
> >
> > Fatal error in PMPI_Comm_accept: Message truncated, error stack:
> > PMPI_Comm_accept(129).............:
> > MPI_Comm_accept(port="tag#0$description#aesop.cl.cam.ac.uk
> > <http://aesop.cl.cam.ac.uk>$port#43337$ifname#128.232.98.176$",
> > MPI_INFO_NULL, root=6, MPI_COMM_WORLD, newcomm=0x1473f24) failed
> > MPID_Comm_accept(153).............:
> > MPIDI_Comm_accept(1005)...........:
> > MPIR_Bcast_intra(1249)............:
> > MPIR_SMP_Bcast(1088)..............:
> > MPIR_Bcast_binomial(239)..........:
> > MPIDI_CH3U_Receive_data_found(131): Message from rank 0 and tag 2
> > truncated; 260 bytes received but buffer size is 12
> >
> > You typically get several of these messages; the number seems to vary
> > from trial to trial. I'm guessing they come from different MPI processes
> > (although there is no fixed relationship between the number of processes
> > started and the number of error messages).
> >
> > Does anyone have any suggestions on what types of problem might cause
> > this error? I'm not expecting you to identify and debug the problem
> > specifically (unless, perhaps, this error is indicative of some
> > particular mistake); I would just like some hints on where to look.
> >
> > If it helps, here are the three main functions involved - there is, of
> > course, a lot more going on in the application besides these, but the
> > error occurs at shutdown, and by that point the rest of the application
> > is quiescent. Also note that the Connect routine is NOT called at
> > shutdown; it's invoked when a 'real' universe wants to connect.
> >
> >
> //------------------------------------------------------------------------------
> >
> > void* CommonBase::Accept(void* par)
> > /* Blocking routine to connect to another MPI universe by publishing a
> port.
> >    This operates in a separate thread to avoid blocking the whole
> process.
> > */
> > {
> > CommonBase* parent=static_cast<CommonBase*>(par);
> >
> > while (parent->AcceptConns.load(std::memory_order_relaxed))
> > {
> > // run the blocking accept itself.
> > if
> >
> (MPI_Comm_accept(parent->MPIPort.load(std::memory_order_seq_cst),MPI_INFO_NULL,parent->Lrank.load(std::memory_order_relaxed),MPI_COMM_WORLD,parent->Tcomm.load(std::memory_order_seq_cst)))
> > {
> >    printf("Error: attempt to connect to another MPI universe failed\n");
> >    parent->AcceptConns.store(false,std::memory_order_relaxed);
> >    break;
> > }
> > // Now trigger the Connect process in the main thread to complete the
> setup
> > PMsg_p Creq;
> > string N("");              // zero-length string indicates a server-side
> > connection
> > Creq.Put(0,&N);
> > Creq.Key(Q::SYST,Q::CONN);
> > Creq.Src(0);
> > Creq.comm = MPI_COMM_SELF;
> > Creq.Send(0);
> > while (*(parent->Tcomm.load(std::memory_order_seq_cst)) !=
> > MPI_COMM_NULL); // block until connect has succeeded
> > }
> > pthread_exit(par);
> > return par;
> > }
> >
> >
> //------------------------------------------------------------------------------
> >
> > unsigned CommonBase::Connect(string svc)
> > // connects this process' MPI universe to a remote universe that has
> > published
> > // a name to access it by.
> > {
> > int error = MPI_SUCCESS;
> > // a server has its port already so can just open a comm
> > if (svc=="") Comms.push_back(*Tcomm.load(std::memory_order_seq_cst));
> > else // clients need to look up the service name
> > {
> >    MPI_Comm newcomm;
> >    char port[MPI_MAX_PORT_NAME];
> >    // Get the published port for the service name asked for.
> >    // Exit if we don't get a port, probably because the remote universe
> > isn't
> >    // initialised yet (we can always retry).
> >    if (error = MPI_Lookup_name(svc.c_str(),MPI_INFO_NULL,port)) return
> > error;
> >    // now try to establish the connection itself. Again, we can always
> > retry.
> >    if (error =
> > MPI_Comm_connect(port,MPI_INFO_NULL,0,MPI_COMM_WORLD,&newcomm)) return
> > error;
> >    Comms.push_back(newcomm); // as long as we succeeded, add to the list
> > of comms
> > }
> > int rUsize;
> > MPI_Comm_remote_size(Comms.back(), &rUsize);
> > Usize.push_back(rUsize);       // record the size of the remote universe
> > FnMapx.push_back(new FnMap_t); // give the new comm some function tables
> > to use
> > pPmap.push_back(new ProcMap(this));  // and a new processor map for the
> > remote group
> > PMsg_p prMsg;
> > SendPMap(Comms.back(), &prMsg);        // Send our process data to the
> > remote group
> > int fIdx=FnMapx.size()-1;
> > // populate the new function table with the global functions
> > (*FnMapx[fIdx])[Msg_p::KEY(Q::EXIT                )] =
> &CommonBase::OnExit;
> > (*FnMapx[fIdx])[Msg_p::KEY(Q::PMAP                )] =
> &CommonBase::OnPmap;
> > (*FnMapx[fIdx])[Msg_p::KEY(Q::SYST,Q::PING,Q::ACK )] =
> > &CommonBase::OnSystPingAck;
> > (*FnMapx[fIdx])[Msg_p::KEY(Q::SYST,Q::PING,Q::REQ )] =
> > &CommonBase::OnSystPingReq;
> > (*FnMapx[fIdx])[Msg_p::KEY(Q::TEST,Q::FLOO        )] =
> > &CommonBase::OnTestFloo;
> > if (svc=="") *Tcomm.load(std::memory_order_seq_cst) = MPI_COMM_NULL;  //
> > release any Accept comm.
> > return error;
> > }
> >
> >
> //------------------------------------------------------------------------------
> >
> > unsigned CommonBase::OnExit(PMsg_p * Z,unsigned cIdx)
> > // Do not post anything further here - the LogServer may have already
> gone
> > {
> > AcceptConns.store(false,std::memory_order_relaxed); // stop accepting
> > connections
> > if (acpt_running)
> > {
> >    printf("(%s)::CommonBase::OnExit closing down Accept MPI
> > request\n",Sderived.c_str());
> >    fflush(stdout);
> >    // have to close the Accept thread via a matching MPI_Comm_connect
> >    // because the MPI interface has made no provision for a nonblocking
> > accept, thus
> >    // otherwise the Accept thread will block forever waiting for a
> > message that will
> >    // never come because we are shutting down. See Gropp, et al. "Using
> > Advanced MPI"
> >    MPI_Comm dcomm;
> >
> >
> MPI_Comm_connect(MPIPort.load(std::memory_order_seq_cst),MPI_INFO_NULL,Lrank.load(std::memory_order_relaxed),MPI_COMM_WORLD,&dcomm);
> >    pthread_join(MPI_accept,NULL);
> >    acpt_running = false;
> > }
> > if (Urank == Lrank.load(std::memory_order_relaxed))
> > {
> >
> >
> MPI_Unpublish_name(MPISvc,MPI_INFO_NULL,MPIPort.load(std::memory_order_seq_cst));
> >    MPI_Close_port(MPIPort.load(std::memory_order_seq_cst));
> > }
> > return 1;
> > }
> >
> >
> >
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MinimalTCollectiveTest.zip
Type: application/zip
Size: 22424 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20181026/f5cace8e/attachment.zip>