[mpich-discuss] Does MPICH implement a polling mechanism in MPI_Recv?

Wesley Bland wbland at anl.gov
Mon Sep 22 08:46:29 CDT 2014


Which version of MPICH are you using?

Which fault tolerance features are you using? Fault tolerance is currently undergoing some changes and has a different feature set than it used to.

AFAIK, neither the old nor the new version of FT has been tested with ch3:sock. It’s possible that it will work, but FT is still a very experimental feature and hasn’t been widely tested.
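
For reference, the sock channel is selected when MPICH itself is configured and built, along the lines of:

    ./configure --with-device=ch3:sock --prefix=/path/to/install
    make && make install

(the --prefix path and any other configure options here are placeholders; adjust them for your installation).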

If you want to reduce the polling, you can use a non-blocking receive (MPI_Irecv) to post the receive and then poll for completion yourself periodically with MPI_Test. This gives your application an opportunity to do something else (or sleep) while waiting for the receives to complete.
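
For example, here is a minimal sketch of that pattern (the message contents, tag, sleep interval, and rank roles are arbitrary placeholders, not anything your application requires):

    #include <mpi.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            /* Master: send one message to each slave. */
            int msg = 42;
            for (int dst = 1; dst < size; dst++)
                MPI_Send(&msg, 1, MPI_INT, dst, 0, MPI_COMM_WORLD);
        } else {
            /* Slave: post the receive without blocking. */
            int buf, done = 0;
            MPI_Request req;
            MPI_Irecv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);

            /* Check for completion periodically, sleeping in between
             * so this process does not spin on a CPU core. */
            while (!done) {
                MPI_Test(&req, &done, MPI_STATUS_IGNORE);
                if (!done)
                    usleep(10000); /* 10 ms; tune to your latency needs */
            }
        }

        MPI_Finalize();
        return 0;
    }

The usleep bounds how often you enter the MPI progress engine, so you are trading message latency for CPU time.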

Thanks,
Wesley

> On Sep 22, 2014, at 8:30 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
> 
> Dear MPICH.
> I have a problem with the MPICH polling mechanism.
> I'm working on a cluster, with 2-4 processes on each computer (I can't run a single process per computer because of application requirements).
> My system has 2 states:
> Ready - slaves listen to the master (but no data flows)
> Run - the master starts communication, and data flows
> When the system is in the Ready state (all processes except the master have posted MPI_Recv requests on the master) but the master is not yet sending data, I see CPU usage > 100% (more than 1 core used) per process. When 4 processes are in the Ready state (waiting for data), the computer begins to slow down other processes, I think because of the polling.
> I tried to build MPICH with --with-device=ch3:sock; then I get 0% CPU usage in the Ready state, but I have a problem with the fault tolerance feature.
> My questions are:
> 1) Is it expected that a build with --with-device=ch3:sock makes fault tolerance stop working? Is fault tolerance based on the polling mechanism?
> 2) Can I change the polling rate to reduce the CPU load? I understand the penalty is a slower transfer rate.
> 3) Can I use any other MPI API to check whether a message from the master has arrived without activating the polling mechanism?
> 
> Regards,
> Anatoly. 
> 
> 
> On Thu, May 8, 2014 at 3:57 PM, Balaji, Pavan <balaji at anl.gov> wrote:
> 
> This is expected.  Currently, the only way to not have MPICH poll is to configure with --with-device=ch3:sock.  Please note that this can cause performance loss (the polling is helpful for performance in the common case).
> 
> We are planning to allow this in the default build as well in the future.
> 
>   — Pavan
> 
> On May 8, 2014, at 7:54 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
> 
> > Dear MPICH forum.
> > I created an endless MPI program.
> > In this program each process calls MPI_Recv from another process, without any matching MPI_Send.
> > When I execute this program, I see each process take ~100% of a CPU core.
> > Is this behavior (polling, I suppose) normal?
> > Can I reduce the MPI_Recv CPU penalty?
> >
> > Regards,
> > Anatoly.
> > <mpi_polling.cpp>
> 

_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

