[mpich-discuss] Does MPICH implements polling mechanism on MPI_Recv?

Anatoly G anatolyrishon at gmail.com
Mon Sep 22 08:30:59 CDT 2014


Dear MPICH,
I have a problem with the MPICH polling mechanism.
I'm working on a cluster with 2-4 processes on each computer (I can't
run a single process per computer because of application requirements).
My system has two states:
Ready - slaves listen to the master (but no data flows)
Run - the master starts communication, and data flows.
When the system is in the ready state (every process except the master has
posted an MPI_Recv on the master) and the master is not yet sending data,
I see CPU usage > 100% (more than one core used) per process. When 4
processes on a computer are in the ready state (waiting for data), they
begin to slow down the other processes on that computer, I think because
of polling. A reduced sketch of this state follows.
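To illustrate, here is a minimal sketch of the ready-state pattern (not my
real application, just the shape of it): rank 0 plays the master and never
sends, so every other rank sits in a blocking MPI_Recv.

// ready_state.cpp - reduced sketch of the "ready" state described above
// (not the real application; rank 0 acts as the master and never sends,
// so every other rank waits inside a blocking MPI_Recv).
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank != 0) {
        int msg;
        // Blocks forever: with the default device this call busy-polls,
        // and the process shows ~100% of a core while "idle".
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    // The master (rank 0) does nothing in the ready state.

    MPI_Finalize();
    return 0;
}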
I tried building MPICH with --with-device=ch3:sock; with that build I get
0% CPU usage in the ready state, but then the fault-tolerance feature
stops working.
My questions are:
1) Is it expected that a build with --with-device=ch3:sock breaks fault
tolerance? Is fault tolerance based on the polling mechanism?
2) Can I change the polling rate to reduce the CPU load? I understand the
penalty is a slower transfer rate.
3) Can I use any other MPI API to check whether a message from the master
has arrived without triggering the busy-polling (e.g. something like the
sketch below)?
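For question 3, something like the following is what I have in mind (just
a sketch, not working code from my application; the 10 ms sleep is an
arbitrary placeholder, and I realize MPI_Iprobe itself still enters the
progress engine on each call):

// recv_with_backoff.cpp - hypothetical alternative to a blocking MPI_Recv:
// poll for the master's message with MPI_Iprobe and sleep between checks
// so the process does not burn a full core while the system is "ready".
#include <mpi.h>
#include <unistd.h>   // usleep

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank != 0) {
        int flag = 0;
        MPI_Status status;
        // Check for an incoming message from the master (rank 0);
        // between checks yield the CPU for ~10 ms (placeholder value).
        while (!flag) {
            MPI_Iprobe(0, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &status);
            if (!flag)
                usleep(10 * 1000);
        }
        int msg;
        MPI_Recv(&msg, 1, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}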

Regards,
Anatoly.


On Thu, May 8, 2014 at 3:57 PM, Balaji, Pavan <balaji at anl.gov> wrote:

>
> This is expected.  Currently, the only way to not have MPICH poll is to
> configure with --with-device=ch3:sock.  Please note that this can cause
> performance loss (the polling is helpful for performance in the common
> case).
>
> We are planning to allow this in the default build as well in the future.
>
>   — Pavan
>
> On May 8, 2014, at 7:54 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
>
> > Dear MPICH forum.
> > I created an MPI program that runs endlessly.
> > In this program each process calls MPI_Recv from another process,
> > without any MPI_Send.
> > When I execute this program I see each process take ~100% of a CPU core.
> > Is this behavior (I suppose it is polling) normal?
> > Can I reduce the MPI_Recv CPU penalty?
> >
> > Regards,
> > Anatoly.
> > <mpi_polling.cpp>
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

