[mpich-discuss] Does MPICH implement a polling mechanism in MPI_Recv?

Anatoly G anatolyrishon at gmail.com
Mon Sep 22 23:42:46 CDT 2014


Thank you very much.

I'm using MPICH 3.1.

The application's purpose is to continue execution as long as at least the
Master process is alive. The assumption is that processes can fail, but no
more than a single failure at a time. In this case the surviving processes
must continue "like a Terminator". (-:
At run time I can't execute MPI_Test "manually" on all requests; it's too
heavy. I have network traffic of ~400-500 Mb/s.

Implementation:
I set the -disable-auto-cleanup flag to activate fault tolerance.
All my processes post MPI_Irecv for the other processes and then execute
MPI_Waitany on all active requests. When data arrives correctly, I process
it and post a new MPI_Irecv for the source process. If MPI_Waitany returns
an error (some process failed), I identify the failed rank and stop
communicating with it at the application level (no more Sends & Recvs to
it). In this mode the system continues execution with the surviving
processes. I don't use any collective operations; I simulate them using
MPI_Irecv & MPI_Isend plus MPI_Waitany or MPI_Waitall (which return an
error if some process failed).
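
Roughly, the receive loop looks like this (a simplified sketch only;
NPEERS, MSG_SIZE, TAG, peers[], done, process() and mark_failed() stand in
for application code, and MPI_ERRORS_RETURN is assumed to be set on the
communicator so that failures come back as return codes):

    MPI_Request reqs[NPEERS];
    char        bufs[NPEERS][MSG_SIZE];

    /* One outstanding receive per peer. */
    for (int i = 0; i < NPEERS; ++i)
        MPI_Irecv(bufs[i], MSG_SIZE, MPI_CHAR, peers[i], TAG,
                  MPI_COMM_WORLD, &reqs[i]);

    while (!done) {
        int idx;
        MPI_Status st;
        int rc = MPI_Waitany(NPEERS, reqs, &idx, &st);
        if (rc != MPI_SUCCESS) {
            /* A peer failed: with -disable-auto-cleanup the rest of the
               job keeps running.  Retire that request and stop talking
               to the rank at the application level. */
            reqs[idx] = MPI_REQUEST_NULL;
            mark_failed(peers[idx]);
            continue;
        }
        process(bufs[idx], &st);                  /* consume the message */
        MPI_Irecv(bufs[idx], MSG_SIZE, MPI_CHAR, peers[idx], TAG,
                  MPI_COMM_WORLD, &reqs[idx]);    /* re-arm the receive  */
    }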

I think it's an ugly solution, but I can't think of anything more elegant.
Any other solution would be welcome.

The problem is in the "ready" state, when processes are almost only
waiting and only 3-4 messages may be received by a process. In this state
all CPUs are busy executing the polling mechanism.



Regards,
Anatoly.


On Mon, Sep 22, 2014 at 4:46 PM, Wesley Bland <wbland at anl.gov> wrote:

> Which version of MPICH are you using?
>
> Which fault tolerance features are you using? Fault tolerance is currently
> undergoing some changes and has different features than it used to have.
>
> AFAIK, neither version of FT has been tested with ch3:sock. It’s possible
> that it will work, but FT is still a very experimental feature and hasn’t
> been widely tested.
>
> If you want to avoid polling more, you can use non-blocking receive calls
> to post a receive and poll the system yourself periodically (using
> MPI_TEST). This will give your application an opportunity to do something
> else while waiting for the receives to complete.
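
A minimal sketch of this Irecv-plus-periodic-MPI_Test pattern (buf, COUNT,
src, TAG and do_other_work() are placeholders, the 1 ms back-off is just
one possible choice, and usleep() assumes a POSIX system with <unistd.h>):

    MPI_Request req;
    int flag = 0;

    MPI_Irecv(buf, COUNT, MPI_CHAR, src, TAG, MPI_COMM_WORLD, &req);

    while (!flag) {
        MPI_Test(&req, &flag, MPI_STATUS_IGNORE);  /* drives MPI progress */
        if (!flag) {
            do_other_work();      /* application work, or...             */
            usleep(1000);         /* ...back off so the core can go idle */
        }
    }
    /* the message in buf is now complete */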
>
> Thanks,
> Wesley
>
> On Sep 22, 2014, at 8:30 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
>
> Dear MPICH,
> I have a problem with the MPICH polling mechanism.
> I'm working on a cluster. There are 2-4 processes on each computer (I can't
> run a single process per computer because of application requirements).
> My system has 2 states:
> Ready - slaves listen to the master (but there is no data flow)
> Run - the master starts communication, and then there is data flow.
> When the system is in the ready state (all processes except the master have
> posted MPI_Recv requests on the master) but the master is not yet sending
> data, I see CPU usage > 100% (more than 1 core used) per process. When 4
> processes are in the ready state (waiting for data), the computer begins to
> slow down other processes, I think because of polling.
> I tried building MPICH with --with-device=ch3:sock, and then I get 0% CPU
> usage in the ready state, but I have a problem with the fault tolerance
> feature.
> My questions are:
> 1) Is it expected that building with --with-device=ch3:sock causes fault
> tolerance not to work? Is fault tolerance based on the polling mechanism?
> 2) Can I change the polling rate to reduce the CPU load? I understand the
> penalty is a slower transfer rate.
> 3) Can I use any other MPI API to check whether a message from the master
> has arrived without activating the polling mechanism?
>
> Regards,
> Anatoly.
>
>
> On Thu, May 8, 2014 at 3:57 PM, Balaji, Pavan <balaji at anl.gov> wrote:
>
>>
>> This is expected.  Currently, the only way to not have MPICH poll is to
>> configure with --with-device=ch3:sock.  Please note that this can cause
>> performance loss (the polling is helpful for performance in the common
>> case).
>>
>> We are planning to allow this in the default build as well in the future.
>>
>>   — Pavan
>>
>> On May 8, 2014, at 7:54 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
>>
>> > Dear MPICH forum.
>> > I created an endless MPI program.
>> > In this program each process calls MPI_Recv from another process, w/o
>> > any MPI_Send.
>> > When I execute this program I see each process taking ~100% of a CPU core.
>> > Is this behavior (I suppose polling) normal?
>> > Can I reduce the MPI_Recv CPU penalty?
>> >
>> > Regards,
>> > Anatoly.
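
A minimal illustration of the scenario described above (this is only a
sketch, not the mpi_polling.cpp attachment referenced below): every rank
blocks in MPI_Recv with no matching send, so with the default
polling-based device each rank spins at ~100% of a core.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Receive from the next rank; nobody ever sends, so every rank
           stays inside MPI_Recv (busy-polling for progress) forever. */
        MPI_Recv(&buf, 1, MPI_INT, (rank + 1) % size, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        MPI_Finalize();
        return 0;
    }
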
>> > <mpi_polling.cpp>
>>
>>
>
>
>
>
>

