[mpich-discuss] Does MPICH implement a polling mechanism in MPI_Recv?

Wesley Bland wbland at anl.gov
Tue Sep 23 09:38:51 CDT 2014


> On Sep 22, 2014, at 11:42 PM, Anatoly G <anatolyrishon at gmail.com> wrote:
> 
> Thank you very much.
> 
> I'm using MPICH 3.1.
> 
> The purpose of the application is to continue execution as long as at least the master process is alive. The assumption is that processes can fail, but with no more than a single failure at a time. In that case the surviving processes must carry on "like a Terminator". (-:
> At run time I can't "manually" execute MPI_Test on all requests; it's too heavy. I have network traffic of ~400-500 Mb/s.
> 
> Implementation:
> I set the -disable-auto-cleanup flag to activate fault tolerance.
> All my processes post MPI_Irecv requests on the other processes and then execute MPI_Waitany on all active requests. When data arrives correctly, I process it and post a new MPI_Irecv on the source process. If MPI_Waitany returns an error (some process failed), I identify the failed rank and stop communicating with it at the application level (no more sends and receives to it). In this mode the system continues execution with the surviving processes. I don't use any collective operations; I simulate them with MPI_Irecv and MPI_Isend plus MPI_Waitany or MPI_Waitall (which returns an error if some process failed).
> 
> I think it's an ugly solution, but I can't think of a more elegant one.
> Any alternative solution would be welcome.

You could use MPI_Testall/any if you’d like, but of course that may not fit your application usage. Overall, I don’t see an issue with your model.
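
For reference, here is a minimal sketch of the pattern you described (illustrative only, not your code): one outstanding MPI_Irecv per peer, MPI_Waitany in a loop, and a rank dropped from the active set when the wait returns an error. It assumes MPI_ERRORS_RETURN is set on the communicator and the job is launched with -disable-auto-cleanup; exactly which error class comes back after a failure depends on the MPICH version, so the sketch only checks for non-success.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Errors must be returned (not fatal) for the loop below to see them. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int nreq = size - 1;
        MPI_Request *reqs = malloc(nreq * sizeof(MPI_Request));
        int         *bufs = malloc(nreq * sizeof(int));
        int         *peer = malloc(nreq * sizeof(int));

        /* One outstanding receive per peer. */
        for (int r = 0, i = 0; r < size; r++) {
            if (r == rank) continue;
            peer[i] = r;
            MPI_Irecv(&bufs[i], 1, MPI_INT, r, 0, MPI_COMM_WORLD, &reqs[i]);
            i++;
        }

        int alive = nreq;
        while (alive > 0) {
            int idx;
            MPI_Status st;
            int rc = MPI_Waitany(nreq, reqs, &idx, &st);

            if (rc != MPI_SUCCESS) {
                /* Assume idx points at the request whose peer failed; stop
                 * talking to that rank at the application level.  A real code
                 * might cancel/free the request instead of just dropping it. */
                if (idx >= 0 && idx < nreq && reqs[idx] != MPI_REQUEST_NULL) {
                    reqs[idx] = MPI_REQUEST_NULL;
                    alive--;
                }
                continue;
            }
            if (idx == MPI_UNDEFINED) break;   /* no active requests left */

            /* ... process bufs[idx] here ..., then repost the receive. */
            MPI_Irecv(&bufs[idx], 1, MPI_INT, peer[idx], 0, MPI_COMM_WORLD, &reqs[idx]);
        }

        free(reqs); free(bufs); free(peer);
        MPI_Finalize();
        return 0;
    }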

> 
> The problem is in the "ready" state, when the processes are almost only waiting and each process may receive just 3-4 messages. In this state all CPUs are busy executing the polling mechanism.

That’s to be expected. While you’re waiting for messages in MPI_Wait, nothing else is happening in your process (unless you’re also using threads, which it appears you aren’t), so MPI goes into a busy wait to receive the message. As I mentioned previously, if you want to poll less frequently, you’ll have to use some flavor of MPI_Test.
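
Something like the loop below (illustrative only; the 1 ms back-off is an arbitrary value you’d tune) trades a bit of latency for an idle CPU while waiting:

    #include <mpi.h>
    #include <unistd.h>   /* usleep */

    /* Wait for a request without spinning at 100% CPU: poll with MPI_Test
     * and sleep between polls.  Cheaper on the CPU, slower to notice the
     * message. */
    static void lazy_wait(MPI_Request *req, MPI_Status *status)
    {
        int done = 0;
        while (!done) {
            MPI_Test(req, &done, status);   /* drives progress + checks completion */
            if (!done)
                usleep(1000);               /* ~1 ms; pick your own latency/CPU trade-off */
        }
    }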

Out of curiosity, what is the issue with using 100% of the CPU? If you’re not using it for your application (which it appears that you aren’t since you’re calling MPI_Wait), what difference does it make if MPI uses all of it?

Thanks,
Wesley

> 
> 
> 
> Regards,
> Anatoly.
> 
> 
> On Mon, Sep 22, 2014 at 4:46 PM, Wesley Bland <wbland at anl.gov> wrote:
> Which version of MPICH are you using?
> 
> Which fault tolerance features are you using? Fault tolerance is currently undergoing some changes and has different features than it used to have.
> 
> AFAIK, neither version of FT has been tested with ch3:sock. It’s possible that it will work, but FT is still a very experimental feature and hasn’t been widely tested.
> 
> If you want to avoid polling so much, you can use non-blocking receive calls to post a receive and poll the system yourself periodically (using MPI_Test). This will give your application an opportunity to do something else while waiting for the receives to complete.
> 
> Thanks,
> Wesley
> 
>> On Sep 22, 2014, at 8:30 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
>> 
>> Dear MPICH.
>> I have a problem with the MPICH polling mechanism.
>> I'm working on a cluster. There are 2-4 processes on each computer (I can't run a single process per computer because of application requirements).
>> My system has 2 states:
>> Ready - slaves listen to the master (but no data flow)
>> Run - the master starts communication, and then there is data flow.
>> When the system is in the ready state (all processes except the master have posted MPI_Recv requests on the master) but the master is not yet sending data, I see CPU usage > 100% (more than 1 core used) per process. When 4 processes are in the ready state (waiting for data), the computer begins to slow down other processes, I think because of the polling.
>> I tried to build MPICH with --with-device=ch3:sock; then I get 0% CPU usage in the ready state, but I have a problem with the fault tolerance feature.
>> My questions are:
>> 1) Is it expected that a build with --with-device=ch3:sock causes fault tolerance not to work? Is fault tolerance based on the polling mechanism?
>> 2) Can I change the polling rate to reduce the CPU load? I understand that the penalty is a slower transfer rate.
>> 3) Can I use any other MPI API to check whether a message from the master has arrived without activating the polling mechanism?
>> 
>> Regards,
>> Anatoly. 
>> 
>> 
>> On Thu, May 8, 2014 at 3:57 PM, Balaji, Pavan <balaji at anl.gov> wrote:
>> 
>> This is expected.  Currently, the only way to not have MPICH poll is to configure with --with-device=ch3:sock.  Please note that this can cause performance loss (the polling is helpful for performance in the common case).
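>> 
>> For example (illustrative; the install prefix is just a placeholder, and you would add whatever other options your builds normally use):
>> 
>>     ./configure --with-device=ch3:sock --prefix=/path/to/install
>>     make
>>     make install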
>> 
>> We are planning to allow this in the default build as well in the future.
>> 
>>   — Pavan
>> 
>> On May 8, 2014, at 7:54 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
>> 
>> > Dear MPICH forum.
>> > I created an endless MPI program.
>> > In this program each process calls MPI_Recv from another process, without any MPI_Send.
>> > When I execute this program I see that each process takes ~100% of a CPU core.
>> > Is this behavior (I suppose polling) normal?
>> > Can I reduce the MPI_Recv CPU penalty?
>> >
>> > Regards,
>> > Anatoly.
>> > <mpi_polling.cpp>

_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss


More information about the discuss mailing list