<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Which version of MPICH are you using?<div class=""><br class=""></div><div class="">Which fault tolerance features are you using? Fault tolerance is currently undergoing some changes and has different features than it used to have.</div><div class=""><br class=""></div><div class="">AFAIK, neither version of FT has been tested with ch3:sock. It’s possible that it will work, but FT is still a very experimental feature and hasn’t been widely tested.</div><div class=""><br class=""></div><div class="">If you want to reduce the polling, you can post the receive with a non-blocking receive call (MPI_IRECV) and poll for completion yourself periodically (using MPI_TEST). This will give your application an opportunity to do something else while waiting for the receives to complete.</div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">Wesley</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Sep 22, 2014, at 8:30 AM, Anatoly G <<a href="mailto:anatolyrishon@gmail.com" class="">anatolyrishon@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Dear MPICH.<div class="">I have a problem with the MPICH polling mechanism.</div><div class="">I'm working on a cluster. There are 2-4 processes on each computer (I can't run a single process per computer because of application requirements).</div><div class="">My system has 2 states:</div><div class="">Ready - slaves listen to the master (but no data flow)</div><div class="">Run - the master starts communication, then there is data flow.</div><div class="">When the system is in the ready state (all processes except the master have posted MPI_Recv requests on the master) but the master process is still not sending data, I see CPU usage > 100% (more than 1 core used) per process.
When 4 processes are in the ready state (waiting for data), the computer begins to slow down other processes, I think because of polling.</div><div class="">I tried to build MPICH with <span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class=""> </span><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class="">--with-device=ch3:sock; then I get 0% CPU usage in the ready state, but I have a problem with the fault tolerance feature.</span></div><div class=""><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class="">My questions are:</span></div><div class=""><font face="arial, sans-serif" class=""><span style="font-size:12.6666669845581px" class="">1) Is it expected that a build with </span></font><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class="">--with-device=ch3:sock causes fault tolerance not to work? Is fault tolerance based on the polling mechanism?</span></div><div class=""><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class="">2) Can I change the polling rate to reduce CPU load? I understand that the penalty is a lower transfer rate.</span></div><div class=""><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class="">3) Can I use any other MPI APIs to check whether a message from the master has arrived without activating the polling mechanism?</span></div><div class=""><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class=""><br class=""></span></div><div class=""><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class="">Regards,</span></div><div class=""><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class="">Anatoly.
</span></div><div class=""><span style="font-family: arial, sans-serif; font-size: 12.666666984558105px;" class=""><br class=""></span></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Thu, May 8, 2014 at 3:57 PM, Balaji, Pavan <span dir="ltr" class=""><<a href="mailto:balaji@anl.gov" target="_blank" class="">balaji@anl.gov</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="">
This is expected. Currently, the only way to not have MPICH poll is to configure with --with-device=ch3:sock. Please note that this can cause performance loss (the polling is helpful for performance in the common case).<br class="">
<br class="">
We are planning to allow this in the default build as well in the future.<br class="">
<br class="">
— Pavan<br class="">
<div class=""><div class="h5"><br class="">
On May 8, 2014, at 7:54 AM, Anatoly G <<a href="mailto:anatolyrishon@gmail.com" class="">anatolyrishon@gmail.com</a>> wrote:<br class="">
<br class="">
> Dear MPICH forum.<br class="">
> I created an endless MPI program.<br class="">
> In this program each process calls MPI_Recv from another process, without any MPI_Send.<br class="">
> When I execute this program, I see that each process takes ~100% of a CPU core.<br class="">
> Is this behavior (I suppose polling) normal?<br class="">
> Can I reduce the MPI_Recv CPU penalty?<br class="">
><br class="">
> Regards,<br class="">
> Anatoly.<br class="">
</div></div>> <mpi_polling.cpp>_______________________________________________<br class="">
> discuss mailing list <a href="mailto:discuss@mpich.org" class="">discuss@mpich.org</a><br class="">
> To manage subscription options or unsubscribe:<br class="">
> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank" class="">https://lists.mpich.org/mailman/listinfo/discuss</a><br class="">
<br class="">
</blockquote></div><br class=""></div>
</div></blockquote></div><br class=""></div></body></html>