<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 7, 2018 at 4:39 PM, Lisandro Dalcin <span dir="ltr"><<a href="mailto:dalcinl@gmail.com" target="_blank">dalcinl@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Though I acknowledge my solution is not very elegant, I had to find a<br>

workaround for an async execution Python framework I wrote<br>

(mpi4py.futures), and eventually decided to go with sleep() calls with<br>

exponential backoff. I have no control and don't want users to be<br>

forced to switch to ch3:sock, and this behavior is the common one in<br>

other MPIs out there.<br>

<br>

@Ali, if you can read Python, you may find some inspiration here:<br>

<a href="https://bitbucket.org/mpi4py/mpi4py/src/master/src/mpi4py/futures/_lib.py" rel="noreferrer" target="_blank">https://bitbucket.org/mpi4py/<wbr>mpi4py/src/master/src/mpi4py/<wbr>futures/_lib.py</a><br>

This certainly adds latency, but you can somehow decide how much<br>

(well, up to the accuracy of your kernel timer slack) and you use it<br>

where really really needed.<br></blockquote><div><br></div><div>I am not fluent in Python, but you are just using nonblocking calls and testing them carefully, right?  That's what I'd implement if I wanted to reduce spinning.</div><div><br></div><div>A more exotic implementation could do something along the lines of Casper and offload progress to a different process, which would allow the calling process to block on an interprocess semaphore or something like that, but it might cause problems with threads in that process if the OS tries to put the whole process to sleep.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

@Jeff, if you have a better implementation to suggest, please let us<br>

know (maybe off-list, this is unrelated to MPICH development).<br></blockquote><div><br></div><div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:small;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">As blocking poll is largely incompatible with low-latency and shared-memory protocols, I don't think there is any implementation that is going to do a good job at this, since it would not be very appealing to the majority of MPI users.  The PETSc folks appear to be the biggest proponents of blocking poll (solely for purposes of running dozens of MPI processes on their laptops, it seems) and they seem to prefer ch3:sock.  I defer to their experience as to whether a better implementation exists.</span><br></div><div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:small;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline"><br></span></div><div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:small;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">Jeff</span></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div><div class="h5"><br>

On Fri, 8 Jun 2018 at 01:00, Jeff Hammond <<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a>> wrote:<br>

><br>

> It spins because that is optimal for latency and how the shared-memory protocols work.  If you want blocking semantics, use ch3:sock, which will park the calling thread in the kernel.  It is great for oversubscription but terrible for performance in the common case of exact subscription or undersubscription.<br>

><br>

> You can't save much power unless you drop into lower P/C-states, but the states that save you significant power will increase the latency a huge amount.  Dell did something a while back that turned down the frequency during MPI calls (<a href="http://www.hpcadvisorycouncil.com/events/2013/Spain-Workshop/pdf/5_Dell.pdf" rel="noreferrer" target="_blank">http://www.<wbr>hpcadvisorycouncil.com/events/<wbr>2013/Spain-Workshop/pdf/5_<wbr>Dell.pdf</a>), which saved a bit of power.<br>

><br>

> Jeff<br>

><br>

> On Thu, Jun 7, 2018 at 4:27 AM, Ali MoradiAlamdarloo <<a href="mailto:timndus@gmail.com">timndus@gmail.com</a>> wrote:<br>

>><br>

>> Dear all,<br>

>><br>

>> My understanding of a blocking call is something like this:<br>

>> when a process (P0) makes a blocking system call, the scheduler blocks the process and assigns another process (P1) to the CPU core so that the core is used efficiently. Eventually P0's response is ready and the scheduler can map it onto a core again.<br>

>><br>

>> But this is not what happens in MPICH's MPI_Recv function. You call it a BLOCKING call, but the process that calls this function doesn't actually block; it just keeps running on the core, WAITING for its response.<br>

>><br>

>> Why did you decide to do this? Why do we have a process waiting on a valuable processing core and burning power?<br>

>><br>

>> ______________________________<wbr>_________________<br>

>> To manage subscription options or unsubscribe:<br>

>> <a href="https://lists.mpich.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">https://lists.mpich.org/<wbr>mailman/listinfo/devel</a><br>

>><br>

><br>

><br>

><br>

> --<br>

> Jeff Hammond<br>

> <a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a><br>

> <a href="http://jeffhammond.github.io/" rel="noreferrer" target="_blank">http://jeffhammond.github.io/</a><br>


<br>

<br>

<br>

-- <br>

</div></div>Lisandro Dalcin<br>

============<br>

Research Scientist<br>

Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)<br>

Extreme Computing Research Center (ECRC)<br>

King Abdullah University of Science and Technology (KAUST)<br>

<a href="http://ecrc.kaust.edu.sa/" rel="noreferrer" target="_blank">http://ecrc.kaust.edu.sa/</a><br>

<br>

4700 King Abdullah University of Science and Technology<br>

al-Khawarizmi Bldg (Bldg 1), Office # 0109<br>

Thuwal 23955-6900, Kingdom of Saudi Arabia<br>

<a href="http://www.kaust.edu.sa" rel="noreferrer" target="_blank">http://www.kaust.edu.sa</a><br>

<br>

Office Phone: +966 12 808-0459<br>

<div class="HOEnZb"><div class="h5">______________________________<wbr>_________________<br>

To manage subscription options or unsubscribe:<br>

<a href="https://lists.mpich.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">https://lists.mpich.org/<wbr>mailman/listinfo/devel</a><br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>
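<div><br></div><div>For readers following the thread, here is a minimal sketch of the "nonblocking call plus sleep() with exponential backoff" idea discussed above. It is not the code from mpi4py.futures; the function name <code>wait_with_backoff</code>, its parameters, and the default delays are hypothetical and chosen only for illustration. The completion check is passed in as a plain callable so the sketch runs without MPI.</div>

```python
import time


def wait_with_backoff(poll, min_delay=1e-6, max_delay=1e-3, sleep=time.sleep):
    """Call poll() until it returns True, sleeping between attempts.

    poll      -- hypothetical zero-argument callable that returns True once
                 the nonblocking operation has completed.
    min_delay -- first sleep interval in seconds (illustrative default).
    max_delay -- upper bound on the sleep interval (illustrative default).
    sleep     -- sleep function, injectable so the loop can be tested.

    Returns the number of times poll() was called.
    """
    delay = min_delay
    polls = 1
    while not poll():
        # Yield the core instead of spinning, then back off exponentially
        # so a long wait costs few wakeups while a short wait stays cheap.
        sleep(delay)
        delay = min(delay * 2, max_delay)
        polls += 1
    return polls
```

<div>With mpi4py, one plausible way to wire this up (an assumption, not taken from the thread) is to post a receive with <code>comm.Irecv</code> and pass the bound method <code>req.Test</code> of the returned request as <code>poll</code>. The cap <code>max_delay</code> bounds the added latency, subject to the kernel timer accuracy mentioned earlier.</div>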

</div></div>