[mpich-devel] MPI_Recv, blocking call concept

Lisandro Dalcin dalcinl at gmail.com
Thu Jun 7 18:39:51 CDT 2018

Though I acknowledge my solution is not very elegant, I had to find a
workaround for an async execution Python framework I wrote
(mpi4py.futures), and eventually decided to go with sleep() calls with
exponential backoff. I have no control over which MPI build users run,
I don't want to force them to switch to ch3:sock, and this busy-waiting
behavior is the common one in other MPIs out there.

@Ali, if you can read Python, you may find some inspiration here:
This certainly adds latency, but you can decide how much (well, up to
the accuracy of your kernel timer slack), and you use it only where it
is really needed.
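
A minimal sketch of the sleep-with-exponential-backoff idea, to make the
trade-off concrete. This is not the actual mpi4py.futures code: the
function name wait_with_backoff, its parameters, and the default delays
are all made up for illustration. It polls a caller-supplied completion
test and sleeps between attempts, doubling the sleep up to a cap, so the
cap is the upper bound on the extra latency you accept.

```python
import time


def wait_with_backoff(poll, min_delay=1e-6, max_delay=1e-3, factor=2.0,
                      sleep=time.sleep):
    """Call poll() until it returns a truthy value, sleeping in between.

    Hypothetical helper for illustration only.

    poll      -- zero-argument callable, truthy once the operation is done
    min_delay -- first sleep interval in seconds (assumed default)
    max_delay -- cap on the sleep interval; bounds the added latency
    factor    -- growth factor applied to the interval after each sleep
    sleep     -- sleep function, injectable so the loop can be tested

    Returns the number of times the caller slept.
    """
    delay = min_delay
    sleeps = 0
    while not poll():
        # Yield the core to the kernel instead of spinning on it.
        sleep(delay)
        sleeps += 1
        # Grow the interval geometrically, but never beyond the cap.
        delay = min(delay * factor, max_delay)
    return sleeps
```

With mpi4py, poll could be something like `lambda: request.Test()` for a
nonblocking receive posted with comm.Irecv(), assuming you only need to
know when the request has completed. A short max_delay keeps latency
low at the cost of more wakeups; a long one frees the core for other
work but delays the reaction to an incoming message.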

@Jeff, if you have a better implementation to suggest, please let us
know (maybe off-list, this is unrelated to MPICH development).

On Fri, 8 Jun 2018 at 01:00, Jeff Hammond <jeff.science at gmail.com> wrote:
> It spins because that is optimal for latency and how the shared-memory protocols work.  If you want blocking semantics, use ch3:sock, which will park the calling thread in the kernel.  It is great for oversubscription but terrible for performance in the common case of exact subscription or undersubscription.
> You can't save much power unless you drop into lower P/C-states, but the states that save you significant power will increase the latency a huge amount.  Dell did something a while back that turned down the frequency during MPI calls (http://www.hpcadvisorycouncil.com/events/2013/Spain-Workshop/pdf/5_Dell.pdf), which saved a bit of power.
> Jeff
> On Thu, Jun 7, 2018 at 4:27 AM, Ali MoradiAlamdarloo <timndus at gmail.com> wrote:
>> Dear all,
>> As I understand it, a blocking call works something like this:
>> when a process (P0) makes a blocking system call, the scheduler blocks that process and assigns another process (P1) to the CPU core so that the core is used efficiently. Once P0's response is ready, the scheduler can map P0 onto a core again.
>> But this is not what happens in MPICH's MPI_Recv function. You call it a BLOCKING call, but the process that calls this function doesn't actually block; it just keeps running on the core, WAITING for its response.
>> Why did you decide to do this? Why do we have a process waiting on a valuable processing core and burning power?
>> _______________________________________________
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/devel
> --
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/

Lisandro Dalcin
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia

Office Phone: +966 12 808-0459
