[mpich-discuss] Affinity with MPICH_ASYNC_PROGRESS

Jeff Hammond jhammond at alcf.anl.gov
Mon Feb 25 09:08:57 CST 2013


>> I think the only way for async progress to work well is to have
>> fine-grain locking inside of MPI, as is done in PAMID.  Any
>> implementation that resorts to fat-locking is probably better off
>> without async progress unless the application is doing something
>> really silly (like never calling MPI on a rank that is the target of
>> MPI RMA).
>
> If processes are entering the MPI progress engine frequently enough that
> they are competing with the comm thread to get the (per-process, not
> per-node) library lock, the additional comm thread is probably not needed.

Yep.  That's exactly what I see with NWChem.  I've never profiled it
with enough detail to prove that this is why NWChem runs faster
without comm threads, but it is entirely logical to me that this is
the reason.  If there were a way to instrument RMA to measure time
spent waiting on remote progress vs. time spent actually moving data,
that would help, but I imagine that is far from trivial to implement.
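
For context, whether the comm thread helps at all also depends on giving it somewhere to run.  A rough sketch of the kind of launch I have in mind on a generic Linux cluster follows; MPICH_ASYNC_PROGRESS is the knob under discussion, but the thread-safety variable and the hydra binding syntax are assumptions that should be checked against your MPICH version's documentation:

```shell
# Sketch only: enable the MPICH async progress thread and leave it
# room to run.  Variable names other than MPICH_ASYNC_PROGRESS and
# the binding flags are assumptions -- verify against your MPICH docs.
export MPICH_ASYNC_PROGRESS=1
export MPICH_MAX_THREAD_SAFETY=multiple

# Bind each rank to two cores so the comm thread does not compete
# with the application thread for a single core (hydra syntax):
mpiexec -n 4 -bind-to core:2 ./my_app
```

Without the wider binding, the comm thread and the application thread end up time-slicing one core, which is exactly the contention-on-the-library-lock situation described above.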

Is it the case that entering into MPI will lead to processing of all
packets in an incoming RMA op?  Does the receipt of the first packet
from an MPI_Accumulate cause the target to sit inside of the progress
engine until all packets in the message have arrived, or does the
passive target merely process the packets that arrive while it is
inside of MPI and then return?  I'm sorry if this is a poorly posed
question.

Jeff

-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
