[mpich-devel] fastest shared-memory back-end?
Jeff Hammond
jeff.science at gmail.com
Sat Dec 14 18:51:15 CST 2019
MPICH configure just asked me to choose between ch4:ofi, ch4:ucx, and ch3. Does
anyone have an informed opinion on which one is fastest for
shared-memory execution?
My specific use case is NWChem on very large multi-socket Xeon nodes, where
passive-target RMA is the primary communication method. My main concerns
are therefore asynchronous progress and the absence of serialization in RMA.
In the past, I have observed significant performance issues due to
serialization of RMA accumulate operations acting on non-overlapping memory
regions.
Jeff
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/