[mpich-discuss] libfabric+psm2 performance

Antonio Peña tpenya at gmail.com
Tue Jun 15 04:46:15 CDT 2021


Hi folks,

I'm setting up MPICH over libfabric over psm2 on MareNostrum 
(Omni-Path) to try out some ideas.

I've compiled libfabric 1.5 (the last one that compiles on this machine) 
over opa-psm2-11.2.185, and mpich-3.4.2 + mpich-4.0a1 in both ch3 and 
ch4 (yes, 4 MPICH variants). There's only psm2 support in libfabric, so 
no danger of falling back to other providers. ldd confirms my libfabric 
is linked.

./fi_info
     provider: psm2
     fabric: psm2
     domain: psm2
     version: 1.5
     type: FI_EP_RDM
     protocol: FI_PROTO_PSMX2
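
For anyone wanting to repeat the "only psm2" check, a small filter over the 
fi_info output is one way to do it. This is only a sketch: list_providers is a 
made-up helper name, and it assumes fi_info prints "provider: <name>" lines in 
the format shown above.

```shell
# List the distinct provider names found in fi_info-style output read from stdin.
# Assumes lines of the form "provider: <name>", possibly with leading whitespace.
list_providers() {
    awk '$1 == "provider:" { print $2 }' | sort -u
}

# Hypothetical usage on the cluster (not executed here):
#   ./fi_info | list_providers
```

If this prints anything besides psm2, another provider is available and a 
fallback becomes possible.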

I'm comparing 2-node pt2pt performance against impi/2017.4 using osu 
microbenchmarks.

While both fi_pingpong and impi give me a max. BW of ~10 MB/s, all mpich 
versions stick at ~3 MB/s.

Is this expected? I mean, is there that much secret sauce in impi? Or am 
I likely doing something wrong?

I'm doing fairly plain configures, nothing fancy, e.g.:
   ./configure --prefix=... --with-device=ch4:ofi --with-libfabric=...

I'd appreciate some guidance - my MPICH tweaking is a little rusty :)

Best,
   Toni

