[mpich-discuss] How does MPICH determine available NICs?

Raymond, Michael mraymond at hpe.com
Tue Mar 10 10:34:52 CDT 2026


  The NIC symmetry test is specific to HPE Cray MPICH.

  As a workaround, try setting MPICH_OFI_NUM_NICS=1.
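
  A minimal sketch of applying that workaround in a job script or
  interactive shell. The variable name is the one given above; the
  launcher line, node counts and binary name are placeholders, not
  taken from this thread, so adjust them to your site.

```shell
# Restrict each rank to a single NIC, as suggested above.
export MPICH_OFI_NUM_NICS=1

# Confirm the setting is visible in the environment the job will inherit.
echo "MPICH_OFI_NUM_NICS=${MPICH_OFI_NUM_NICS}"

# Then launch as usual. Hypothetical example, adjust launcher and paths:
# srun -N 2 -n 2 ./connectivity_c
```

  The variable needs to be exported before the launcher starts so that
  the ranks on every node inherit it.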

--
Michael Raymond
HPE HPC programming environment
From: Kevin Buckley via discuss <discuss at mpich.org>
Date: Tuesday, March 10, 2026 at 00:47
To: discuss at mpich.org <discuss at mpich.org>
Cc: Kevin Buckley <kevin.buckley.pawsey.org.au at gmail.com>
Subject: Re: [mpich-discuss] How does MPICH determine available NICs?


On 2026/03/09 23:24, Zhou, Hui wrote:
> Could you try the upstream MPICH?

I followed your suggestion to try the upstream MPICH.

Trying to configure 3.4a2 (the version "inside" cray-mpich,
modulo any vendor updates made to that base release), I saw

...
configure: error: The Fortran compiler gfortran will not compile files that\
  call the same routine with arguments of different types.
$
$ which gfortran
/opt/cray/pe/gcc-native/14/bin/gfortran
$


I wasn't sure whether that meant that the MPICH code is too old
to make correct compiler checks, or that the compiler is too
new to respond to them, so I shelved that line of attack.
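
For anyone who wants to pick that line of attack back up: gfortran 10
and later treat mismatched argument types between calls to the same
routine as an error by default, and the -fallow-argument-mismatch
option downgrades that to a warning. A sketch of retrying configure
with that option follows. Whether that alone is enough for 3.4a2 is
untested here, and the install prefix is a placeholder.

```shell
# Ask gfortran (10 or later) to accept calls to the same routine
# with arguments of different types instead of rejecting them.
export FFLAGS="-fallow-argument-mismatch"
export FCFLAGS="-fallow-argument-mismatch"

echo "FFLAGS=${FFLAGS} FCFLAGS=${FCFLAGS}"

# Then re-run configure from the MPICH source tree.
# Hypothetical prefix, adjust to suit:
# ./configure --prefix="$HOME/mpich-3.4a2"
```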


Was able to configure and build 5.0.0, and have used that to
compile and run the same noddy "connectivity_c.c" example code
that I had been seeing the issue with.

It all seems to work without any warnings (as does compiling/
running within an OpenMPI 5.0.3 environment), in that a 2-node
job spanning a node with just one NIC, and the dual-NIC node
with one NIC disabled, completes, as does a 2-node job spanning
the node with just one NIC, and a dual-NIC node with neither
NIC disabled.

Would you, though, expect the vanilla MPICH 5.0.0 to have
warned me about the NIC_SYMMETRY inconsistencies?

Your comment in the original reply,

> The upstream MPICH should be fine with different number of
> NICs on different nodes. By default, the process picks a
> nic that is closest to process's CPU affinity; or if you
> have multiple processes on the same node, each process
> will try pick a different nic in a round-robin fashion.
> Usually a disabled nic won't be selected during init,

would suggest that "later" (than cray-mpich's 3.4a2 starting
point, plus whatever enhancements Cray and/or HPE have since
added) MPICH versions would not bother to give the warning, as
they would "just do the right thing" (tm)?
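
To make the round-robin part of that description concrete, here is a
toy illustration only. It is not MPICH source code; the function name
and the modulo rule are my own assumptions about what "round-robin"
could look like, and it ignores the CPU-affinity step entirely.

```shell
# Hypothetical helper: map a node-local rank onto one of the NICs
# that are usable on this node, cycling through them in order.
pick_nic() {
    local_rank=$1   # rank of the process within its node
    num_nics=$2     # number of usable NICs on that node
    echo $(( local_rank % num_nics ))
}

# Four ranks on a node with two usable NICs.
for rank in 0 1 2 3; do
    echo "local rank ${rank} -> nic $(pick_nic "${rank}" 2)"
done
```

With only one usable NIC on a node, every rank maps to index 0, which
is why differing NIC counts across nodes need not be a problem under
such a scheme.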

If so, I think it sounds as though treating the
disabled-at-boot-time NIC as somehow being "available"
for MPI traffic is tied to something that the vendor is
doing somewhere within their overall MPI+Slingshot stack.

Thanks again for fielding this one,
Kevin M. Buckley
--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre
PERTH
Australia
