[mpich-discuss] How does MPICH determine available NICs?
Zhou, Hui
zhouh at anl.gov
Mon Mar 9 10:24:03 CDT 2026
Could you try the upstream MPICH? https://www.mpich.org/downloads/
The behavior of Cray MPICH and upstream MPICH may have diverged. Upstream MPICH should be fine with a different number of NICs on different nodes. By default, each process picks the NIC that is closest to its CPU affinity; or, if you have multiple processes on the same node, each process will try to pick a different NIC in a round-robin fashion. Usually a disabled NIC won't be selected during init, but if you see it behave otherwise, I encourage you to create an issue at https://github.com/pmodels/mpich/issues .
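The round-robin part of that description can be illustrated with a minimal sketch. This is not MPICH's actual code: `pick_nic` is a hypothetical helper, the interface names are used only as labels, and the CPU-affinity step is left out entirely.

```python
def pick_nic(local_rank: int, nics: list[str]) -> str:
    """Hypothetical helper: choose a NIC for a process by its on-node rank."""
    if not nics:
        raise ValueError("no NICs available on this node")
    # Round-robin: consecutive local ranks land on consecutive NICs,
    # wrapping around when there are more processes than NICs.
    return nics[local_rank % len(nics)]


# Two processes per node, as in the test job described in this thread.
one_nic_node = ["hsn0"]
two_nic_node = ["hsn0", "hsn1"]
assignments_one = [pick_nic(r, one_nic_node) for r in range(2)]
assignments_two = [pick_nic(r, two_nic_node) for r in range(2)]
```

With one NIC both processes share it; with two NICs each process gets its own, which is why the per-node NIC count matters to the job.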
Cheers,
Hui
________________________________
From: Kevin Buckley via discuss <discuss at mpich.org>
Sent: Monday, March 9, 2026 12:22 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: Kevin Buckley <kevin.buckley.pawsey.org.au at gmail.com>
Subject: [mpich-discuss] How does MPICH determine available NICs?
I do hope this is the right list on which to ask this, but
it all seems a bit weird to me, so I thought I'd "turn pro".
TL;DR: it gets really weird at the bottom.
I am trying to work out how MPICH determines the number of
NICs that it "thinks" a node has.
Here's what I am seeing, as a result of my nodes having an
"Inconsistent number of NICs across the job"
PE 0: == Node nid001764 has 1 NIC(s) available
PE 0: == Node nid002792 has 2 NIC(s) available
PE 0:
(where the job is a noddy test job that runs two processes
per node, across two nodes)
however, as far as I am concerned, the second NIC on that
nid002792 has been disabled at Boot-time, by replacing
STARTMODE='auto'
with
STARTMODE='off'
in the
/etc/sysconfig/network/ifcfg-hsn1
file.
As far as wicked (which is really using an ifcfg-compat mode,
and not any new wicked-goodness) is concerned, the second
interface is "not up", and hasn't been configured, hence:
nid002792:~ # wicked ifstatus hsn0
hsn0 up
link: #5, state up, mtu 9000
type: ethernet, hwaddr 02:00:00:00:60:73
config: compat:suse:/etc/sysconfig/network/ifcfg-hsn0
leases: ipv4 static granted
addr: ipv4 10.253.133.14/17 [static]
route: ipv4 172.18.0.0/16 via 10.253.255.254 proto boot
nid002792:~ # wicked ifstatus hsn1
hsn1 device-unconfigured
link: #6, state up, mtu 1500
type: ethernet, hwaddr 02:00:00:00:60:33
nid002792:~ #
and what's more, there is no route using that second interface,
which there would have been, had I not ferkled the ifcfg script:
nid002792:~ # ip route
default via 10.253.128.3 dev hsn0
10.168.28.0/22 dev bond0 proto kernel scope link src 10.168.28.23
10.253.128.0/17 dev hsn0 proto kernel scope link src 10.253.133.14
172.18.0.0/16 via 10.253.255.254 dev hsn0
172.23.0.0/16 via 10.168.31.254 dev bond0
nid002792:~ #
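One detail in the `wicked ifstatus` output above is that hsn1 is reported as device-unconfigured while its link line still says "state up". A small sketch that separates the two pieces of information follows; `parse_ifstatus` is a hypothetical helper that only understands the fields shown in this thread. Whether Cray MPICH's NIC count depends on link state, on configuration state, or on something else is not established here.

```python
def parse_ifstatus(text: str) -> dict:
    """Hypothetical parser for one `wicked ifstatus <iface>` report."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    # First line: "<name> <status>", e.g. "hsn1 device-unconfigured".
    name, status = lines[0].split(None, 1)
    info = {"name": name, "status": status, "link_state": None, "mtu": None}
    for ln in lines[1:]:
        if ln.startswith("link:"):
            # e.g. "link: #6, state up, mtu 1500"
            for field in ln[len("link:"):].split(","):
                parts = field.split()
                if len(parts) == 2 and parts[0] == "state":
                    info["link_state"] = parts[1]
                elif len(parts) == 2 and parts[0] == "mtu":
                    info["mtu"] = int(parts[1])
    return info


# Report for hsn1, copied from the session above.
HSN1_REPORT = """
hsn1 device-unconfigured
link: #6, state up, mtu 1500
type: ethernet, hwaddr 02:00:00:00:60:33
"""
hsn1 = parse_ifstatus(HSN1_REPORT)
unconfigured_but_link_up = (
    hsn1["status"] == "device-unconfigured" and hsn1["link_state"] == "up"
)
```

For the hsn1 report, `unconfigured_but_link_up` is true: the interface has no address or route, yet anything that looks only at link state would still see it.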
THE REALLY WEIRD BIT
If I don't disable the second NIC at boot time, but then, once the
node has booted, explicitly "ifdown" it, manually, with wicked's
wrapped version of an ifdown:
wicked --systemd ifdown hsn1
then MPICH jobs launching on the node DON'T SEE THE SECOND NIC?
For reference, here are the two "old-school" network config files
nid002792:~ # cat /etc/sysconfig/network/ifcfg-hsn0
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='10.253.133.14'
NETMASK='255.255.128.0'
MTU='9000'
LINK_REQUIRED='yes'
POST_UP_SCRIPT="systemd:cm-slingshot-ama@.service"
nid002792:~ # cat /etc/sysconfig/network/ifcfg-hsn1
STARTMODE='off'
BOOTPROTO='static'
IPADDR='10.253.133.11'
NETMASK='255.255.128.0'
MTU='9000'
LINK_REQUIRED='yes'
POST_UP_SCRIPT="systemd:cm-slingshot-ama@.service"
nid002792:~ #
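To check the configured start mode across many nodes without reading each file by eye, the ifcfg files can be parsed as simple KEY='value' lines. The sketch below assumes exactly that shape; `parse_ifcfg` and `starts_at_boot` are hypothetical helpers, and only the 'auto' and 'off' values seen in this thread are considered.

```python
import shlex


def parse_ifcfg(text: str) -> dict:
    """Hypothetical parser for shell-style KEY='value' lines."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex strips the surrounding single or double quotes.
        tokens = shlex.split(value)
        cfg[key.strip()] = tokens[0] if tokens else ""
    return cfg


def starts_at_boot(cfg: dict) -> bool:
    """True only for STARTMODE='auto'; other modes are not modelled here."""
    return cfg.get("STARTMODE") == "auto"


# A subset of the ifcfg-hsn1 contents shown above.
IFCFG_HSN1 = """
STARTMODE='off'
BOOTPROTO='static'
IPADDR='10.253.133.11'
MTU='9000'
"""
hsn1_cfg = parse_ifcfg(IFCFG_HSN1)
hsn1_starts = starts_at_boot(hsn1_cfg)
```

This only reports what the configuration asks for. It says nothing about what the kernel or the MPI library sees at run time.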
Now, "I would have thought" (tm) that MPICH, on "seeing"
an interface that hadn't been START-ed, would not have
considered it as a NIC, for NIC_SYMMETRY purposes?
I am (more than) aware that I can prevent the messages about
NIC_SYMMETRY inconsistencies, but that's not the issue here;
the issue here is that MPICH seems to think a NIC that hasn't
been START-ed is worthy of consideration.
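One way to compare the two cases (disabled at boot versus ifdown after boot) is to record what the kernel itself reports for each interface, independently of wicked. The sketch below reads the standard `operstate` attribute under /sys/class/net. `kernel_link_states` is a hypothetical helper, the demo values are made up for illustration, and it is an assumption, not something confirmed in this thread, that this state has any bearing on how Cray MPICH counts NICs.

```python
import tempfile
from pathlib import Path


def kernel_link_states(sysfs_net: str = "/sys/class/net") -> dict:
    """Hypothetical helper: map interface name -> kernel operstate."""
    states = {}
    root = Path(sysfs_net)
    if not root.is_dir():
        return states
    for dev in sorted(root.iterdir()):
        try:
            states[dev.name] = (dev / "operstate").read_text().strip()
        except OSError:
            # Skip entries that are not interfaces (plain files, etc.).
            continue
    return states


# Demo against a fake sysfs tree; the states here are invented examples.
with tempfile.TemporaryDirectory() as tmp:
    for name, state in (("hsn0", "up"), ("hsn1", "down")):
        dev_dir = Path(tmp) / name
        dev_dir.mkdir()
        (dev_dir / "operstate").write_text(state + "\n")
    # A plain file in the directory, to show that non-interfaces are skipped.
    (Path(tmp) / "bonding_masters").write_text("bond0\n")
    demo_states = kernel_link_states(tmp)
```

Calling `kernel_link_states()` with no argument on a compute node reads the live values, so running it once in each scenario would show whether the kernel-level state differs between them.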
FWIW, it's
cray-mpich/8.1.32
so MPICH 3.4a2, under the hood, where the hood belongs to an
HPE/Cray EX, running SLES 15 SP6.
The info from the cray-mpich/8.1.32 module says
- Cray MPICH offers support for multiple NICs per node. Starting with
version 8.0.8, by default Cray MPICH will use all available NICs on
a node.
but maybe their definition of "available" differs from the one
that I have become accustomed to over the years, to wit: if it's
not START-ed; it's not available?
Interested to hear any thoughts on this?
Kevin M. Buckley
--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre
PERTH
Australia