[mpich-discuss] How does MPICH determine available NICs?

Zhou, Hui zhouh at anl.gov
Mon Mar 9 10:24:03 CDT 2026


Could you try the upstream MPICH? - https://www.mpich.org/downloads/

The behavior of Cray MPICH and upstream MPICH may have diverged. Upstream MPICH should be fine with different numbers of NICs on different nodes. By default, each process picks the NIC that is closest to its CPU affinity; or, if you have multiple processes on the same node, each process will try to pick a different NIC in a round-robin fashion. Usually a disabled NIC won't be selected during init, but if you see it behave otherwise, I encourage you to create an issue at https://github.com/pmodels/mpich/issues .
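
To make the round-robin part concrete, here is a minimal sketch in Python. It is not MPICH's actual code: pick_nic, local_rank and nics are names made up for illustration, and the CPU-affinity step is left out entirely.

```python
from typing import Sequence


def pick_nic(local_rank: int, nics: Sequence[str]) -> str:
    """Return the NIC a process would use under plain round-robin.

    local_rank: hypothetical index of the process among the processes
                on the same node (0, 1, 2, ...).
    nics:       hypothetical list of usable NIC names on that node.
    """
    if not nics:
        raise ValueError("node has no usable NICs")
    # Wrap around when there are more processes than NICs.
    return nics[local_rank % len(nics)]


# Two processes per node on two nodes, one node with a single NIC.
for node, nics in {"nid001764": ["hsn0"], "nid002792": ["hsn0", "hsn1"]}.items():
    for local_rank in range(2):
        print(node, local_rank, pick_nic(local_rank, nics))
```

With a scheme like this, a node with one NIC and a node with two NICs can coexist in the same job; each node only wraps around its own list.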


Cheers,
Hui
________________________________
From: Kevin Buckley via discuss <discuss at mpich.org>
Sent: Monday, March 9, 2026 12:22 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: Kevin Buckley <kevin.buckley.pawsey.org.au at gmail.com>
Subject: [mpich-discuss] How does MPICH determine available NICs?

I do hope this is the right list on which to ask this, but
it all seems a bit weird to me, so I thought I'd "turn pro".

TL;DR: it gets really weird at the bottom.

I am trying to work out how MPICH determines the number of
NICs that it "thinks" a node has.

Here's what I am seeing, as a result of my nodes having an

   "Inconsistent number of NICs across the job"

PE 0:  == Node nid001764 has 1 NIC(s) available
PE 0:  == Node nid002792 has 2 NIC(s) available
PE 0:

(where the job is a noddy test job that runs two processes
  per node, across two nodes)
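
For anyone wanting to spot this in job output automatically, here is a small Python sketch that pulls the per-node counts out of lines shaped like the ones above. The regular expression is an assumption based only on these two lines, and nic_counts and is_symmetric are names invented for illustration.

```python
import re

# Matches lines such as:
#   PE 0:  == Node nid001764 has 1 NIC(s) available
NIC_LINE = re.compile(r"Node (\S+) has (\d+) NIC\(s\) available")


def nic_counts(log_text: str) -> dict[str, int]:
    """Map node name -> reported NIC count, taken from the job output."""
    return {m.group(1): int(m.group(2)) for m in NIC_LINE.finditer(log_text)}


def is_symmetric(counts: dict[str, int]) -> bool:
    """True when every node reports the same number of NICs."""
    return len(set(counts.values())) <= 1


log = """\
PE 0:  == Node nid001764 has 1 NIC(s) available
PE 0:  == Node nid002792 has 2 NIC(s) available
"""
counts = nic_counts(log)
print(counts, "symmetric" if is_symmetric(counts) else "asymmetric")
```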

however, as far as I am concerned, the second NIC on that
nid002792 has been disabled at Boot-time, by replacing

  STARTMODE='auto'

with

  STARTMODE='off'

in the

  /etc/sysconfig/network/ifcfg-hsn1

file.

As far as wicked (which is really using an ifcfg-compat mode,
and not any new wicked-goodness) is concerned, the second
interface is "not up", and hasn't been configured, hence:

nid002792:~ # wicked ifstatus hsn0
hsn0            up
       link:     #5, state up, mtu 9000
       type:     ethernet, hwaddr 02:00:00:00:60:73
       config:   compat:suse:/etc/sysconfig/network/ifcfg-hsn0
       leases:   ipv4 static granted
       addr:     ipv4 10.253.133.14/17 [static]
       route:    ipv4 172.18.0.0/16 via 10.253.255.254 proto boot
nid002792:~ # wicked ifstatus hsn1
hsn1            device-unconfigured
       link:     #6, state up, mtu 1500
       type:     ethernet, hwaddr 02:00:00:00:60:33
nid002792:~ #

and what's more, there is no route using that second interface,
which there would have been, had I not ferkled the ifcfg script:

nid002792:~ #  ip route
default via 10.253.128.3 dev hsn0
10.168.28.0/22 dev bond0 proto kernel scope link src 10.168.28.23
10.253.128.0/17 dev hsn0 proto kernel scope link src 10.253.133.14
172.18.0.0/16 via 10.253.255.254 dev hsn0
172.23.0.0/16 via 10.168.31.254 dev bond0
nid002792:~ #
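
The two outputs above describe different things: the link on hsn1 is reported as up, yet nothing is configured on it and no route uses it. A rough Python sketch of how one might list such interfaces follows. It reads the Linux sysfs operstate files and parses `ip route` text; whether either has any bearing on how MPICH or Cray MPICH counts NICs is not established in this thread, and the function names are invented for illustration.

```python
import re
from pathlib import Path


def link_states(sysfs_net: str = "/sys/class/net") -> dict[str, str]:
    """Map interface name -> kernel operstate ("up", "down", "unknown", ...)."""
    states = {}
    for entry in sorted(Path(sysfs_net).iterdir()):
        state_file = entry / "operstate"
        if state_file.is_file():
            states[entry.name] = state_file.read_text().strip()
    return states


def routed_devices(ip_route_output: str) -> set[str]:
    """Collect the device names that follow "dev" in `ip route` output."""
    return set(re.findall(r"\bdev (\S+)", ip_route_output))


def up_but_unrouted(states: dict[str, str], routed: set[str]) -> list[str]:
    """Interfaces whose link is up although no route uses them."""
    return sorted(name for name, state in states.items()
                  if state == "up" and name not in routed)


# Sample data: link states typed in by hand, routes copied from above.
states = {"hsn0": "up", "hsn1": "up"}
routes = (
    "default via 10.253.128.3 dev hsn0\n"
    "10.253.128.0/17 dev hsn0 proto kernel scope link src 10.253.133.14\n"
)
print(up_but_unrouted(states, routed_devices(routes)))
```

On a real node, link_states() with its default argument would read the live values instead of the hand-typed dictionary.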


THE REALLY WEIRD BIT

If I don't disable the second NIC at boot time, but then, once the
node has booted, explicitly "ifdown" it, manually, with wicked's
wrapped version of an ifdown:

   wicked --systemd ifdown hsn1

then MPICH jobs launching on the node DON'T SEE THE SECOND NIC?

For reference, here are the two "old-school" network config files

nid002792:~ # cat /etc/sysconfig/network/ifcfg-hsn0
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='10.253.133.14'
NETMASK='255.255.128.0'
MTU='9000'
LINK_REQUIRED='yes'
POST_UP_SCRIPT="systemd:cm-slingshot-ama@.service"
nid002792:~ # cat /etc/sysconfig/network/ifcfg-hsn1
STARTMODE='off'
BOOTPROTO='static'
IPADDR='10.253.133.11'
NETMASK='255.255.128.0'
MTU='9000'
LINK_REQUIRED='yes'
POST_UP_SCRIPT="systemd:cm-slingshot-ama@.service"
nid002792:~ #
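
For checking many nodes, the STARTMODE setting can be read out of such files with a few lines of Python. This is only a sketch: parse_ifcfg is a name invented here, it handles simple KEY='value' lines and nothing more elaborate, and it only reports what the file says, not what the running system did with it.

```python
import shlex


def parse_ifcfg(text: str) -> dict[str, str]:
    """Parse shell-style KEY='value' lines into a dict, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex.split strips the surrounding quotes, as a shell would.
        parts = shlex.split(raw)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


# First three lines of the ifcfg-hsn1 file shown above.
hsn1 = parse_ifcfg("""\
STARTMODE='off'
BOOTPROTO='static'
IPADDR='10.253.133.11'
""")
print("hsn1 disabled at boot:", hsn1.get("STARTMODE") == "off")
```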


Now, "I would have thought" (tm) that MPICH, on "seeing"
an interface that hadn't been START-ed, would not have
considered it as a NIC, for NIC_SYMMETRY purposes?


I am (more than) aware that I can prevent the messages about
NIC_SYMMETRY inconsistencies, but that's not the issue here;
the issue here is that MPICH seems to think a NIC that hasn't
been START-ed is worthy of consideration.

FWIW, it's

   cray-mpich/8.1.32

so MPICH 3.4a2, under the hood, where the hood belongs to an
HPE/Cray EX, running SLES 15 SP6.

The info from the cray-mpich/8.1.32 module says

       - Cray MPICH offers support for multiple NICs per node. Starting with
         version 8.0.8, by default Cray MPICH will use all available NICs on
         a node.

but maybe their definition of "available" differs from the one
that I have become accustomed to over the years, to wit: if it's
not START-ed; it's not available?

Interested to hear any thoughts on this?

Kevin M. Buckley
--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre
PERTH
Australia
