<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8"></head><body><pre style="font-family: sans-serif; font-size: 100%; white-space: pre-wrap; word-wrap: break-word">On 2026/03/11 10:43, McMahon, Kim wrote:
>
> Cray MPICH detects "available" NICs on a node by calling the libfabric fi_getinfo() call.
> ...
> I suspect your method for disabling the NIC via "STARTMODE='off'" does not fail these checks.
Indeed, Kim: indeed.
But anyroad,
I believe that I have just found a way to REALLY DISABLE the 2nd
NIC, as the node provisions, having tried quite a few that didn't
see it disabled, as far as Cray MPICH's detection was concerned.
The "trick" seems to be to leave the STARTMODE, in
/etc/sysconfig/network/ifcfg-hsn1
as it was before I thought to override its default pre-boot
creation, via,
/etc/opt/sgi/conf.d/15-network-setup
so we now have
# Kevin says let it come up
STARTMODE='auto'
but then replace the ifcfg's POST_UP_SCRIPT argument, which
would normally be
POST_UP_SCRIPT="systemd:cm-slingshot-ama@.service"
with
POST_UP_SCRIPT="wicked:post-up/disable-hsn1-script"
where that script just runs an
/usr/sbin/wicked --systemd ifdown &lt;interface&gt;
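A minimal sketch of what such a script might look like follows. It is
not the script actually deployed here. It assumes that wicked calls a
"wicked:" script with the action as the first argument and the
interface name as the second; the WICKED variable and the
disable_iface function are hypothetical names, added only so that the
command can be swapped out for testing.

```shell
#!/bin/sh
# Hypothetical sketch of a post-up script such as disable-hsn1-script.
# Assumption: wicked invokes "wicked:" scripts as: script ACTION IFNAME

# Path to the wicked binary; overridable so the script can be dry-run.
WICKED="${WICKED:-/usr/sbin/wicked}"

disable_iface() {
    # Take the interface named in $1 down again via wicked.
    "$WICKED" --systemd ifdown "$1"
}

# Act only when called with an action and an interface name,
# and only for the post-up action.
if [ "$#" -ge 2 ]; then
    case "$1" in
        post-up) disable_iface "$2" ;;
    esac
fi
```

Setting WICKED=echo before running it by hand prints the command that
would be run rather than touching the interface.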
FWIW, I had even tried something a bit more "old school"
than using the wicked-newness, first, viz.:
POST_UP_SCRIPT="compat:suse:disable-hsn1-script"
with the same script payload, but that hadn't worked.
(Too old school perhaps?)
For completeness, here's what the wicked post-up script achieves.
Earlier attempts were leaving the interface thus:
nid002792:~ # /usr/sbin/wicked ifstatus hsn1
hsn1 device-unconfigured
link: #6, state up, mtu 9000
type: ethernet, hwaddr 02:00:00:00:60:33
nid002792:~ #
whereas, using that "new-fangled" wicked script, sees
nid002792:~ # /usr/sbin/wicked ifstatus hsn1
hsn1 device-unconfigured
link: #6, state down, mtu 9000
type: ethernet, hwaddr 02:00:00:00:60:33
nid002792:~ #
Pay special attention to the "state down".
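For anyone wanting to check that across several nodes, here is a rough
sketch that pulls the link state out of the ifstatus output. It
assumes the output keeps the "link: #N, state X, mtu M" layout shown
above; link_state is a hypothetical helper name, not a wicked command.

```shell
#!/bin/sh
# Hypothetical helper: read "wicked ifstatus" output on stdin and
# print the link state (for example "up" or "down").
# Assumption: the link line looks like "link: #6, state down, mtu 9000".

link_state() {
    sed -n 's/.*link: #[0-9]*, state \([a-z]*\),.*/\1/p'
}
```

Usage would be along the lines of

/usr/sbin/wicked ifstatus hsn1 | link_state

which, for the second output above, should print "down".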
As we only need to do this for a few nodes within the
EX here (bit of a much longer story!), I believe that
this approach will do what we need.
Rest assured though, that if it doesn't then, as I was
down to my last idea, we will probably raise it with
HPE/Cray, given that the underlying issue does seem to
have its provenance there, and not in upstream MPICH.
Thanks for the insight above,
Kevin M. Buckley
--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre
PERTH
Australia
</pre></body></html>