<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div dir="ltr" style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi Howard,</div>
<div dir="ltr" style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
The PMIx issue is likely related to the new sessions implementation coming in 5.0.x. I’ll look for a Slurm cluster to try to figure out what’s going on with that.</div>
<div dir="ltr" style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
What commit hash are you working with that shows the poor latency? I just built from the HEAD of main and don’t see the behavior on Aurora.</div>
<div dir="ltr" style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div dir="ltr" style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Ken</div>
<div dir="ltr" style="font-family: Aptos, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div id="mail-editor-reference-message-container" style="color: inherit; background-color: inherit;">
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing"></div>
<div class="ms-outlook-mobile-reference-message skipProofing" style="text-align: left; padding: 3pt 0in 0in; border-width: 1pt medium medium; border-style: solid none none; border-color: rgb(181, 196, 223) currentcolor currentcolor; font-family: Aptos; font-size: 12pt; color: black;">
<b>From: </b>Howard Pritchard via discuss &lt;discuss@mpich.org&gt;<br>
<b>Date: </b>Thursday, August 21, 2025 at 4:40 PM<br>
<b>To: </b>Zhou, Hui &lt;zhouh@anl.gov&gt;<br>
<b>Cc: </b>Howard Pritchard &lt;hppritcha@gmail.com&gt;, discuss@mpich.org &lt;discuss@mpich.org&gt;<br>
<b>Subject: </b>Re: [mpich-discuss] MPICH 5.0.1 performance on HPE SS11 plus more - a slurm problem<br>
<br>
</div>
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing">Here you go, Hui!</div>
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing"><br>
</div>
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing">MPICH debug output, with the Slurm step output to boot. Again, no such Slurm errors with the 4.3.1 release.</div>
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing">Something must have changed in the way MPICH uses the PMIx group constructor ops, or something along those lines.</div>
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing"><br>
</div>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
Required minimum FI_VERSION: 0, current version: 10016</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
provider: cxi, score = 5, pref = 0, FI_FORMAT_UNSPEC [8]</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
Required minimum FI_VERSION: 10005, current version: 10016</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
==== Capability set configuration ====</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
libfabric provider: cxi - cxi</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_DATA: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_AV_TABLE: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_SCALABLE_ENDPOINTS: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_SHARED_CONTEXTS: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_MR_VIRT_ADDRESS: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_MR_ALLOCATED: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_MR_REGISTER_NULL: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_MR_PROV_KEY: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_TAGGED: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_AM: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_RMA: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_ATOMICS: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_FETCH_ATOMIC_IOVECS: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_DATA_AUTO_PROGRESS: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_CONTROL_AUTO_PROGRESS: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_PT2PT_NOPACK: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_TRIGGERED: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_ENABLE_HMEM: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_NUM_AM_BUFFERS: 8</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_NUM_OPTIMIZED_MEMORY_REGIONS: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_CONTEXT_BITS: 20</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_SOURCE_BITS: 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_TAG_BITS: 20</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_VNI_USE_DOMAIN: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MAXIMUM SUPPORTED RANKS: 4294967296</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MAXIMUM TAG: 1048576</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
==== Provider global thresholds ====</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
max_buffered_send: 192</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
max_buffered_write: 192</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
max_msg_size: 4294967295</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
max_order_raw: -1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
max_order_war: -1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
max_order_waw: -1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
tx_iov_limit: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
rx_iov_limit: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
rma_iov_limit: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
max_mr_key_size: 4</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
==== Various sizes and limits ====</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_AM_MSG_HEADER_SIZE: 24</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_MAX_AM_HDR_SIZE: 255</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
sizeof(MPIDI_OFI_am_request_header_t): 416</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
sizeof(MPIDI_OFI_per_vci_t): 52480</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_AM_HDR_POOL_CELL_SIZE: 1024</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIDI_OFI_DEFAULT_SHORT_SEND_SIZE: 16384</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
======================================</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_belong_chk: nid001406 [1]: pmixp_coll.c:280: No process controlled by this slurmstepd is involved in this collective.</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: _process_server_request: nid001406 [1]: pmixp_server.c:923: Unable to pmixp_state_coll_get()</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_check: nid001405 [0]: pmixp_coll_ring.c:614: 0x14b448006e10: unexpected contrib from nid001406:1, expected is 0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: _process_server_request: nid001405 [0]: pmixp_server.c:937: 0x14b448006e10: unexpected contrib from nid001406:1, coll->seq=0, seq=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_reset_if_to: nid001405 [0]: pmixp_coll_ring.c:738: 0x14b454052fc0: collective timeout seq=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_log: nid001405 [0]: pmixp_coll.c:286: Dumping collective state</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:756: 0x14b454052fc0: COLL_FENCE_RING state seq=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:758: my peerid: 0:nid001405</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:765: neighbor id: next 1:nid001406, prev 1:nid001406</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:775: Context ptr=0x14b454053038, #0, in-use=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:775: Context ptr=0x14b454053070, #1, in-use=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:775: Context ptr=0x14b4540530a8, #2, in-use=1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:786: seq=0 contribs: loc=1/prev=0/fwd=1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:788: neighbor contribs [2]:</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:821: done contrib: -</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:823: wait contrib: nid001406</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:825: status=PMIXP_COLL_RING_PROGRESS</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001405 [0]: pmixp_coll_ring.c:829: buf (offset/size): 36/16384</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_reset_if_to: nid001406 [1]: pmixp_coll_ring.c:738: 0x14aa28053100: collective timeout seq=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_log: nid001406 [1]: pmixp_coll.c:286: Dumping collective state</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:756: 0x14aa28053100: COLL_FENCE_RING state seq=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:758: my peerid: 1:nid001406</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:765: neighbor id: next 0:nid001405, prev 0:nid001405</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:775: Context ptr=0x14aa28053178, #0, in-use=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:775: Context ptr=0x14aa280531b0, #1, in-use=0</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:775: Context ptr=0x14aa280531e8, #2, in-use=1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:786: seq=0 contribs: loc=1/prev=0/fwd=1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:788: neighbor contribs [2]:</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:821: done contrib: -</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:823: wait contrib: nid001405</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:825: status=PMIXP_COLL_RING_PROGRESS</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001406 [1]: pmixp_coll_ring.c:829: buf (offset/size): 36/16384</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
==== Various sizes and limits ====</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
sizeof(MPIDI_per_vci_t): 128</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
==== collective selection ====</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIR_CVAR_DEVICE_COLLECTIVES: percoll</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPIR: MPII_coll_generic_json</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPID: MPIDI_coll_generic_json</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
MPID/shm: MPIDI_POSIX_coll_generic_json</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
==== OFI dynamic settings ====</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
num_vcis: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
num_nics: 1</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
======================================</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
error checking : disabled</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
QMPI : disabled</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
debugger support : disabled</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
thread level : MPI_THREAD_SINGLE</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
thread CS : per-vci</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
threadcomm : enabled</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
==== data structure summary ====</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
sizeof(MPIR_Comm): 1832</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
sizeof(MPIR_Request): 520</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
sizeof(MPIR_Datatype): 280</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
================================</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
# OSU MPI Latency Test v5.8</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
# Size Latency (us)</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
0 2.04</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
1 10.08</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
2 10.10</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
4 10.11</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
8 10.12</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
16 10.12</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
32 10.13</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
64 10.12</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
128 10.67</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
256 8.10</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
512 8.18</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
1024 8.11</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
2048 7.86</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
4096 7.80</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
8192 10.25</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
16384 11.04</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
32768 12.04</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
65536 14.05</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
131072 17.89</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
262144 24.61</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
524288 37.51</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
1048576 61.48</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
2097152 110.06</p>
<p dir="ltr" class="gmail-p1" style="line-height: normal; margin: 0px; font-family: Menlo; font-size: 11px; color: rgb(0, 0, 0);">
4194304 228.67</p>
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing"><br>
</div>
<div dir="ltr" class="ms-outlook-mobile-reference-message skipProofing"><br>
</div>
<div dir="ltr" class="gmail_attr">On Wed, Aug 13, 2025 at 1:10 PM, Zhou, Hui &lt;<a href="mailto:zhouh@anl.gov">zhouh@anl.gov</a>&gt; wrote:</div>
<blockquote style="margin: 0px 0px 0px 0.8ex; padding-left: 1ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204);">
<div dir="ltr" class="msg-7763817599493531614" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi Howard,</div>
<div dir="ltr" class="msg-7763817599493531614" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div dir="ltr" class="msg-7763817599493531614" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Could you run with `MPIR_CVAR_DEBUG_SUMMARY=1`? It should print some debug messages. I want to confirm it is running the `cxi` provider.</div>
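<div dir="ltr">A minimal sketch of such a run, assuming the same two-rank <code>srun --mpi=pmix -n 2 ./osu_latency</code> launch line quoted elsewhere in this thread. The guard that skips the launch when <code>srun</code> or the benchmark binary is missing is an addition for illustration. The layout of the debug summary is not shown here, so checking it for the <code>cxi</code> provider name is left as a manual step.</div>
<pre>
```shell
# Sketch: enable MPICH's debug summary for this shell session.
export MPIR_CVAR_DEBUG_SUMMARY=1

# Launch line taken from this thread; skipped when srun or the
# benchmark binary is not available in the current directory.
if command -v srun >/dev/null && [ -x ./osu_latency ]; then
    srun --mpi=pmix -n 2 ./osu_latency
fi
```
</pre>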
<div dir="ltr" class="msg-7763817599493531614" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div dir="ltr" class="msg-7763817599493531614" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div dir="ltr" class="msg-7763817599493531614" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hui</div>
<div id="m_-7763817599493531614appendonsend" dir="ltr" style="color: inherit; background-color: inherit;">
</div>
<hr dir="ltr" style="display: inline-block; width: 98%;">
<div id="m_-7763817599493531614divRplyFwdMsg" dir="ltr" style="color: inherit; background-color: inherit;">
<span style="font-family: Calibri, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);"><b>From:</b> Howard Pritchard <<a href="mailto:hppritcha@gmail.com" target="_blank">hppritcha@gmail.com</a>><br>
<b>Sent:</b> Wednesday, July 30, 2025 4:37 PM<br>
<b>To:</b> Thakur, Rajeev <<a href="mailto:thakur@anl.gov" target="_blank">thakur@anl.gov</a>><br>
<b>Cc:</b> <a href="mailto:discuss@mpich.org" target="_blank">discuss@mpich.org</a> <<a href="mailto:discuss@mpich.org" target="_blank">discuss@mpich.org</a>>; Zhou, Hui <<a href="mailto:zhouh@anl.gov" target="_blank">zhouh@anl.gov</a>><br>
<b>Subject:</b> Re: [mpich-discuss] MPICH 5.0.1 performance on HPE SS11 plus more - a slurm problem</span>
<div> </div>
</div>
<div dir="ltr" class="msg-7763817599493531614">Hi Rajeev,</div>
<div dir="ltr" class="msg-7763817599493531614"><br>
</div>
<div dir="ltr" class="msg-7763817599493531614">Here are the results for the 4.3.x branch:</div>
<div dir="ltr" class="msg-7763817599493531614"><br>
</div>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
hpp@nid001293:/usr/projects/artab/users/hpp/osu-micro-benchmarks-5.8-mpich/mpi/pt2pt>srun --mpi=pmix -n 2 ./osu_latency</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
# OSU MPI Latency Test v5.8</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
# Size Latency (us)</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
0 1.92</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
1 1.98</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
2 1.98</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
4 1.98</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
8 1.98</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
16 1.98</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
32 1.99</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
64 1.99</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
128 2.47</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
256 2.59</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
512 2.65</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
1024 2.76</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
2048 2.95</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
4096 3.00</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
8192 5.96</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
16384 6.64</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
32768 7.44</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
65536 8.75</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
131072 11.52</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
262144 17.08</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
524288 27.96</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
1048576 49.38</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
2097152 92.96</p>
<p dir="ltr" class="msg-7763817599493531614" style="line-height: normal; background-color: rgb(254, 244, 139); margin: 0px; font-family: Menlo; font-size: 11px;">
4194304 179.74</p>
<div dir="ltr" class="msg-7763817599493531614"><br>
</div>
<div dir="ltr" class="msg-7763817599493531614">These are more like what I would expect for the SS11/OFI CXI provider.</div>
<div dir="ltr" class="msg-7763817599493531614"><br>
</div>
<div dir="ltr" class="msg-7763817599493531614">Howard</div>
<div dir="ltr" class="msg-7763817599493531614"><br>
</div>
<div dir="ltr" class="msg-7763817599493531614">On Wed, Jul 30, 2025 at 12:48 PM Thakur, Rajeev <<a href="mailto:thakur@anl.gov" target="_blank">thakur@anl.gov</a>> wrote:</div>
<blockquote style="margin: 0px 0px 0px 0.8ex; padding-left: 1ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204);">
<p dir="ltr" class="msg-7763817599493531614"><span style="font-family: "Lucida Grande", sans-serif; font-size: 11pt;">Hi Howard,</span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="font-family: "Lucida Grande", sans-serif; font-size: 11pt;"> What was the latency with the 4.3.x branch?</span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="font-family: "Lucida Grande", sans-serif; font-size: 11pt;"> </span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="font-family: "Lucida Grande", sans-serif; font-size: 11pt;">Rajeev</span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="font-family: "Lucida Grande", sans-serif; font-size: 11pt;"> </span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="font-family: "Lucida Grande", sans-serif; font-size: 11pt;"> </span></p>
<div style="padding: 3pt 0in 0in; border-width: 1pt medium medium; border-style: solid none none; border-color: rgb(181, 196, 223) currentcolor currentcolor;">
<p dir="ltr" class="msg-7763817599493531614"><span style="font-family: Calibri, sans-serif; color: black;"><b>From:
</b>Howard Pritchard via discuss <<a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">discuss@mpich.org</a>><br>
<b>Reply-To: </b>"<a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">discuss@mpich.org</a>" <<a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">discuss@mpich.org</a>><br>
<b>Date: </b>Wednesday, July 30, 2025 at 1:43 PM<br>
<b>To: </b>"Zhou, Hui" <<a href="mailto:zhouh@anl.gov" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">zhouh@anl.gov</a>><br>
<b>Cc: </b>Howard Pritchard <<a href="mailto:hppritcha@gmail.com" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">hppritcha@gmail.com</a>>, "<a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">discuss@mpich.org</a>"
<<a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">discuss@mpich.org</a>><br>
<b>Subject: </b>Re: [mpich-discuss] MPICH 5.0.1 performance on HPE SS11 plus more - a slurm problem</span></p>
</div>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614">Hi Hui</p>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614">That didn’t help. I am not surprised, though, as our cluster is an NVIDIA-free zone. What did help was switching to the MPICH 4.3.x branch: the latency results are nominal and the Slurm problem went away too. So we
will stick with that branch.</p>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614">Howard</p>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614">On Mon, Jul 28, 2025 at 4:15<span style="font-family: Arial, sans-serif;"> </span>PM Zhou, Hui <<a href="mailto:zhouh@anl.gov" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">zhouh@anl.gov</a>> wrote:</p>
<blockquote style="margin-right: 0in; margin-left: 4.8pt; padding: 0in 0in 0in 6pt; border-width: medium medium medium 1pt; border-style: none none none solid; border-color: currentcolor currentcolor currentcolor rgb(204, 204, 204);">
<p dir="ltr" class="msg-7763817599493531614"><span style="color: black;">Hi Howard,</span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="color: black;"> </span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="color: black;"> I wonder whether it is due to the overhead of querying pointer attributes. Could you try disabling GPU support with `MPIR_CVAR_ENABLE_GPU=0` and see if the latency improves?</span></p>
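<p dir="ltr">One way to make that comparison is a back-to-back run with and without the override. The sketch below assumes the <code>srun --mpi=pmix -n 2 ./osu_latency</code> launch line used elsewhere in this thread. <code>run_latency</code> is a hypothetical helper written for illustration and is not part of MPICH, Slurm, or the OSU benchmarks.</p>
<pre>
```shell
# Hypothetical helper: run the benchmark with optional VAR=value overrides.
run_latency() {
    if command -v srun >/dev/null && [ -x ./osu_latency ]; then
        # env applies the overrides to this one launch only.
        env "$@" srun --mpi=pmix -n 2 ./osu_latency
    else
        echo "skipped: srun or ./osu_latency not found (env: $*)"
    fi
}

run_latency                          # baseline, default settings
run_latency MPIR_CVAR_ENABLE_GPU=0   # same run with GPU support disabled
```
</pre>
<p dir="ltr">Comparing the small-message rows of the two outputs should show whether the override makes a difference.</p>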
<p dir="ltr" class="msg-7763817599493531614"><span style="color: black;"> </span></p>
<p dir="ltr" class="msg-7763817599493531614"><span style="color: black;">Hui</span></p>
<hr dir="ltr" align="center" style="width: 100%;">
<div id="m_-7763817599493531614x_m_-2350182920575357441m_6369988255225199108divRplyFwdMsg" style="color: inherit; background-color: inherit;">
<p><span style="font-family: Calibri, sans-serif; font-size: 11pt; color: black;"><b>From:</b> Howard Pritchard via discuss <<a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">discuss@mpich.org</a>><br>
<b>Sent:</b> Monday, July 28, 2025 9:41 AM<br>
<b>To:</b> <a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">
discuss@mpich.org</a> <<a href="mailto:discuss@mpich.org" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">discuss@mpich.org</a>><br>
<b>Cc:</b> Howard Pritchard <<a href="mailto:hppritcha@gmail.com" target="_blank" style="margin-top: 0px; margin-bottom: 0px;">hppritcha@gmail.com</a>><br>
<b>Subject:</b> [mpich-discuss] MPICH 5.0.1 performance on HPE SS11 plus more - a slurm problem</span></p>
<p> </p>
</div>
<p dir="ltr" class="msg-7763817599493531614">Hi Folks,</p>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614">We are seeing a strange performance issue on our HPE SS11 system when testing osu_latency inter-node with MPICH.</p>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614">First the info:</p>
<p dir="ltr" class="msg-7763817599493531614">system using libfabric 1.22.0</p>
<p dir="ltr" class="msg-7763817599493531614">slurm - 24.11.5</p>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614">Here's my mpichversion output:</p>
<p dir="ltr" class="msg-7763817599493531614"> </p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH Version: 5.0.0a1</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH Release date: unreleased development copy</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH ABI: 0:0:0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH Device: ch4:ofi</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH configure: --prefix=/XXXX/mpich_again/install --enable-g=no --enable-error-checking=no --with-device=ch4:ofi --enable-threads=multiple --with-ch4-shmmods=posix,xpmem --enable-thread-cs=per-vci
--with-libfabric=/opt/cray/libfabric/1.22.0 --with-xpmem=/opt/cray/xpmem/default --with-pmix=/opt/pmix/gcc4x/5.0.8 --enable-fast=O3</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH CC: gcc -O3</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH CXX: g++ -O3</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH F77: gfortran -O3</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH FC: gfortran -O3</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">MPICH features: threadcomm<br>
<br>
<br>
<br>
And here are the OSU latency results:<br>
<br>
<br>
</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_belong_chk: nid001439 [1]: pmixp_coll.c:280: No process controlled by this slurmstepd is involved in this collective.</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: _process_server_request: nid001439 [1]: pmixp_server.c:923: Unable to pmixp_state_coll_get()</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_check: nid001438 [0]: pmixp_coll_ring.c:614: 0x15005c005dc0: unexpected contrib from nid001439:1, expected is 0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: _process_server_request: nid001438 [0]: pmixp_server.c:937: 0x15005c005dc0: unexpected contrib from nid001439:1, coll->seq=0, seq=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_reset_if_to: nid001438 [0]: pmixp_coll_ring.c:738: 0x1500580532f0: collective timeout seq=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_log: nid001438 [0]: pmixp_coll.c:286: Dumping collective state</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:756: 0x1500580532f0: COLL_FENCE_RING state seq=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:758: my peerid: 0:nid001438</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:765: neighbor id: next 1:nid001439, prev 1:nid001439</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:775: Context ptr=0x150058053368, #0, in-use=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:775: Context ptr=0x1500580533a0, #1, in-use=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:775: Context ptr=0x1500580533d8, #2, in-use=1</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:786: seq=0 contribs: loc=1/prev=0/fwd=1</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:788: neighbor contribs [2]:</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:821: done contrib: -</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:823: wait contrib: nid001439</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:825: status=PMIXP_COLL_RING_PROGRESS</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001438 [0]: pmixp_coll_ring.c:829: buf (offset/size): 36/16384</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_reset_if_to: nid001439 [1]: pmixp_coll_ring.c:738: 0x151d0c053400: collective timeout seq=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_log: nid001439 [1]: pmixp_coll.c:286: Dumping collective state</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:756: 0x151d0c053400: COLL_FENCE_RING state seq=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:758: my peerid: 1:nid001439</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:765: neighbor id: next 0:nid001438, prev 0:nid001438</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:775: Context ptr=0x151d0c053478, #0, in-use=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:775: Context ptr=0x151d0c0534b0, #1, in-use=0</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:775: Context ptr=0x151d0c0534e8, #2, in-use=1</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:786: seq=0 contribs: loc=1/prev=0/fwd=1</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:788: neighbor contribs [2]:</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:821: done contrib: -</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:823: wait contrib: nid001438</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:825: status=PMIXP_COLL_RING_PROGRESS</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">slurmstepd: error: mpi/pmix_v4: pmixp_coll_ring_log: nid001439 [1]: pmixp_coll_ring.c:829: buf (offset/size): 36/16384</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;"># OSU MPI Latency Test v5.8</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;"># Size Latency (us)</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">0 1.66</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">1 9.29</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">2 9.57</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">4 9.69</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">8 9.76</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">16 9.77</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">32 9.76</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">64 9.77</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">128 10.32</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">256 7.54</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">512 7.45</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">1024 7.38</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">2048 7.37</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">4096 7.45</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">8192 9.21</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">16384 9.70</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">32768 10.63</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">65536 13.15</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">131072 16.96</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">262144 23.84</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">524288 36.16</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">1048576 60.36</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">2097152 108.43</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">4194304 228.31</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;"><br>
<br>
</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">Note the Slurm behavior: I launch the job, go get coffee, do some Duolingo, read some emails, and then after about 10 minutes the OSU latency test runs.</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;"><br>
<br>
</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">I did not get the Slurm problems using an older MPICH 4.3.1, but I did get the same performance issue. 9 usecs doesn't seem right for an 8-byte ping-pong over libfabric S11. I was expecting more like 1.6 usecs or so.</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;"><br>
<br>
</span></p>
<p dir="ltr" class="msg-7763817599493531614" style="background-color: rgb(254, 244, 139); margin: 0in;">
<span style="font-family: Menlo; font-size: 8.5pt; color: black;">I am confident the Slurm issue is unrelated to the latency issue.<br>
<br>
Thanks for any suggestions on how to address either issue, however.</span></p>
</blockquote>
</blockquote>
</blockquote>
</div>
</body>
</html>