<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:12.0pt;
font-family:"Aptos",sans-serif;}
code
{mso-style-priority:99;
font-family:"Courier New";}
p.xmsonormal, li.xmsonormal, div.xmsonormal
{mso-style-name:xmsonormal;
margin:0in;
font-size:11.0pt;
font-family:"Aptos",sans-serif;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt">I may have spoken too soon. This fixed the issue on one SMP node using 24 processes. When I increase the number of nodes to 4, with 96 processes, I can’t even get past MPI_Init. I’m seeing the following on standard output:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">===================================================================================<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">= PID 483326 RUNNING AT j006<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">= EXIT CODE: 9<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">= CLEANING UP REMAINING PROCESSES<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">===================================================================================<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">This typically refers to a problem with your application.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Please see the FAQ page for debugging suggestions<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">I get the following in standard error:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Abort(613541135) on node 65: Fatal error in internal_Init: Other MPI error, error stack:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">internal_Init(49162).............: MPI_Init(argc=0x7ffff3c5646c, argv=0x7ffff3c56460) failed<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPII_Init_thread(265)............: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIR_init_comm_world(34).........: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIR_Comm_commit(823)............: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPID_Comm_commit_post_hook(222)..: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIDI_world_post_init(660).......: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIDI_OFI_init_vcis(842).........: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">check_num_nics(891)..............: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIR_Allreduce_allcomm_auto(4726): <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIC_Sendrecv(302)...............: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPID_Isend(63)...................: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIDI_isend(35)..................: <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">MPIDI_OFI_send_fallback(549).....: OFI call tsendv failed (ofi_send.h:549:MPIDI_OFI_send_fallback:No such file or directory)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Bruce<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<div id="mail-editor-reference-message-container">
<div>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="color:black">From:
</span></b><span style="color:black">Palmer, Bruce J via discuss &lt;discuss@mpich.org&gt;<br>
<b>Date: </b>Monday, October 7, 2024 at 1:14 PM<br>
<b>To: </b>Zhou, Hui &lt;zhouh@anl.gov&gt;, discuss@mpich.org &lt;discuss@mpich.org&gt;<br>
<b>Cc: </b>Palmer, Bruce J &lt;Bruce.Palmer@pnnl.gov&gt;<br>
<b>Subject: </b>Re: [mpich-discuss] Maximum number of communicators<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt">That seemed to fix the issue.</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Thanks!</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Bruce</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span></p>
<div id="mail-editor-reference-message-container">
<div>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="color:black">From:
</span></b><span style="color:black">Zhou, Hui &lt;zhouh@anl.gov&gt;<br>
<b>Date: </b>Thursday, October 3, 2024 at 11:01 AM<br>
<b>To: </b>discuss@mpich.org &lt;discuss@mpich.org&gt;<br>
<b>Cc: </b>Palmer, Bruce J &lt;Bruce.Palmer@pnnl.gov&gt;<br>
<b>Subject: </b>Re: Maximum number of communicators</span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">Hi Bruce,</span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black"> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">Try configuring MPICH with </span><code><span style="font-size:10.0pt;color:black">--with-device=ch4:ofi --enable-extended-context-bits</span></code><span style="color:black">.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black"> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">Hui</span></p>
</div>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="0" width="98%" align="center">
</div>
<div id="divRplyFwdMsg">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:black">From:</span></b><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,sans-serif;color:black"> Palmer, Bruce J via discuss &lt;discuss@mpich.org&gt;<br>
<b>Sent:</b> Thursday, October 3, 2024 11:44 AM<br>
<b>To:</b> discuss@mpich.org &lt;discuss@mpich.org&gt;<br>
<b>Cc:</b> Palmer, Bruce J &lt;Bruce.Palmer@pnnl.gov&gt;<br>
<b>Subject:</b> [mpich-discuss] Maximum number of communicators</span> </p>
<div>
<p class="MsoNormal"> </p>
</div>
</div>
<div>
<div>
<p class="xmsonormal">Hi,</p>
<p class="xmsonormal"> </p>
<p class="xmsonormal">I’m looking at using MPI RMA to support sparse data structures in Global Arrays. I’ve got an application that uses a large number of sparse arrays and it is failing when the number of sparse arrays reaches about 500. Each sparse array
is built on top of 4 conventional global arrays and each global array uses one MPI Window. Each Window appears to be creating its own communicator and I’m hitting an internal limit at 2048 communicators. Is there a way to increase the number of communicators?</p>
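<p class="xmsonormal">As a rough check of those numbers, the sketch below works through the arithmetic in Python. It assumes, as described above, that each window consumes exactly one communicator. The constant and function names are illustrative only and are not part of MPICH or Global Arrays, and the <code>reserved</code> parameter is a hypothetical allowance for communicators used elsewhere.</p>
<pre>
```python
# Figures quoted in this message; the limit is the value observed by the
# poster, not a number taken from MPICH documentation.
ARRAYS_PER_SPARSE = 4        # global arrays behind each sparse array
WINDOWS_PER_ARRAY = 1        # one MPI window per global array
COMMUNICATOR_LIMIT = 2048    # internal limit reported in the message


def communicators_for(sparse_arrays):
    """Communicators consumed if every window creates exactly one (assumed)."""
    return sparse_arrays * ARRAYS_PER_SPARSE * WINDOWS_PER_ARRAY


def max_sparse_arrays(limit=COMMUNICATOR_LIMIT, reserved=0):
    """Largest sparse-array count that fits within the communicator budget.

    `reserved` is a hypothetical allowance for communicators used elsewhere.
    """
    per_sparse = ARRAYS_PER_SPARSE * WINDOWS_PER_ARRAY
    return (limit - reserved) // per_sparse


print(communicators_for(500))   # 2000
print(max_sparse_arrays())      # 512
```
</pre>
<p class="xmsonormal">Under these assumptions, 500 sparse arrays need 2000 communicators and the 2048 limit leaves room for at most 512, which is consistent with the failure appearing at about 500 sparse arrays.</p>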
<p class="xmsonormal"> </p>
<p class="xmsonormal">Bruce Palmer</p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>