<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-ligatures:standardcontextual;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Aptos",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="#0563C1" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-family:'Aptos',sans-serif">Hi Edric,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Aptos',sans-serif"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Aptos',sans-serif">It looks like we may have unintentionally inverted the logic on that function return value. I’ll submit a PR to fix. Thanks for bringing it to our attention.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Aptos',sans-serif"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Aptos',sans-serif">Ken<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Aptos',sans-serif"><o:p>&nbsp;</o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-left:.5in"><b><span style="color:black">From: </span>
</b><span style="color:black">Edric Ellis via discuss &lt;discuss@mpich.org&gt;<br>
<b>Reply-To: </b>"discuss@mpich.org" &lt;discuss@mpich.org&gt;<br>
<b>Date: </b>Monday, April 22, 2024 at 9:44 AM<br>
<b>To: </b>"discuss@mpich.org" &lt;discuss@mpich.org&gt;<br>
<b>Cc: </b>Edric Ellis &lt;eellis@mathworks.com&gt;<br>
<b>Subject: </b>[mpich-discuss] Environment variable forwarding using Hydra and "-launcher ssh"</span><span style="font-size:12.0pt;color:black;mso-ligatures:none"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-family:'Aptos',sans-serif"><o:p>&nbsp;</o:p></span></p>
</div>
<p class="MsoNormal" style="margin-left:.5in">We’re in the process of moving from mpich-3.x to mpich-4.1.2, and we’ve run into some odd behaviour on SLURM related to environment variable forwarding by mpiexec. It looks like mpiexec now propagates only the SLURM_* environment variables, instead of filtering them out (which is what it seems intended to do). Consider something like this:<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">$ mpiexec -launcher slurm printenv HOME SLURM_JOBID<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">Using mpich-3.x, the HOME variable gets forwarded. Using mpich-4.1.2, it does not. I believe that mpich-3.x intends to filter out SLURM_JOBID, but the value still seems to be present; perhaps srun forwards it. The fact that HOME doesn’t get through with mpich-4.1.2 is what is causing us problems.<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">Running mpich-<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">Here’s what I think is the relevant change for SLURM:
<a href="https://github.com/pmodels/mpich/commit/95ba4ddc7efc7ddc7f25ed41480ee35248184680">
https://github.com/pmodels/mpich/commit/95ba4ddc7efc7ddc7f25ed41480ee35248184680</a>. Am I reading that correctly?<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">The doc here <a href="https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#environment-settings">
https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#environment-settings</a> states that SLURM_* variables should be filtered out, but that doesn’t appear to be happening.<o:p></o:p></p>
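<p class="MsoNormal" style="margin-left:.5in">&nbsp;<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">To make the suspected difference concrete, here is a minimal Python sketch of how flipping the meaning of a filter predicate could produce this symptom. It is only an illustration and is not MPICH or Hydra source code: the function names, the single "SLURM_" exclusion prefix, and the HOME value are all assumptions made up for the example.<o:p></o:p></p>
<pre style="margin-left:.5in">
```python
# Illustrative sketch only; NOT taken from MPICH/Hydra. All names are hypothetical.

def is_excluded(name, excluded_prefixes=("SLURM_",)):
    """Return True when a variable should NOT be forwarded (assumed rule)."""
    return name.startswith(excluded_prefixes)


def forward_intended(env):
    """Documented intent: drop excluded variables, keep everything else."""
    return {k: v for k, v in env.items() if not is_excluded(k)}


def forward_inverted(env):
    """Same loop with the predicate's result flipped: only excluded names survive."""
    return {k: v for k, v in env.items() if is_excluded(k)}


# Example environment; the HOME value is a placeholder, not from the report.
env = {"HOME": "/home/example", "USER": "eellis", "SLURM_JOBID": "102437"}

print(sorted(forward_intended(env)))  # ['HOME', 'USER']
print(sorted(forward_inverted(env)))  # ['SLURM_JOBID']
```
</pre>
<p class="MsoNormal" style="margin-left:.5in">With the predicate applied as intended, HOME is forwarded and SLURM_JOBID is dropped. With the result flipped, only the SLURM_* entry remains, which would match the pattern of only SLURM_* variables being propagated.<o:p></o:p></p>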
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">For reference, here’s what mpich-4.1.2 “mpiexec -verbose -launcher slurm” prints:<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">mpiexec options:
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">----------------
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Base path: /path/to/mpich-4.1.2<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Launcher: slurm
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Debug level: 1
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Enable X: -1
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Global environment:
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> -------------------
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_JOBID=102437
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_JOB_USER=eellis
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_JOB_QOS=normal
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_JOB_NUM_NODES=2
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_TASKS_PER_NODE=1(x2)
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_TOPOLOGY_ADDR_PATTERN=node<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> … many more SLURM_*<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">And here’s what mpich-3.x prints:<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">mpiexec options:
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">----------------
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Base path: /path/to/mpich-3.x<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Launcher: slurm
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Debug level: 1
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Enable X: -1
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> Global environment:
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> -------------------
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> ALTERNATE_EDITOR=emacs
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> MAIL=/var/mail/eellis
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> USER=eellis
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_JOB_USER=eellis
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> l=/local/eellis
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> XDG_SESSION_TYPE=unspecified
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> SLURM_JOB_QOS=normal<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> … no SLURM_JOBID
<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">Cheers,<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in">Edric.<o:p></o:p></p>
</div>
</body>
</html>