[mpich-discuss] Environment variable forwarding using Hydra and "-launcher ssh"
Edric Ellis
eellis at mathworks.com
Mon Apr 22 09:44:10 CDT 2024
We're in the process of moving from mpich-3.x to mpich-4.1.2, and we've run into some odd behaviour on SLURM related to environment variable forwarding by mpiexec. It looks like mpiexec now propagates only the SLURM_* environment variables, whereas mpich-3.x filtered them out (or at least intended to). Consider something like this:
$ mpiexec -launcher slurm printenv HOME SLURM_JOBID
Using mpich-3.x, the HOME variable gets forwarded. Using mpich-4.1.2, it does not. I believe that mpich-3.x intends to filter out SLURM_JOBID, but the value still seems to be present, perhaps because srun forwards it itself. It's the fact that HOME doesn't get through with mpich-4.1.2 that is causing us problems.
Running mpich-
Here's what I think is the relevant change for SLURM: https://github.com/pmodels/mpich/commit/95ba4ddc7efc7ddc7f25ed41480ee35248184680 . Am I reading that correctly?
The doc here https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#environment-settings states that SLURM_* variables should be filtered out, but that doesn't appear to be happening?
For reference, here's what mpich-4.1.2 "mpiexec -verbose -launcher slurm" prints:
mpiexec options:
----------------
Base path: /path/to/mpich-4.1.2
Launcher: slurm
Debug level: 1
Enable X: -1
Global environment:
-------------------
SLURM_JOBID=102437
SLURM_JOB_USER=eellis
SLURM_JOB_QOS=normal
SLURM_JOB_NUM_NODES=2
SLURM_TASKS_PER_NODE=1(x2)
SLURM_TOPOLOGY_ADDR_PATTERN=node
... many more SLURM_*
And here's what mpich-3.x prints:
mpiexec options:
----------------
Base path: /path/to/mpich-3.x
Launcher: slurm
Debug level: 1
Enable X: -1
Global environment:
-------------------
ALTERNATE_EDITOR=emacs
MAIL=/var/mail/eellis
USER=eellis
SLURM_JOB_USER=eellis
l=/local/eellis
XDG_SESSION_TYPE=unspecified
SLURM_JOB_QOS=normal
... no SLURM_JOBID
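In case it helps anyone reproduce the comparison, here is a minimal Python sketch for diffing the "Global environment" listings from the two versions. It assumes the "mpiexec -verbose" output has been captured as text and that the section looks like the excerpts above. The helper name global_env_names is purely illustrative and not part of MPICH, and the sample strings are shortened copies of the listings in this message.

```python
import re

# A NAME=value line, as shown under "Global environment:" in the excerpts.
ENV_LINE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=")


def global_env_names(verbose_output):
    """Return the variable names listed under 'Global environment:'.

    Hypothetical helper: it relies only on the layout visible in the
    excerpts (a header line, a dashed underline, then NAME=value lines).
    """
    names = set()
    in_section = False
    for raw in verbose_output.splitlines():
        line = raw.strip()
        if line == "Global environment:":
            in_section = True
            continue
        if not in_section:
            continue
        match = ENV_LINE.match(line)
        if match:
            names.add(match.group(1))
        elif line.endswith(":"):
            # Assumption: any later "Something:" line starts a new section.
            break
    return names


# Shortened copies of the two listings quoted in this message.
OLD_OUTPUT = """\
Global environment:
-------------------
ALTERNATE_EDITOR=emacs
MAIL=/var/mail/eellis
USER=eellis
SLURM_JOB_USER=eellis
"""

NEW_OUTPUT = """\
Global environment:
-------------------
SLURM_JOBID=102437
SLURM_JOB_USER=eellis
"""

old_names = global_env_names(OLD_OUTPUT)
new_names = global_env_names(NEW_OUTPUT)

print("in mpich-3.x only:  ", sorted(old_names - new_names))
print("in mpich-4.1.2 only:", sorted(new_names - old_names))
```

Run against the full captured output from each version, this should make it easy to see exactly which variables stopped being forwarded.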
Cheers,
Edric.