[mpich-discuss] MPICH Hydra mpiexec and Slurm job allocation

Congiu, Giuseppe gcongiu at anl.gov
Wed Dec 4 17:10:29 CST 2019


Just an example of how that works:

$ ./slurm_nodelist_parse
input compressed nodelist: node-b[01-10]
expanded nodelist: node-b01,node-b02,node-b03,node-b04,node-b05,node-b06,node-b07,node-b08,node-b09,node-b10
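
(As a quick cross-check outside of Hydra, Slurm's own scontrol performs the same expansion, printing one hostname per line; this assumes scontrol is in your PATH:)

$ scontrol show hostnames 'node-b[01-10]'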

So your node list format should be recognized correctly by the current master.

Giuseppe Congiu
Postdoctoral Appointee
MCS Division
Argonne National Laboratory
9700 South Cass Ave., Lemont, IL 60439



On Dec 4, 2019, at 5:05 PM, Congiu, Giuseppe via discuss <discuss at mpich.org> wrote:

That is a bug that should now be fixed in the current master. Additionally, we added a short test program, slurm_nodelist_parse, in `src/pm/hydra/maint` that you can use to check whether your node list format causes problems for MPICH. If it fails, please include the program's output in your message and we will look into it.

Giuseppe Congiu
Postdoctoral Appointee
MCS Division
Argonne National Laboratory
9700 South Cass Ave., Lemont, IL 60439



On Dec 4, 2019, at 4:28 PM, Stefan via discuss <discuss at mpich.org> wrote:

Hi,

I'm having some issues getting mpirun/mpiexec to play nicely with Slurm
allocations. I'm using Slurm 19.05.4 and have configured MPICH with:
--enable-shared --enable-static --with-slurm=/sw/slurm/19.05.4 \
--with-pm=hydra

Now I request resources from Slurm with:
$ salloc -N 2 --ntasks-per-node 4

Then when I try to run a test binary:
$ mpiexec.hydra ./mpich_hello
Error: node list format not recognized. Try using '-hosts=<hostnames>'.
Aborted (core dumped)

When I do the same with OpenMPI's mpirun/mpiexec it runs on the allocated
nodes. Am I missing something, or does MPICH simply not support this use case?

Currently I'm working around this with a script that translates the Slurm
node allocation into a host list, and I run like this:
$ mpiexec.hydra -hosts $(mpich-host) ./mpich_hello
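
(For illustration, one way such a translation script could be written, assuming scontrol is available, is a one-liner along the lines of:

$ scontrol show hostnames "$SLURM_NODELIST" | paste -sd, -

which expands the compressed node list and joins the names with commas for -hosts.)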

That works fine, but I suppose this workaround should not be necessary.
Here is ltrace output showing that mpiexec tries to process some Slurm-related
environment variables but apparently fails to do so:
https://paste.ubuntu.com/p/327tGrTzq5/

I've also tried with salloc -N 1 -n 1, so that the environment variables
are simpler, e.g.
SLURM_NODELIST=node-b01
SLURM_TASKS_PER_NODE=1
but that did not change the way mpiexec fails.

/Stefan
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss
