[mpich-discuss] Problems running MPICH jobs under SLURM
Markus Geimer
m.geimer at fz-juelich.de
Sun Jun 2 04:31:34 CDT 2013
Hi Pavan,
> Markus: does this happen only with SLURM or can you reproduce this
> without SLURM as well?
It seems to happen only when hydra queries the host list from SLURM.
I tried executing two different setups on the head node, both listing
two compute nodes in a hostfile:
1) mpiexec -f hostfile -n 4 ./hello
Since the SLURM PAM module was disabled, SSH login to the nodes
was possible and the job ran as expected, with two ranks on each
node. SLURM's sinfo showed both nodes as state 'idle' and the
HYDRA_DEBUG output said '--rmk user --launcher ssh'.
2) mpiexec -f hostfile -rmk slurm -n 4 ./hello
This job ran as well, with the nodes allocated via SLURM and
shown as 'alloc'. Debug output: '--rmk slurm --launcher slurm'.
Specifying a hostfile with mpiexec within a SLURM batch job also
worked, but that's obviously not what you normally want to do...
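In case someone wants to repeat the comparison, here is a minimal
sketch of the two setups above as a shell script. The node names
(node01, node02), the hostfile path, and the DRY_RUN switch are
placeholders of mine, not part of the original runs, and setting
HYDRA_DEBUG=1 to get the debug output is an assumption about how it
was enabled. By default the script only prints the commands; set
DRY_RUN=0 to actually launch them.

```shell
#!/bin/sh
# Hypothetical compute node names; substitute two real nodes.
NODES="node01 node02"
HOSTFILE="${HOSTFILE:-hostfile}"

# Write one host per line into the hostfile.
: > "$HOSTFILE"
for n in $NODES; do
    echo "$n" >> "$HOSTFILE"
done

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Setup 1: hostfile only (debug output showed '--rmk user --launcher ssh').
run env HYDRA_DEBUG=1 mpiexec -f "$HOSTFILE" -n 4 ./hello

# Setup 2: hostfile plus SLURM as resource manager
# (debug output showed '--rmk slurm --launcher slurm').
run env HYDRA_DEBUG=1 mpiexec -f "$HOSTFILE" -rmk slurm -n 4 ./hello
```

Comparing the '--rmk' and '--launcher' values in the debug output of
the two runs shows which resource manager and launcher hydra picked.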
Hope this helps. If there is anything else I should try to help
track down the issue, please let me know.
Thanks,
Markus
--
Dr. Markus Geimer
Juelich Supercomputing Centre
Institute for Advanced Simulation
Forschungszentrum Juelich GmbH
52425 Juelich, Germany
Phone: +49-2461-61-1773
Fax: +49-2461-61-6656
E-mail: m.geimer at fz-juelich.de
WWW: http://www.fz-juelich.de/jsc/