<div dir="ltr">Hello list,<div><br></div><div>I have been messing around with SLURM and MPICH2 for about a week now, in order to run NONMEM on my cluster.</div><div><br></div><div>I start my job using the following command line:</div>
<div>> mpiexec.hydra -hosts node2-a,node2-b -bootstrap slurm -rmk slurm -n 1 ./master : -n 7 ./slave</div><div><br></div><div>Using pstree, I can see that this launches the following command:</div><div>> srun -N 4 -n 4 /usr/local/bin/hydra_pmi_proxy --control-port node2-head:38860 --rmk slurm --launcher slurm --demux poll --pgid 0 --retries 10 --usize -2 --proxy-id -1</div>
<div><br></div><div>Apparently, Hydra launches a single 'hydra_pmi_proxy' per node, after which each hydra_pmi_proxy launches the other processes itself. This completely screws up the allocation rules in SLURM.</div><div><br>
</div><div>Is this normal behaviour? I would rather have mpiexec.hydra request the right number of resources from SLURM, instead of seemingly deciding on its own which hosts to use.</div><div><div><br></div><div>Kind regards,</div>
<div>Ruben FAELENS</div>
</div></div>