[mpich-discuss] MPIExec.hydra allocates too many CPU's on SLURM

Kenneth Raffenetti raffenet at mcs.anl.gov
Thu Apr 10 07:24:54 CDT 2014


What version of MPICH are you using? Why are you specifying -hosts in 
your mpiexec command? Are you allocating resources in some other way 
before launching your job?

You are correct about the normal behavior of Hydra. It launches one 
proxy on each node, and each proxy then launches the MPI processes 
assigned to that node.
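
To make the two-level launch concrete, here is a rough toy model in 
Python. It is not Hydra's code: the function name plan_launch is 
hypothetical, and the round-robin placement is an assumption chosen 
only for illustration (the real placement depends on the host list 
and the resource manager). It only shows why the launcher sees one 
task per node while eight MPI processes end up running.

```python
from collections import OrderedDict


def plan_launch(nodes, segments):
    """Toy model of a two-level launch (hypothetical, not Hydra's code).

    nodes:    list of node names; one proxy is started on each.
    segments: list of (count, command) pairs, mirroring the
              colon-separated parts of an mpiexec command line.

    Ranks are placed round-robin across nodes. That placement rule
    is an assumption made for illustration only.
    """
    # Expand the segments into one command per MPI rank.
    ranks = [cmd for count, cmd in segments for _ in range(count)]

    # One entry per node stands for the single proxy on that node.
    plan = OrderedDict((node, []) for node in nodes)
    for rank, cmd in enumerate(ranks):
        node = nodes[rank % len(nodes)]
        plan[node].append((rank, cmd))
    return plan


plan = plan_launch(["node2-a", "node2-b"], [(1, "./master"), (7, "./slave")])
for node, procs in plan.items():
    # The launcher starts only the proxy; the proxy starts the rest.
    print(f"proxy on {node} starts {len(procs)} processes")
```

In this model the launcher is asked for two tasks (the proxies), 
while the proxies together start eight processes, so the resource 
manager only accounts for them correctly if the allocation was sized 
for the MPI processes beforehand.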

On 04/10/2014 06:49 AM, Ruben Faelens wrote:
> Hello list,
>
> I have been messing around with SLURM and MPICH2 for about a week now,
> in order to run NONMEM on my cluster.
>
> I start my job using the following command line:
>     mpiexec.hydra -hosts node2-a,node2-b -bootstrap slurm -rmk slurm -n 1 ./master : -n 7 ./slave
>
> Using pstree, I can see that this launches the following command:
>     srun -N 4 -n 4 /usr/local/bin/hydra_pmi_proxy --control-port node2-head:38860 --rmk slurm --launcher slurm --demux poll --pgid 0 --retries 10 --usize -2 --proxy-id -1
>
> Apparently, Hydra launches one single 'hydra_pmi_proxy' per node, after
> which the hydra_pmi_proxy launches the other processes. This completely
> screws up allocation rules in SLURM.
>
> Is this normal behaviour? I would rather have mpiexec.hydra allocate the
> right number of resources on SLURM, instead of seemingly deciding on its
> own which hosts to use.
>
> Kind regards,
> Ruben FAELENS


