[mpich-discuss] MPICH Hydra mpiexec and Slurm job allocation

Raffenetti, Kenneth J. raffenet at mcs.anl.gov
Thu Dec 5 08:34:57 CST 2019


On 12/5/19 6:23 AM, Stefan via discuss wrote:
> Tested with the nightly mpich-master-v3.4a1-405-gcb944ba and it does indeed
> work, great! I've compiled 3.3.2 with the slurm_query_node_list.c file from
> master, and that seems to work fine too. I hope this makes it into the next
> 3.3 patch release.
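> 
> Something like this did the trick (the exact paths may differ between
> versions; I located the file with find first):
> 
> $ find . -name slurm_query_node_list.c
> $ cp mpich-master/src/pm/hydra/tools/bootstrap/external/slurm_query_node_list.c \
>      mpich-3.3.2/src/pm/hydra/tools/bootstrap/external/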
> 
> I've noticed that both nodelist detection and the Slurm launcher work
> fine even without specifying the Slurm installation directory (Slurm is
> installed _outside_ of /usr, and LD_LIBRARY_PATH is empty when I compile).
> 
> Is there any benefit to using --with-slurm=/path/to/slurm?
> 
> 
> I have another vaguely related question. When I configure with
>    --with-pmix=/sw/pmix/2.2.3
> it seems like Hydra is silently disabled. I don't get any warning or
> other message, and even when I explicitly specify
>    --with-pm=hydra
> there is no mpiexec in the bin folder after make install.
> 
> Is there a way to use PMIx and Hydra, or is this planned for the future?

No, Hydra does not support PMIx. We do not have plans to support PMIx 
with Hydra in the near future.

If Slurm on your cluster is configured with PMIx support, you should be 
able to launch MPICH+PMIx applications with srun instead of mpiexec.
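
For example, inside the salloc allocation (assuming Slurm was built with its
PMIx plugin; "srun --mpi=list" shows which MPI plugins are available):

$ salloc -N 2 --ntasks-per-node 4
$ srun --mpi=pmix ./mpich_hello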

Ken

> 
> /Stefan
> 
> On Wed, 4 Dec 2019 23:05:46 +0000
> "Congiu, Giuseppe" <gcongiu at anl.gov> wrote:
> 
>> That is a bug that should now be fixed in current master. Additionally, we have added a short test program, slurm_nodelist_parse, in `src/pm/hydra/maint` that you can use to check whether your node list format will cause problems with MPICH. If it fails, please include the output of that program in your message and we will look into it.
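>>
>> For example (the build line is illustrative; check the sources in
>> src/pm/hydra/maint for the exact build steps; the parser takes the node
>> list from the SLURM_NODELIST environment variable, as mpiexec does):
>>
>> $ cd src/pm/hydra/maint
>> $ gcc -o slurm_nodelist_parse slurm_nodelist_parse.c
>> $ SLURM_NODELIST='node-b[01-02]' ./slurm_nodelist_parse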
>>
>> Giuseppe Congiu
>> Postdoctoral Appointee
>> MCS Division
>> Argonne National Laboratory
>> 9700 South Cass Ave., Lemont, IL 60439
>>
>>
>>
>> On Dec 4, 2019, at 4:28 PM, Stefan via discuss <discuss at mpich.org> wrote:
>>
>> Hi,
>>
>> I'm having some issues making mpirun/mpiexec play nicely with Slurm
>> allocations. I'm using Slurm 19.05.4 and have configured MPICH with:
>> --enable-shared --enable-static --with-slurm=/sw/slurm/19.05.4 \
>> --with-pm=hydra
>>
>> Now I request resources from Slurm with:
>> $ salloc -N 2 --ntasks-per-node 4
>>
>> Then when I try to run a test binary:
>> $ mpiexec.hydra ./mpich_hello
>> Error: node list format not recognized. Try using '-hosts=<hostnames>'.
>> Aborted (core dumped)
>>
>> When I do the same with Open MPI's mpirun/mpiexec, it runs on the allocated
>> nodes. Am I missing something, or does MPICH simply not support this use case?
>>
>> Currently I'm working around this with a script that translates the Slurm
>> node allocation into a host list, running like this:
>> $ mpiexec.hydra -hosts $(mpich-host) ./mpich_hello
>>
>> That works fine, but I suppose this workaround should not be necessary.
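>>
>> For reference, a minimal sketch of such a script (scontrol expands Slurm's
>> compact node list into one hostname per line):
>>
>> #!/bin/sh
>> # Turn SLURM_JOB_NODELIST (e.g. node-b[01-02]) into the comma-separated
>> # host list that mpiexec.hydra's -hosts option expects.
>> scontrol show hostnames "$SLURM_JOB_NODELIST" | paste -sd, -
>>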
>> Here is ltrace output showing that mpiexec tries to process some
>> Slurm-related environment variables but apparently fails to do so:
>> https://paste.ubuntu.com/p/327tGrTzq5/
>>
>> I've also tried with salloc -N 1 -n 1, so that the environment variables
>> are simpler, e.g.
>> SLURM_NODELIST=node-b01
>> SLURM_TASKS_PER_NODE=1
>> but that did not change the way mpiexec fails.
>>
>> /Stefan

