[mpich-discuss] A few inconveniences with MPICH 3.3 build
Martin Cuma
martin.cuma at utah.edu
Tue Feb 12 15:36:03 CST 2019
Hi everyone,
I was wondering if someone could give me some feedback on a couple of quirks
I found with MPICH 3.3 that I did not observe with 3.2.1.
1. If CUDA is found on a machine where MPICH is built,
/lib64/libXNVCtrl.so.0 gets linked into the MPICH library (libmpi.so). I
tried the --without-x and --with-x=no options, but neither affects this.
If I build on a machine that does not have CUDA, /lib64/libXNVCtrl.so.0 is
not included and the library seems to work the same (on typical MPI
applications).
This is annoying if we want to have a common MPICH for machines with and
without CUDA installed, e.g. compute nodes with and without GPUs.
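For reference, this is roughly how I check it; the install prefix below is
just an example from our setup:

  # build with X explicitly disabled
  ./configure --prefix=/opt/mpich-3.3 --without-x
  make && make install

  # check whether libXNVCtrl still shows up as a runtime dependency
  readelf -d /opt/mpich-3.3/lib/libmpi.so | grep NEEDED
  ldd /opt/mpich-3.3/lib/libmpi.so | grep -i xnvctrl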
2. It seems that MPICH now requires the --with-slurm option to pick up the
SLURM hostlist (i.e. to run without the -machinefile option). This was not
the case with 3.2.1, where MPICH picked up the SLURM hostlist without needing
to be
built with --with-slurm. The path to SLURM also gets encoded in the RPATH
of mpirun and friends. I'd rather control that myself if possible.
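To see what gets baked in, I just look at the dynamic section of the launcher
(prefix is again an example; mpirun is a link to the hydra launcher):

  readelf -d /opt/mpich-3.3/bin/mpirun | grep -E 'RPATH|RUNPATH'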
This is an inconvenience for us as well, since we have different clusters
with SLURM installed on different file systems/paths, and I'd also like to
use the same MPICH build on desktops that don't have SLURM at all. I could
put $ORIGIN into the RPATH and stick libslurm.so in the MPICH bin directory,
since we tend to run the same SLURM version on all the clusters, but that's
still clunky.
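Roughly what I have in mind for that workaround, assuming patchelf is
available (the paths below are just placeholders):

  # copy the cluster's libslurm next to the launchers
  cp /path/to/slurm/lib/libslurm.so /opt/mpich-3.3/bin/

  # make the launcher look in its own directory first, then the MPICH lib dir
  patchelf --set-rpath '$ORIGIN:/opt/mpich-3.3/lib' /opt/mpich-3.3/bin/mpiexec.hydra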
Again, I'd appreciate it if someone could shed some light on this or suggest
some good workarounds.
Thanks,
MC
--
Martin Cuma
Center for High Performance Computing
Department of Geology and Geophysics
University of Utah