[mpich-discuss] Installing MPICH on clusters

Feimi Yu yuf2 at rpi.edu
Fri Sep 17 10:55:32 CDT 2021


Hi,

I'm working on a supercomputer that provides only the Spectrum MPI
implementation as a module. Since our code does not perform well with
Spectrum MPI, I decided to install an MPICH build on our own partition
(I'm not an administrator). The supercomputer runs RHEL 8 on the
ppc64le architecture, with Slurm as the process manager. I tried several
build options from the user guide but could not run a job, so I have a
few questions. Here is what I tried:

1. Build with Hydra PM. I could not launch a job with Hydra at all.

2. Then I decided to build with the ``--with-pm=none`` option and use
srun + ``mpiexec -f hostfile`` to launch my job. What confuses me is the
PMI setting:

``srun --mpi=list`` gives the following:

srun: mpi/mpichgm
srun: mpi/mpichmx
srun: mpi/none
srun: mpi/mvapich
srun: mpi/openmpi
srun: mpi/pmi2
srun: mpi/lam
srun: mpi/mpich1_p4
srun: mpi/mpich1_shmem
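
A minimal sketch for checking whether a given PMI flavor appears in a
list like the one above. The helper name ``has_mpi_plugin`` is
hypothetical, and it assumes the output consists of lines of the exact
form ``srun: mpi/<name>``, as shown here; other Slurm versions may print
the list differently.

```shell
# Hypothetical helper: succeeds if plugin "$1" appears in
# `srun --mpi=list` style output read from stdin.
# Assumes each plugin is listed on its own line as "srun: mpi/<name>".
has_mpi_plugin() {
    # -F: match the string literally, -x: match the whole line, -q: quiet
    grep -qxF "srun: mpi/$1"
}

# Example use (srun may print the list on stderr, hence 2>&1):
#   srun --mpi=list 2>&1 | has_mpi_plugin pmi2 && echo "pmi2 is available"
```

Applied to the list above, the check succeeds for ``pmi2`` and fails for
``pmix``.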

At first I tried to use PMIx, since I found PMIx libraries, but that
didn't do the trick: it segfaults in PMPI_Init_thread(). The error
message is:

[dcs135:2312190] PMIX ERROR: NOT-FOUND in file client/pmix_client.c at line 562

Abort(1090831) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(159):
MPID_Init(509).......:
MPIR_pmi_init(92)....: PMIX_Init returned -46
[dcs135:2312190:0:2312190] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))

Then I switched to pmi2, but make keeps failing with undefined
references to the PMI2 library. (Actually, I couldn't find the pmi2
libraries either.)

Then I used ``--with-pmi=slurm``, and it turned out that I couldn't
locate the Slurm header files. I guess I don't have permission to
access them.
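
A minimal sketch of how one might search for the PMI2 header and library
as an unprivileged user. The helper name ``find_pmi2`` is hypothetical,
and the file names ``pmi2.h`` and ``libpmi2.so*`` are an assumption
about how the site's Slurm packages its PMI2 client; the prefixes to
search are also site-specific.

```shell
# Hypothetical helper: print any PMI2 header or library found under the
# given prefixes. Directories that cannot be read are skipped silently.
# Assumes the files are named pmi2.h and libpmi2.so* (not guaranteed).
find_pmi2() {
    for prefix in "$@"; do
        find "$prefix" \( -name 'pmi2.h' -o -name 'libpmi2.so*' \) -print 2>/dev/null
    done
}

# Example use (prefixes are guesses, adjust for the cluster):
#   find_pmi2 /usr /opt
```

If this prints nothing, the files are either not installed on the node
where it was run or sit in a location that is not readable.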

Is it still possible for me to build a usable MPICH as an unprivileged
user? If so, what do I need to do to get PMI working?

Thanks!

Feimi