[mpich-discuss] MPI_Comm_spawn crosses node boundaries

Raffenetti, Ken raffenet at anl.gov
Fri Feb 4 11:02:19 CST 2022


When running with srun, you need to use the Slurm PMI library, not the embedded Simple PMI2 library. Simple PMI2 is API compatible, but it uses a different wire protocol than the Slurm implementation. Try this instead:

  configure --with-slurm=/opt/slurm --with-pmi=slurm

This will link the Slurm PMI library into MPICH. I do acknowledge how confusing this must be to users :). It would probably make a good FAQ topic for our GitHub Discussions page.
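
For reference, a complete sequence might look something like the following. The install prefix, process count, and program name are only placeholders, and you may need to check "srun --mpi=list" to see which PMI plugins your Slurm build actually provides:

  # rebuild MPICH against the Slurm-provided PMI library
  ./configure --prefix=$HOME/mpich-4.0-slurm \
              --with-slurm=/opt/slurm --with-pmi=slurm
  make -j8 && make install

  # then launch directly with srun rather than mpiexec
  srun -n 2 ./your_app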

Ken

On 2/3/22, 7:00 PM, "Mccall, Kurt E. (MSFC-EV41)" <kurt.e.mccall at nasa.gov> wrote:

    Ken,

    I'm trying to build MPICH 4.0 in several ways, one of which will be what you suggested below. For this particular attempt, which follows the Slurm MPI guide, I built it with

    configure --with-slurm=/opt/slurm --with-pmi=pmi2/simple <etc>

    and invoked it with

    srun --mpi=pmi2 <etc>

    The job is crashing with the messages below. Any idea what is wrong?

    slurmstepd: error: mpi/pmi2: no value for key  in req
    slurmstepd: error: mpi/pmi2: no value for key  in req
    slurmstepd: error: mpi/pmi2: no value for key <99>è­þ^? in req
    slurmstepd: error: mpi/pmi2: no value for key  in req
    slurmstepd: error: mpi/pmi2: no value for key  in req
    slurmstepd: error: mpi/pmi2: no value for key ´2¾ÿ^? in req
    slurmstepd: error: mpi/pmi2: no value for key ; in req
    slurmstepd: error: mpi/pmi2: no value for key  in req
    slurmstepd: error: *** STEP 52227.0 ON n001 CANCELLED AT 2022-02-03T18:48:02 ***

    -----Original Message-----
    From: Raffenetti, Ken <raffenet at anl.gov> 
    Sent: Friday, January 28, 2022 3:15 PM
    To: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>; discuss at mpich.org
    Subject: [EXTERNAL] Re: [mpich-discuss] MPI_Comm_spawn crosses node boundaries

    On 1/28/22, 2:22 PM, "Mccall, Kurt E. (MSFC-EV41)" <kurt.e.mccall at nasa.gov> wrote:

        Ken,

        I confirmed that MPI_Comm_spawn fails completely if I build MPICH without the PMI2 option.

    Dang, I thought that would work :(.

        Looking at the Slurm documentation https://slurm.schedmd.com/mpi_guide.html#intel_mpiexec_hydra
        it states  "All MPI_comm_spawn work fine now going through hydra's PMI 1.1 interface."   The full quote is below for reference.

        1) how do I build MPICH to support hydra's PMI 1.1 interface?

    That is the default, so no extra configuration should be needed. One thing I notice in your log output is that the Slurm envvars seem to have changed names from what we have in our source, e.g. SLURM_JOB_NODELIST vs. SLURM_NODELIST. Do your initial processes launch on the right nodes?
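
    If it helps, a default build plus a Hydra launch inside a Slurm allocation would look roughly like this (the install prefix, node count, and program name are only placeholders):

      # default build: Hydra process manager with its PMI 1.1 interface
      ./configure --prefix=$HOME/mpich-4.0-default
      make -j8 && make install

      # get an allocation from Slurm, then launch with mpiexec (Hydra)
      salloc -N 2
      mpiexec -n 2 ./your_app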

        2) Can you offer any guesses on how to build Slurm to do the same?  (I realize this isn't a Slurm forum  😊)

    Hopefully you don't need to rebuild Slurm to do it. What you could try is configuring MPICH against the Slurm PMI library. Add "--with-pm=none --with-pmi=slurm --with-slurm=<path/to/install>". Then use srun instead of mpiexec and see how it goes.
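
    As a rough sketch of that workflow (the install paths, process count, and program name below are just placeholders):

      # build MPICH without its own process manager, linked against Slurm's PMI
      ./configure --with-pm=none --with-pmi=slurm --with-slurm=/opt/slurm
      make -j8 && make install

      # then launch the job with srun instead of mpiexec
      srun -n 2 ./your_app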

    Ken



