[mpich-discuss] MPI_Comm_spawn crosses node boundaries

Mccall, Kurt E. (MSFC-EV41) kurt.e.mccall at nasa.gov
Fri Jan 28 13:43:02 CST 2022


I have built without the pmi2 option in the past, and MPI_Comm_spawn didn't work at all.   My memory is a little fuzzy, so I'll try that again to make sure.   In the meantime, can you recommend an option for building Slurm that MPICH also supports and has tested?   Here is what we have now:

$ srun --mpi=list
srun: MPI types are...
srun: cray_shasta
srun: pmi2
srun: none
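
If I'm reading your suggestion below correctly, the rebuild would just be my configure line (quoted further down) with the --with-pmi=pmi2/simple option dropped and everything else left unchanged, i.e. roughly:

../mpich-4.0rc3/configure --prefix=/opt/mpich --with-device=ch3:nemesis --disable-fortran -enable-debuginfo --enable-g=debug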

-----Original Message-----
From: Raffenetti, Ken <raffenet at anl.gov> 
Sent: Friday, January 28, 2022 1:35 PM
To: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>; discuss at mpich.org
Subject: [EXTERNAL] Re: [mpich-discuss] MPI_Comm_spawn crosses node boundaries

I'm just now realizing that we don't test spawn functionality with our pmi2 implementation. Can you try rebuilding without that option and see if it works as expected?

Ken

On 1/28/22, 1:18 PM, "Mccall, Kurt E. (MSFC-EV41)" <kurt.e.mccall at nasa.gov> wrote:

    ../mpich-4.0rc3/configure --prefix=/opt/mpich --with-pmi=pmi2/simple --with-device=ch3:nemesis --disable-fortran  -enable-debuginfo --enable-g=debug

    -----Original Message-----
    From: Raffenetti, Ken <raffenet at anl.gov> 
    Sent: Friday, January 28, 2022 1:16 PM
    To: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>; discuss at mpich.org
    Subject: [EXTERNAL] Re: [mpich-discuss] MPI_Comm_spawn crosses node boundaries

    From what I can tell, the info keyvals are not being sent to the process manager, which would explain what you are seeing. The next step is to investigate why that is happening. What is your MPICH ./configure line? That'll help narrow down where to look.

    Ken

    On 1/28/22, 12:49 PM, "Mccall, Kurt E. (MSFC-EV41)" <kurt.e.mccall at nasa.gov> wrote:

        Ken, 

        There is a lot of my own output mixed up in the "mpiexec -v" output.   I hope you can make sense of this.

        Thanks,
        Kurt

        -----Original Message-----
        From: Raffenetti, Ken <raffenet at anl.gov> 
        Sent: Friday, January 28, 2022 12:36 PM
        To: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>; discuss at mpich.org
        Subject: [EXTERNAL] Re: [mpich-discuss] MPI_Comm_spawn crosses node boundaries

        "ip_address" won't be recognized, only "host", "hosts", or "hostfile". Could you run an example using "mpiexec -v" and capture/share the output? That should help tell us if the hostname information is being fed correctly to the process manager by the spawn command.

        Ken

        On 1/28/22, 11:35 AM, "Mccall, Kurt E. (MSFC-EV41)" <kurt.e.mccall at nasa.gov> wrote:

            Ken,

            I'm using sbatch, which calls a bash script that calls mpiexec (4.0rc3).   Which host name convention is correct, the short or the long host name?   Would the "ip_address" info key work?

            Kurt

            -----Original Message-----
            From: Raffenetti, Ken <raffenet at anl.gov> 
            Sent: Friday, January 28, 2022 10:49 AM
            To: discuss at mpich.org
            Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>
            Subject: [EXTERNAL] Re: [mpich-discuss] MPI_Comm_spawn crosses node boundaries

            Are you using mpiexec or srun when initially launching your job? Hydra (mpiexec) should support the "host" info key, but I'm not sure if srun will.

            Ken

            On 1/28/22, 10:41 AM, "Mccall, Kurt E. (MSFC-EV41) via discuss" <discuss at mpich.org> wrote:

                Hi,

                When I run MPICH under Slurm, MPI_Comm_spawn unexpectedly creates new processes on any and all of the nodes that Slurm has allocated to the job.   I would like it to create new processes only on the node that called MPI_Comm_spawn.

                I’ve tried passing MPI_Comm_spawn an info object created like this:

                        MPI_Info info;
                        MPI_Info_create(&info);
                        MPI_Info_set(info, "host", host_name);
                        MPI_Info_set(info, "bind_to", "core");

                where host_name = “n001” or even the full name “n001.cluster.pssclabs.com”,

                but that doesn’t prevent the problem.  Any suggestions?
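
                For context, the spawn call itself is essentially the following, stripped down to a minimal sketch ("./worker", the count of 1, and the missing error checking are placeholders, not my real code):

                        #include <mpi.h>

                        int main(int argc, char *argv[])
                        {
                            MPI_Comm intercomm;
                            MPI_Info info;

                            MPI_Init(&argc, &argv);

                            MPI_Info_create(&info);
                            /* Ask for the spawned process on this node; "n001" is an example. */
                            MPI_Info_set(info, "host", "n001");
                            MPI_Info_set(info, "bind_to", "core");

                            /* "./worker" is a placeholder for the spawned executable. */
                            MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1, info, 0,
                                           MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

                            MPI_Info_free(&info);
                            MPI_Finalize();
                            return 0;
                        }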

                Thanks,
                Kurt





