[mpich-discuss] MPI_Comm_spawn crosses node boundaries
Raffenetti, Ken
raffenet at anl.gov
Fri Jan 28 10:49:02 CST 2022
Are you using mpiexec or srun when initially launching your job? Hydra (mpiexec) should support the "host" info key, but I'm not sure if srun will.
Ken
On 1/28/22, 10:41 AM, "Mccall, Kurt E. (MSFC-EV41) via discuss" <discuss at mpich.org> wrote:
Hi,
Running MPICH under Slurm, MPI_Comm_spawn unexpectedly creates new processes on any and all of the nodes that Slurm allocates to the job. I would like it to only create new processes locally on the node that called MPI_Comm_spawn.
I’ve tried passing MPI_Comm_spawn an info object created like this:
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info, "host", host_name);
MPI_Info_set(info, "bind_to", "core");
where host_name = "n001" or even the full name "n001.cluster.pssclabs.com",
but that doesn’t prevent the problem. Any suggestions?
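For reference, a minimal sketch of the approach described above: query the local node's name with MPI_Get_processor_name and pass it through the "host" info key to MPI_Comm_spawn. The worker executable name "./worker" and the spawn count are placeholders; as Ken notes, this relies on the process manager (Hydra/mpiexec) honoring the "host" key, so it needs an MPI launcher to run.

```c
/* Sketch (assumed setup): spawn workers on the caller's own node by
 * passing the local hostname via the "host" info key. Launch under
 * mpiexec (Hydra), which honors "host"; srun may not. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    char host_name[MPI_MAX_PROCESSOR_NAME];
    int len;
    MPI_Get_processor_name(host_name, &len);  /* name of the local node */

    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", host_name);    /* ask for spawns on this node */

    MPI_Comm intercomm;
    int errcodes[2];
    /* "./worker" is a placeholder for the actual worker binary. */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, info,
                   0, MPI_COMM_SELF, &intercomm, errcodes);

    MPI_Info_free(&info);
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}
```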
Thanks,
Kurt