[mpich-discuss] Hydra WARNING: too many ssh connections

Mccall, Kurt E. (MSFC-EV41) kurt.e.mccall at nasa.gov
Fri Apr 1 16:22:39 CDT 2022


Thanks, Hui. Is the spawned process on the local host, the remote host, or both?

Kurt

From: Zhou, Hui <zhouh at anl.gov>
Sent: Friday, April 1, 2022 4:20 PM
To: discuss at mpich.org
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>
Subject: [EXTERNAL] Re: Hydra WARNING: too many ssh connections

Every time you call MPI_Comm_spawn, hydra launches an ssh connection (one per host) to create a proxy. This is certainly not ideal for applications that rely on spawning many processes.
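To make the cost concrete, here is a minimal sketch that only encodes the statement above as arithmetic: one ssh launch per host for every MPI_Comm_spawn call. The helper name ssh_launches is hypothetical and is not part of MPICH or hydra.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only (not an MPICH or hydra API).
 * Assumes the behaviour described above: every MPI_Comm_spawn call makes
 * hydra start one ssh per host. Returns the total number of ssh launches. */
long ssh_launches(long spawn_calls, long hosts)
{
    assert(spawn_calls >= 0 && hosts >= 0);
    return spawn_calls * hosts;
}
```

Under that assumption, 100 separate spawn calls across 4 hosts would cost ssh_launches(100, 4) launches, while a single call that requests all of the workers at once through the maxprocs argument of MPI_Comm_spawn would cost ssh_launches(1, 4). Whether batching is practical depends on the application.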
________________________________
From: Mccall, Kurt E. (MSFC-EV41) via discuss <discuss at mpich.org>
Sent: Friday, April 1, 2022 4:08 PM
To: discuss at mpich.org <discuss at mpich.org>
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>
Subject: [mpich-discuss] Hydra WARNING: too many ssh connections


Hi, you provided the following information about the warning "too many ssh connections":

The particular warning is issued by hydra, MPICH's process manager. The following excerpt is the comment from the source code:



        /* ssh has many types of security controls that do not allow a
         * user to ssh to the same node multiple times very
         * quickly. If this happens, the ssh daemons disables ssh
         * connections causing the job to fail. This is basically a
         * hack to slow down ssh connections to the same node. We
         * check for offset == 0 before applying this hack, so we only
         * slow down the cases where ssh is being used, and not the
         * cases where we fall back to fork. */
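
As a rough illustration of the idea in that comment, the sketch below counts ssh launches per host and asks the caller to pause after every burst. It is not hydra's actual code: the names (throttle, throttle_delay), the table size, the burst limit of 8, and the one-second pause are all assumptions chosen for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_TRACKED_HOSTS 64  /* assumed table size */
#define BURST_LIMIT 8         /* assumed launches per host between pauses */

struct host_counter {
    char name[64];            /* host names longer than 63 bytes are not handled */
    int launches;
};

struct throttle {
    struct host_counter hosts[MAX_TRACKED_HOSTS];
    int used;
};

/* Records one ssh launch to `host` and returns how many seconds the caller
 * should sleep before launching: 1 on every BURST_LIMIT-th launch to the
 * same host, otherwise 0. Hosts beyond the table size are not tracked and
 * always get 0. */
int throttle_delay(struct throttle *t, const char *host)
{
    assert(t != NULL && host != NULL);
    for (int i = 0; i < t->used; i++) {
        if (strcmp(t->hosts[i].name, host) == 0) {
            t->hosts[i].launches++;
            return (t->hosts[i].launches % BURST_LIMIT == 0) ? 1 : 0;
        }
    }
    if (t->used < MAX_TRACKED_HOSTS) {
        struct host_counter *h = &t->hosts[t->used++];
        strncpy(h->name, host, sizeof h->name - 1);
        h->name[sizeof h->name - 1] = '\0';
        h->launches = 1;
    }
    return 0;
}
```

A launcher using this sketch would zero-initialize a struct throttle, call throttle_delay before each ssh, and sleep for the returned number of seconds. Launches to different hosts never delay one another.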



Is this just during an initial ssh connection attempt? I'm trying to figure out where my code is triggering this warning. Could it be from:



  1.  MPI_Intercomm_create
  2.  MPI_Comm_spawn
  3.  others?



I'm calling mpiexec with the "-launcher ssh" option, using MPICH 4.0.1.



Thanks,

Kurt



