[mpich-discuss] Hydra WARNING: too many ssh connections

Zhou, Hui zhouh at anl.gov
Fri Apr 1 16:19:32 CDT 2022


Every time you call MPI_Comm_spawn, hydra launches an ssh connection to each host involved in order to create a proxy there. That is certainly not ideal for applications that rely on spawning many processes.
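
To make the cost concrete, here is a rough sketch of the arithmetic implied above. It assumes exactly one ssh launch per host for every MPI_Comm_spawn call; the helper name and the example numbers are hypothetical, and hydra may behave differently in some cases (for instance when it falls back to fork).

```c
/* Hypothetical helper, not part of MPICH: estimates how many ssh
 * launches hydra performs, ASSUMING one ssh per host for every
 * MPI_Comm_spawn call, as described above. */
unsigned long estimated_ssh_launches(unsigned long spawn_calls,
                                     unsigned long hosts_per_call)
{
    return spawn_calls * hosts_per_call;
}
```

Under that assumption, spawning 64 workers one at a time (one host per call) gives estimated_ssh_launches(64, 1) = 64 ssh launches, whereas a single MPI_Comm_spawn with maxprocs = 64 spread over 4 hosts gives estimated_ssh_launches(1, 4) = 4. Batching spawns into fewer calls is therefore one plausible way to stay under the limit.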
________________________________
From: Mccall, Kurt E. (MSFC-EV41) via discuss <discuss at mpich.org>
Sent: Friday, April 1, 2022 4:08 PM
To: discuss at mpich.org <discuss at mpich.org>
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>
Subject: [mpich-discuss] Hydra WARNING: too many ssh connections


Hi, you provided the following information about the warning “too many ssh connections”:



This particular warning is issued by hydra, MPICH’s process manager. The following excerpt is the comment from that source code:



        /* ssh has many types of security controls that do not allow a
         * user to ssh to the same node multiple times very
         * quickly. If this happens, the ssh daemons disables ssh
         * connections causing the job to fail. This is basically a
         * hack to slow down ssh connections to the same node. We
         * check for offset == 0 before applying this hack, so we only
         * slow down the cases where ssh is being used, and not the
         * cases where we fall back to fork. */
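
The kind of slow-down the comment describes can be sketched as a per-host rate limiter. This is only an illustration, not hydra's actual code: the struct, the function name, and the limit and window values are all hypothetical.

```c
#include <stddef.h>

#define SSH_LIMIT 8        /* hypothetical: max launches per window */
#define SSH_WINDOW_SECS 15 /* hypothetical: window length in seconds */

/* Launch history for ONE host: a ring of the last SSH_LIMIT launch times. */
struct ssh_history {
    long stamps[SSH_LIMIT]; /* launch times, in seconds */
    size_t count;           /* total launches recorded so far */
};

/* Returns how many seconds the caller should wait before launching
 * another ssh to this host at time `now`, and records the launch as
 * happening at now + wait. */
long ssh_throttle_delay(struct ssh_history *h, long now)
{
    long wait = 0;
    size_t slot = h->count % SSH_LIMIT;

    if (h->count >= SSH_LIMIT) {
        /* stamps[slot] holds the oldest of the last SSH_LIMIT launches. */
        long elapsed = now - h->stamps[slot];
        if (elapsed < SSH_WINDOW_SECS)
            wait = SSH_WINDOW_SECS - elapsed;
    }
    h->stamps[slot] = now + wait; /* overwrite the oldest entry */
    h->count++;
    return wait;
}
```

A caller would zero-initialize one struct ssh_history per host, call ssh_throttle_delay() before each launch, and, when the result is nonzero, print a warning and sleep for that many seconds. With the values above, eight back-to-back launches to one host proceed immediately and the ninth is delayed.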



Is this just during an initial ssh connection attempt?  I’m trying to figure out where my code is triggering this warning.  Could it be from



  1.  MPI_Intercomm_create
  2.  MPI_Comm_spawn
  3.  others?



I’m calling mpiexec with the “--launcher ssh” option, using MPICH 4.0.1.



Thanks,

Kurt



