[mpich-discuss] Fwd: mpiexec.hydra creates unexpectable TCP socket.

Anatoly G anatolyrishon at gmail.com
Thu Jan 1 05:35:16 CST 2015


Dear MPICH,
I have some additional information.
This "strange configuration" (hydra connected to a computer that is not in
the list) is the result of an unhandled failure of the Main process (similar
to an abort() call) that did not kill its child process (hydra).
As a result, I can see that the "init" process becomes the parent of the
hydra process.
Can you please refer me to a document explaining hydra's behavior when its
parent process is dead (an emergency situation)?
I understand that this situation shouldn't happen and that this bug will be
fixed, but I'm curious about the hydra logic.
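
One mechanism that would produce this picture is sketched below. It is an assumption, not something confirmed on this thread. If Main opened its controller socket before launching mpiexec.hydra and the descriptor was not marked close-on-exec, the child would inherit it, and after Main dies netstat would attribute the connection to the surviving child. In this POSIX-only Python sketch, a loopback listener stands in for the controller, and the names CHILD_SRC and child_sees_parent_socket are hypothetical.

```python
import socket
import subprocess
import sys

# Child program: adopt the inherited descriptor and report the peer's port.
CHILD_SRC = (
    "import socket, sys\n"
    "s = socket.socket(fileno=int(sys.argv[1]))\n"
    "print(s.getpeername()[1])\n"
)


def child_sees_parent_socket():
    """Return True if a spawned child can use a socket opened by its parent."""
    # Loopback listener standing in for the controller (hypothetical stand-in).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    # The parent's connection, standing in for Main's controller socket.
    client = socket.create_connection(("127.0.0.1", port))
    client.set_inheritable(True)  # assumption: no close-on-exec flag is set
    fd = client.fileno()

    # Launch the child with the descriptor left open across exec.
    out = subprocess.run(
        [sys.executable, "-c", CHILD_SRC, str(fd)],
        pass_fds=[fd],
        capture_output=True,
        text=True,
        check=True,
    ).stdout

    client.close()
    server.close()
    return int(out) == port
```

If this is what happens, marking the socket close-on-exec in Main before starting mpiexec.hydra would keep the descriptor out of the child.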

Regards,
Anatoly.

---------- Forwarded message ----------
From: Anatoly G <anatolyrishon at gmail.com>
Date: Wed, Dec 24, 2014 at 1:00 PM
Subject: mpiexec.hydra creates unexpectable TCP socket.
To: discuss at mpich.org


Dear MPICH,
I'm using MPICH 3.1 (hydra + MPI).
I run a main application (Main), which calls mpiexec.hydra in the following
way:

mpiexec.hydra -genvall -disable-auto-cleanup -f MpiConfigMachines.txt \
    -launcher=ssh -n 3 MPI_Prog

MpiConfigMachines.txt content:
10.3.2.100:1
10.3.2.101:2

Here 10.3.2.100 is the local host.
As a result I get:

   - Main + a single MPI_Prog process on the local computer
   - 2 MPI_Prog processes on the remote one.
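
The placement above follows from the host:slots entries in the machine file. A minimal Python sketch of that arithmetic is below. The helper names parse_hostfile and place are hypothetical, and filling hosts in file order with wrap-around is an assumption of this sketch, not a statement of hydra's actual allocation code.

```python
def parse_hostfile(text):
    """Parse 'host:slots' lines into (host, slots) pairs; slots defaults to 1."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        host, _, slots = line.partition(":")
        hosts.append((host, int(slots) if slots else 1))
    return hosts


def place(hosts, nprocs):
    """Fill each host's slots in file order, wrapping around if needed."""
    if sum(slots for _, slots in hosts) <= 0:
        raise ValueError("hostfile provides no slots")
    counts = {host: 0 for host, _ in hosts}
    remaining = nprocs
    while remaining > 0:
        for host, slots in hosts:
            take = min(slots, remaining)
            counts[host] += take
            remaining -= take
            if remaining == 0:
                break
    return counts


HOSTFILE = "10.3.2.100:1\n10.3.2.101:2\n"
placement = place(parse_hostfile(HOSTFILE), 3)
print(placement)  # {'10.3.2.100': 1, '10.3.2.101': 2}
```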

The Main application establishes a TCP socket with the local MPI_Prog.
The Main application also establishes a TCP socket with a controller on
another computer, 10.3.2.170, which is not included in the
MpiConfigMachines.txt file.

After running for some time (hours, sometimes days), I see via netstat that
a new connection has been created between mpiexec.hydra and the controller.

Before executing mpiexec.hydra I set the environment variable

setenv MPIEXEC_PORT_RANGE 50010:65535

According to the manual, this variable limits hydra's destination ports to
[50010:65535].


I see that hydra uses these ports with MPI_Prog, but the connection with the
controller is made on port 701 (on the controller computer).
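
To make the mismatch concrete, the following Python sketch parses a low:high value such as the one given to MPIEXEC_PORT_RANGE and checks whether a port falls inside it. The helper names parse_port_range and port_in_range are hypothetical. The sketch only does the arithmetic and makes no claim about how hydra itself interprets the variable.

```python
def parse_port_range(value):
    """Parse a 'low:high' string into a validated (low, high) tuple."""
    low_text, sep, high_text = value.partition(":")
    if not sep:
        raise ValueError("expected 'low:high', got %r" % value)
    low, high = int(low_text), int(high_text)
    if not 0 < low <= high <= 65535:
        raise ValueError("invalid port range %r" % value)
    return low, high


def port_in_range(port, value):
    """Return True if port lies within the inclusive 'low:high' range."""
    low, high = parse_port_range(value)
    return low <= port <= high


PORT_RANGE = "50010:65535"
print(port_in_range(50010, PORT_RANGE))  # True
print(port_in_range(701, PORT_RANGE))    # False
```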


The controller program is a server. It can only accept connections.


Can you please advise how to deal with this problem?

How does hydra learn the controller's IP and establish a connection with it?


Sincerely,

Anatoly.