[mpich-discuss] How to set port range used under LSF?

Zhou, Hui zhouh at anl.gov
Tue Mar 9 10:21:26 CST 2021


Hi Sendu,

Could you try `MPIEXEC_PORT_RANGE` instead?

I know it is confusing, and the documentation probably needs an update or correction, but `MPICH_PORT_RANGE` is for `MPICH` rather than `MPIEXEC`.
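
For illustration, a minimal sketch of a job script body that sets both variables to the range from the original message. The `PORT_RANGE` helper variable and the `mpiexec` availability check are assumptions added for the example; whether this is enough to constrain the `blaunch` listeners under LSF is not confirmed here.

```shell
#!/bin/sh
# Sketch only: give the launcher and the MPI library the same port range.
# PORT_RANGE is a hypothetical helper variable, not an MPICH setting.
PORT_RANGE="46107:46140"

# Per the reply above: MPIEXEC_PORT_RANGE is the one meant for mpiexec,
# MPICH_PORT_RANGE the one meant for MPICH itself.
export MPIEXEC_PORT_RANGE="$PORT_RANGE"
export MPICH_PORT_RANGE="$PORT_RANGE"

# Launch the example only if mpiexec is on PATH, so a missing
# environment setup is reported clearly instead of failing obscurely.
if command -v mpiexec >/dev/null 2>&1; then
    mpiexec mpich/examples/cpi
else
    echo "mpiexec not found on PATH" >&2
fi
```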

--
Hui Zhou


From: Sendu Bala via discuss <discuss at mpich.org>
Date: Tuesday, March 9, 2021 at 7:16 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: Sendu Bala <sb10 at sanger.ac.uk>
Subject: [mpich-discuss] How to set port range used under LSF?

Via a bsub, I’m doing:

MPICH_PORT_RANGE="46107:46140" mpiexec mpich/examples/cpi

When I ssh to the controlling node, I see it has spawned a set of blaunch processes with `--control-port node-12-3-2:46107` as expected, but:

ss -l -p -n | grep blaunch
tcp               LISTEN              0                    128                                                                 0.0.0.0:34361            0.0.0.0:*                                                                                users:(("blaunch",pid=2823,fd=7))
tcp               LISTEN              0                    128                                                                 0.0.0.0:46107            0.0.0.0:*                                                                                users:(("cpi",pid=2839,fd=5),("blaunch",pid=2837,fd=5),("blaunch",pid=2836,fd=5),("blaunch",pid=2835,fd=5),("blaunch",pid=2834,fd=5),("blaunch",pid=2833,fd=5),("blaunch",pid=2832,fd=5),("blaunch",pid=2831,fd=5),("blaunch",pid=2830,fd=5),("blaunch",pid=2829,fd=5),("blaunch",pid=2828,fd=5),("blaunch",pid=2827,fd=5),("blaunch",pid=2826,fd=5),("blaunch",pid=2825,fd=5),("blaunch",pid=2824,fd=5),("blaunch",pid=2823,fd=5),("hydra_pmi_proxy",pid=2822,fd=5),("mpiexec",pid=2821,fd=5))
tcp               LISTEN              0                    128                                                                 0.0.0.0:43741            0.0.0.0:*                                                                                users:(("blaunch",pid=2825,fd=12))
tcp               LISTEN              0                    128                                                                 0.0.0.0:41983            0.0.0.0:*                                                                                users:(("blaunch",pid=2830,fd=22))
tcp               LISTEN              0                    128                                                                 0.0.0.0:41215            0.0.0.0:*                                                                                users:(("blaunch",pid=2832,fd=26))
tcp               LISTEN              0                    128                                                                 0.0.0.0:34433            0.0.0.0:*                                                                                users:(("blaunch",pid=2831,fd=24))
tcp               LISTEN              0                    128                                                                 0.0.0.0:33219            0.0.0.0:*                                                                                users:(("blaunch",pid=2827,fd=16))
tcp               LISTEN              0                    128                                                                 0.0.0.0:34405            0.0.0.0:*                                                                                users:(("blaunch",pid=2837,fd=36))
tcp               LISTEN              0                    128                                                                 0.0.0.0:43465            0.0.0.0:*                                                                                users:(("blaunch",pid=2836,fd=34))
tcp               LISTEN              0                    128                                                                 0.0.0.0:39755            0.0.0.0:*                                                                                users:(("blaunch",pid=2833,fd=28))
tcp               LISTEN              0                    128                                                                 0.0.0.0:38095            0.0.0.0:*                                                                                users:(("blaunch",pid=2829,fd=20))
tcp               LISTEN              0                    128                                                                 0.0.0.0:44625            0.0.0.0:*                                                                                users:(("blaunch",pid=2834,fd=30))
tcp               LISTEN              0                    128                                                                 0.0.0.0:35345            0.0.0.0:*                                                                                users:(("blaunch",pid=2835,fd=32))
tcp               LISTEN              0                    128                                                                 0.0.0.0:43827            0.0.0.0:*                                                                                users:(("blaunch",pid=2826,fd=14))
tcp               LISTEN              0                    128                                                                 0.0.0.0:40915            0.0.0.0:*                                                                                users:(("blaunch",pid=2828,fd=18))
tcp               LISTEN              0                    128                                                                 0.0.0.0:42549            0.0.0.0:*                                                                                users:(("blaunch",pid=2824,fd=9))

Why are these all listening on ports outside my range? I’ve also tried setting MPIEXEC_PORT_RANGE and MPIR_CVAR_CH3_PORT_RANGE and still have the problem.
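
To make the comparison against the range repeatable, a small helper along these lines could filter the `ss` output. This is a sketch: the function name `ports_outside_range` is hypothetical and not part of MPICH or LSF, and it assumes the column layout shown above, with the local address in the fifth field.

```shell
# Print "PORT PROCESS-INFO" for every listening socket whose local port
# falls outside the LOW:HIGH range given as the first argument.
# Reads `ss -l -p -n`-style lines on stdin.
ports_outside_range() {
    range="$1"
    low="${range%%:*}"     # text before the colon
    high="${range##*:}"    # text after the colon
    awk -v low="$low" -v high="$high" '
        $2 == "LISTEN" {
            # The port is whatever follows the last colon of the local address.
            n = split($5, parts, ":")
            port = parts[n] + 0
            if (port < low + 0 || port > high + 0)
                print port, $NF
        }'
}

# Usage (on the controlling node):
#   ss -l -p -n | grep blaunch | ports_outside_range "46107:46140"
```

An empty result would mean every matching listener is inside the requested range.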

Is there any way to fully control the ports used?


Cheers,
Sendu.



