<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
Hi,
<div class=""><br class="">
</div>
<div class="">As noted, I had already tried MPIEXEC_PORT_RANGE with no difference.</div>
<div class=""><br class="">
</div>
<div class="">I don’t know whether it’s normal for blaunch to open its own ports, in which case this has nothing to do with mpich or mpiexec at all. The follow-on issue, though, is that when I ssh to one of the other nodes, nothing is listening on the control port; I only see a process
 listening on one of these other ports outside my specified range. I think this might be related to my fundamental problem of mpiexec failing to do anything under LSF most of the time at high host counts (see my previous thread on this list).</div>
<div class=""><br class="">
</div>
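<div class="">In case it helps anyone repeat the check, here is a rough Python sketch (my own helper, not anything provided by MPICH or LSF) that takes the text printed by `ss -l -p -n` and lists the listening ports that fall outside a given range. The function name and the 46107 to 46140 default are assumptions based on the range I tried, and it assumes the seven-column layout shown in the quoted output below.</div>
<pre class="">
```python
import re

# Assumed range: the MPICH_PORT_RANGE value I tried (hypothetical default).
PORT_LOW, PORT_HIGH = 46107, 46140


def listeners_outside_range(ss_output, low=PORT_LOW, high=PORT_HIGH):
    """Return (port, process names) for each LISTEN socket outside low..high."""
    allowed = range(low, high + 1)
    outside = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: netid, state, recv-q, send-q, local, peer, users
        if len(fields) != 7 or fields[1] != "LISTEN":
            continue
        port_text = fields[4].rsplit(":", 1)[-1]
        if not port_text.isdigit():
            continue
        port = int(port_text)
        # Pull process names out of users:(("name",pid=...,fd=...),...)
        names = sorted(set(re.findall(r'\("([^"]+)",pid=', fields[6])))
        if port not in allowed:
            outside.append((port, names))
    return outside
```
</pre>
<div class="">Fed the output quoted below, this should report every port held only by blaunch and leave out 46107.</div>
<div class=""><br class="">
</div>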
<div class="">
<div><br class="">
<blockquote type="cite" class="">
<div class="">On 9 Mar 2021, at 16:21, Zhou, Hui &lt;<a href="mailto:zhouh@anl.gov" class="">zhouh@anl.gov</a>&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="WordSection1" style="page: WordSection1; caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
Hi Sendu,<o:p class=""></o:p></div>
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
<o:p class=""> </o:p></div>
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
Could you try `MPIEXEC_PORT_RANGE` instead?<o:p class=""></o:p></div>
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
<o:p class=""> </o:p></div>
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
I know it is confusing and the documentation probably needs an update/correction, but `MPICH_PORT_RANGE` is for `MPICH` rather than `MPIEXEC`.<o:p class=""></o:p></div>
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
<o:p class=""> </o:p></div>
<div class="">
<div class="">
<div class="">
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
-- <br class="">
Hui Zhou<o:p class=""></o:p></div>
</div>
</div>
</div>
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
<o:p class=""> </o:p></div>
<div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
<o:p class=""> </o:p></div>
<div style="border-style: solid none none; border-top-width: 1pt; border-top-color: rgb(181, 196, 223); padding: 3pt 0in 0in;" class="">
<p class="MsoNormal" style="margin: 0in 0in 12pt 0.5in; font-size: 11pt; font-family: Calibri, sans-serif;">
<b class=""><span style="font-size: 12pt;" class="">From:<span class="Apple-converted-space"> </span></span></b><span style="font-size: 12pt;" class="">Sendu Bala via discuss &lt;<a href="mailto:discuss@mpich.org" class="">discuss@mpich.org</a>&gt;<br class="">
<b class="">Date:<span class="Apple-converted-space"> </span></b>Tuesday, March 9, 2021 at 7:16 AM<br class="">
<b class="">To:<span class="Apple-converted-space"> </span></b><a href="mailto:discuss@mpich.org" class="">discuss@mpich.org</a> &lt;<a href="mailto:discuss@mpich.org" class="">discuss@mpich.org</a>&gt;<br class="">
<b class="">Cc:<span class="Apple-converted-space"> </span></b>Sendu Bala &lt;<a href="mailto:sb10@sanger.ac.uk" class="">sb10@sanger.ac.uk</a>&gt;<br class="">
<b class="">Subject:<span class="Apple-converted-space"> </span></b>[mpich-discuss] How to set port range used under LSF?<o:p class=""></o:p></span></p>
</div>
<div class="">
<div style="margin: 0in 0in 0in 0.5in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">
Via a bsub, I’m doing:<br class="">
<br class="">
MPICH_PORT_RANGE="46107:46140" mpiexec mpich/examples/cpi<br class="">
<br class="">
When I ssh to the controlling node, I see it has spawned a set of blaunch processes with `--control-port node-12-3-2:46107` as expected, but:<br class="">
<br class="">
ss -l -p -n | grep blaunch<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:34361            0.0.0.0:*                                                                                users:(("blaunch",pid=2823,fd=7))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:46107            0.0.0.0:*                                                                                users:(("cpi",pid=2839,fd=5),("blaunch",pid=2837,fd=5),("blaunch",pid=2836,fd=5),("blaunch",pid=2835,fd=5),("blaunch",pid=2834,fd=5),("blaunch",pid=2833,fd=5),("blaunch",pid=2832,fd=5),("blaunch",pid=2831,fd=5),("blaunch",pid=2830,fd=5),("blaunch",pid=2829,fd=5),("blaunch",pid=2828,fd=5),("blaunch",pid=2827,fd=5),("blaunch",pid=2826,fd=5),("blaunch",pid=2825,fd=5),("blaunch",pid=2824,fd=5),("blaunch",pid=2823,fd=5),("hydra_pmi_proxy",pid=2822,fd=5),("mpiexec",pid=2821,fd=5))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:43741            0.0.0.0:*                                                                                users:(("blaunch",pid=2825,fd=12))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:41983            0.0.0.0:*                                                                                users:(("blaunch",pid=2830,fd=22))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:41215            0.0.0.0:*                                                                                users:(("blaunch",pid=2832,fd=26))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:34433            0.0.0.0:*                                                                                users:(("blaunch",pid=2831,fd=24))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:33219            0.0.0.0:*                                                                                users:(("blaunch",pid=2827,fd=16))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:34405            0.0.0.0:*                                                                                users:(("blaunch",pid=2837,fd=36))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:43465            0.0.0.0:*                                                                                users:(("blaunch",pid=2836,fd=34))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:39755            0.0.0.0:*                                                                                users:(("blaunch",pid=2833,fd=28))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:38095            0.0.0.0:*                                                                                users:(("blaunch",pid=2829,fd=20))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:44625            0.0.0.0:*                                                                                users:(("blaunch",pid=2834,fd=30))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:35345            0.0.0.0:*                                                                                users:(("blaunch",pid=2835,fd=32))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:43827            0.0.0.0:*                                                                                users:(("blaunch",pid=2826,fd=14))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:40915            0.0.0.0:*                                                                                users:(("blaunch",pid=2828,fd=18))<br class="">
tcp               LISTEN              0                    128                                                                 0.0.0.0:42549            0.0.0.0:*                                                                                users:(("blaunch",pid=2824,fd=9))<br class="">
<br class="">
Why are these all listening on ports outside my range? I’ve also tried setting MPIEXEC_PORT_RANGE and MPIR_CVAR_CH3_PORT_RANGE and still have the problem.<br class="">
<br class="">
Is there any way to fully control the ports used?<br class="">
<br class="">
<br class="">
Cheers,<br class="">
Sendu.<br class="">
<br class="">
<br class="">
<br class="">
<br class="">
--<span class="Apple-converted-space"> </span><br class="">
 The Wellcome Sanger Institute is operated by Genome Research<span class="Apple-converted-space"> </span><br class="">
 Limited, a charity registered in England with number 1021457 and a<span class="Apple-converted-space"> </span><br class="">
 company registered in England with number 2742969, whose registered<span class="Apple-converted-space"> </span><br class="">
 office is 215 Euston Road, London, NW1 2BE.<br class="">
_______________________________________________<br class="">
discuss mailing list    <span class="Apple-converted-space"> </span><a href="mailto:discuss@mpich.org" style="color: blue; text-decoration: underline;" class="">discuss@mpich.org</a><br class="">
To manage subscription options or unsubscribe:<br class="">
<a href="https://lists.mpich.org/mailman/listinfo/discuss" style="color: blue; text-decoration: underline;" class="">https://lists.mpich.org/mailman/listinfo/discuss</a></div>
</div>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>




</body>
</html>