<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
Hi All</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
<div><font face="Calibri,sans-serif">We’re running MPICH on a couple of machines with a brand new UNIX distro (SL 6.5). The machines are on a vulnerable network, so rather than leave the firewalls down we would like to run MPICH through the firewall. </font></div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
We have set the MPIEXEC_PORT_RANGE and MPIR_CVAR_CH3_PORT_RANGE environment variables and
have adjusted our iptables rules accordingly, in line with the “FAQ” guidance.</div>
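<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
For illustration, the setup described above amounts to roughly the following. This is only a minimal sketch: the range 10000:10100 and the helper names port_range_env and iptables_rule are hypothetical, not our actual values, and the iptables rule is one plausible form rather than the exact rule we installed.</div>
<pre style="font-family: 'Lucida Console';">
```python
# Hypothetical range; substitute whatever range the firewall actually opens.
PORT_LOW, PORT_HIGH = 10000, 10100


def port_range_env(low, high):
    """Return both MPICH variables set to the same low:high range."""
    value = f"{low}:{high}"
    return {"MPIEXEC_PORT_RANGE": value, "MPIR_CVAR_CH3_PORT_RANGE": value}


def iptables_rule(low, high):
    """Return an iptables command accepting inbound TCP on the range."""
    return f"iptables -I INPUT -p tcp --dport {low}:{high} -j ACCEPT"


if __name__ == "__main__":
    # Print the shell lines that would be applied on each machine.
    for name, value in port_range_env(PORT_LOW, PORT_HIGH).items():
        print(f"export {name}={value}")
    print(iptables_rule(PORT_LOW, PORT_HIGH))
```
</pre>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
The point of deriving both from one range is that the environment variables and the firewall rule cannot drift apart.</div>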
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
Our passwordless SSH works fine between the machines.</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
Even so, the cpi and fpi MPICH test programs succeed only momentarily: with the firewall up they start and then crash (but of course they run happily with the firewall down).</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
An example of the basic output is below (the host file nodesshort sends one process to “this.machine” and one to the remote “that.machine”):</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;"><br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;">
<p style="margin: 0px; font-family: 'Lucida Console';">[this.machine]% mpiexec -n 2 -f nodesshort cpi.exe</p>
<p style="margin: 0px; font-family: 'Lucida Console';">Process 0 of 2 is on this.machine</p>
<p style="margin: 0px; font-family: 'Lucida Console';">Process 1 of 2 is on that.machine </p>
<p style="margin: 0px; font-family: 'Lucida Console';">Fatal error in PMPI_Reduce: A process has failed, error stack:</p>
<p style="margin: 0px; font-family: 'Lucida Console';">PMPI_Reduce(1217)...............: MPI_Reduce(sbuf=0x7fff466a94d0, rbuf=0x7fff466a94d8, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD) failed</p>
<p style="margin: 0px; font-family: 'Lucida Console';">MPIR_Reduce_impl(1029)..........: </p>
<p style="margin: 0px; font-family: 'Lucida Console';">MPIR_Reduce_intra(835)..........: </p>
<p style="margin: 0px; font-family: 'Lucida Console';">MPIR_Reduce_binomial(144).......: </p>
<p style="margin: 0px; font-family: 'Lucida Console';">MPIDI_CH3U_Recvq_FDU_or_AEP(667): Communication error with rank 1</p>
<p style="margin: 0px; font-family: 'Lucida Console'; min-height: 14px;"><br>
</p>
<p style="margin: 0px; font-family: 'Lucida Console';">===================================================================================</p>
<p style="margin: 0px; font-family: 'Lucida Console';">= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES</p>
<p style="margin: 0px; font-family: 'Lucida Console';">= EXIT CODE: 1</p>
<p style="margin: 0px; font-family: 'Lucida Console';">= CLEANING UP REMAINING PROCESSES</p>
<p style="margin: 0px; font-family: 'Lucida Console';">= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES</p>
<p style="margin: 0px; font-family: 'Lucida Console';">===================================================================================</p>
<p style="margin: 0px; font-family: 'Lucida Console';">[proxy:0:1@that.machine] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:886): assert (!closed) failed</p>
<p style="margin: 0px; font-family: 'Lucida Console';">[proxy:0:1@that.machine] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status</p>
<p style="margin: 0px; font-family: 'Lucida Console';">[proxy:0:1@that.machine] main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event</p>
<p style="margin: 0px; font-family: 'Lucida Console';">[mpiexec@this.machine] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting</p>
<p style="margin: 0px; font-family: 'Lucida Console';">[mpiexec@this.machine] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion</p>
<p style="margin: 0px; font-family: 'Lucida Console';">[mpiexec@this.machine] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:217): launcher returned error waiting for completion</p>
<p style="margin: 0px; font-family: 'Lucida Console';">[mpiexec@this.machine] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion</p>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;"><br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;"><br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;">
<div>In debug mode, the output confirms that mpiexec is at least starting with the first available port listed in MPIEXEC_PORT_RANGE.</div>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;"><br>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;">But later we get output like this:</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif;"><br>
</div>
<div style="font-size: 14px;">
<p style="font-family: 'Lucida Console'; margin: 0px;">[mpiexec@this.machine] PMI response to fd 6 pid 4: cmd=keyval_cache P0-businesscard=description#{this.machine’s.ip.address}$port#54105$ifname#{this.machine’s.ip.address}$ P1-businesscard=description#{that.machine’s.ip.address}$port#47302$ifname#{that.machine’s.ip.address}$ </p>
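<p style="margin: 0px;"><br>
</p>
<p style="margin: 0px;">One way to compare the ports in these business cards against the range opened in the firewall is a small script like the one below. It is only a minimal sketch: the helper names business_card_ports and ports_outside and the range 10000:10100 are hypothetical, and the addresses in the sample line are shortened to placeholders A and B.</p>
<pre style="font-family: 'Lucida Console';">
```python
import re


def business_card_ports(line):
    """Extract every port number that follows a 'port#' marker."""
    return [int(p) for p in re.findall(r"port#(\d+)", line)]


def ports_outside(ports, low, high):
    """Return the ports that fall outside the inclusive low..high range."""
    allowed = range(low, high + 1)
    return [p for p in ports if p not in allowed]


if __name__ == "__main__":
    # Port values copied from the debug output; A and B stand in for addresses.
    line = (
        "P0-businesscard=description#A$port#54105$ifname#A$ "
        "P1-businesscard=description#B$port#47302$ifname#B$"
    )
    ports = business_card_ports(line)
    print(ports)
    # 10000:10100 is a hypothetical range, not our real setting.
    print(ports_outside(ports, 10000, 10100))
```
</pre>
<p style="margin: 0px;">Any port reported by ports_outside is one that a firewall rule for that range would not cover.</p>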
<p style="font-family: 'Lucida Console'; margin: 0px;"><br>
</p>
<p style="margin: 0px;">Does this mean that we have missed a firewall setting, either in the environment variables or in the iptables rules themselves?</p>
<p style="margin: 0px;"><br>
</p>
<p style="margin: 0px;">Ideas?</p>
<p style="margin: 0px;"><br>
</p>
<p style="margin: 0px;"><br>
</p>
<p style="margin: 0px;">Thanks Much</p>
<p style="margin: 0px;">Bill</p>
<p style="margin: 0px;"><br>
</p>
</div>
<div style="font-size: 14px; font-family: Calibri, sans-serif; color: rgb(0, 0, 0);">
<br>
</div>
</body>
</html>