[mpich-discuss] credentials for mpiexec -f machinefile
Jan Balewski
balewski at MIT.EDU
Wed Apr 23 16:26:14 CDT 2014
On Apr 23, 2014, at 5:10 PM, "Antonio J. Peña" <apenya at mcs.anl.gov> wrote:
> No, what I meant was doing "ssh localhost", but now I look at your logs again, it may be a problem of your DNS configuration.
> mpi-966f395e-bbb1-4a20-8cbc-c10081e91244 seems to be unable to resolve the IP address of mpi-be6bebee-55e3-4901-a5bb-637395ba46f6. Take a look at your /etc/hosts files in both hosts.
>
Right, I just figured that out as well. When VMs are launched they get assigned these funny names, which do not match the DNS names.
So I have manually adjusted the content of these 3 files (I run on SL6.5):
vi /etc/hostname
vi /etc/sysconfig/network
vi /etc/hosts
so that the DNS name, public IP, and hostname --fqd all return the same info, e.g.
[cosy11 at oswrk210 mpich-3.1]$ hostname --fqd
oswrk210.lns.mit.edu
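
A minimal sketch of how this consistency could be double-checked on each VM, assuming a Linux host with getent and awk available. The helper name check_name is made up for illustration; it prints the address that the system resolver (which normally consults /etc/hosts before DNS) returns for a given name.

```shell
#!/bin/bash
# Hypothetical helper: show what the system resolver returns for a name.
# getent follows nsswitch.conf, so /etc/hosts entries are taken into account.
check_name() {
    local name="$1"
    local addr
    # The first field of the first getent line is the resolved address.
    addr=$(getent hosts "$name" | awk 'NR==1 {print $1}')
    if [ -z "$addr" ]; then
        echo "$name: does not resolve"
        return 1
    fi
    echo "$name -> $addr"
}

# Usage, to be run on every VM:
#   check_name "$(hostname --fqdn)"
#   check_name oswrk207.lns.mit.edu
```

If the names in the machinefile print the same addresses on both VMs, the /etc/hosts edits are at least consistent with each other.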
Now the command below no longer crashes but hangs for many minutes, eventually timing out.
So it looks like I need to open port 60568, right?
After I shut down the firewall on IP=210, it complained about port 45895.
Next I shut down the firewall on IP=207,
and it complained about a new port, 35167.
Finally I shut down the firewall between the VMs (on the OpenStack controller) and … IT WORKED:
[cosy11 at oswrk210 mpich-3.1]$ mpiexec -f machinefile2 -n 4 ./examples/cpi
Process 0 of 4 is on oswrk210.lns.mit.edu
Process 3 of 4 is on oswrk207.lns.mit.edu
Process 1 of 4 is on oswrk210.lns.mit.edu
Process 2 of 4 is on oswrk207.lns.mit.edu
pi is approximately 3.1415926535899028, Error is 0.0000000000001097
wall clock time = 0.003602
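
When chasing blocked ports like the ones above, a small probe can help separate firewall problems from MPI problems. The sketch below assumes bash (for its /dev/tcp redirection) and the coreutils timeout command; port_open is a hypothetical helper name. It only reports whether a TCP connection to host:port succeeds within 3 seconds, so it needs something listening on the far side (sshd on port 22, for example). The ports in the mpiexec errors changed on every attempt, so probing those particular numbers after the fact says little.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if a TCP connection to host:port can be
# opened within 3 seconds, non-zero otherwise (refused, filtered, timed out).
port_open() {
    local host="$1" port="$2"
    # bash treats /dev/tcp/HOST/PORT in a redirection as "open a TCP socket".
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage, e.g. from oswrk210 towards the other VM:
#   port_open oswrk207.lns.mit.edu 22 && echo reachable || echo blocked
```

A quick failure usually means nothing is listening or the firewall rejects the connection, while a full 3-second wait suggests the packets are being dropped silently.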
QUESTION:
Can you tell me what ports I need to have open on the master & worker VMs for mpiexec to distribute work on them?
Jan