[mpich-discuss] Help for installing mpich

Rob Latham robl at mcs.anl.gov
Tue May 6 09:04:10 CDT 2014



On 05/05/2014 04:57 PM, Kenneth Raffenetti wrote:
> Steve,
>
> It looks like the iptables firewall may be active and preventing MPICH
> from communicating between nodes. You should be able to confirm this by
> temporarily disabling the firewalls (sudo /etc/init.d/iptables stop) on
> the involved nodes.

What if Steve doesn't have the privileges to do so?  He can ssh, so there
is some ssh tunneling magic he can set up to work around the firewalls, right?
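
One check that needs no root at all: try a plain TCP connect from l18 back
to l14 on some high port and look at how it fails. As far as I know, the
stock iptables rules on RHEL 6-style systems reject with
icmp-host-prohibited, which connect() reports as "No route to host", while
an unfiltered port with nothing listening reports "Connection refused".
A minimal sketch; the probe() helper and its wording are mine, not
anything from MPICH:

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect and describe the outcome in a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are probably being silently dropped.
        return "timed out (packets probably dropped)"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same error text Hydra printed: "No route to host".
            return "no route to host (often a firewall REJECT rule)"
        if exc.errno == errno.ECONNREFUSED:
            # The packet arrived; nothing was listening on that port.
            return "refused (host reachable, nothing listening)"
        return f"failed: {exc}"


# Example, run on theoryl18 (48451 is just the port from Steve's log):
#   print(probe("theoryl14.phy.anl.gov", 48451))
```

If that prints "refused", packets are getting through and the firewall
theory is probably wrong; "no route to host" points back at iptables.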

Alas, replace 'tar' with 'ssh tunneling' in this comic and you have me:

http://www.xkcd.com/1168/

==rob
>
> If that's indeed the case, you may want to permanently disable the
> firewalls or configure them to allow a specific port range for MPICH.
> The wiki has info on how to tell MPICH which port range to use.
>
> http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_How_do_I_control_which_ports_MPICH_uses.3F
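
For the port-range route, the knob I remember from that FAQ entry is the
MPIR_CVAR_CH3_PORT_RANGE environment variable, in low:high form. Treat the
variable name as an assumption and check the FAQ for your MPICH version;
the 50000:50100 range and the env_with_port_range() helper below are made
up for illustration. A rough sketch of launching with it set:

```python
import os


def env_with_port_range(low, high, base_env=None):
    """Return a copy of the environment with an MPICH port range set."""
    if not (0 < low <= high <= 65535):
        raise ValueError(f"bad port range {low}:{high}")
    env = dict(os.environ if base_env is None else base_env)
    # Assumed variable name (from the MPICH FAQ as I recall it).
    env["MPIR_CVAR_CH3_PORT_RANGE"] = f"{low}:{high}"
    return env


# Example launch, using Steve's machine file:
#   import subprocess
#   subprocess.run(
#       ["mpiexec", "-f", "machfile_l14", "-n", "3", "hostname"],
#       env=env_with_port_range(50000, 50100),
#   )
```

Whoever administers the firewalls would then need to open that same range
between the nodes.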
>
>
> Ken
>
> On 05/05/2014 04:39 PM, Rusty Lusk wrote:
>>
>>
>> Hi, can someone help Steve with this problem?
>>
>> Steve, I am forwarding this to mpich-discuss, which is our
>> problem-reporting list.  It will probably go first to the excellent Ken
>> Raffenetti, who I predict will be able to help.
>>
>> Rusty
>>
>>
>> Begin forwarded message:
>>
>>> From: "Steven C. Pieper" <spieper at anl.gov>
>>> Subject: Help for installing mpich
>>> Date: May 5, 2014 at 4:25:13 PM CDT
>>> To: Ralph Butler <rbutler at mtsu.edu>, Rusty Lusk <lusk at mcs.anl.gov>
>>> Reply-To: <spieper at anl.gov>
>>>
>>> I'm trying to build mpich on the theory Linux cluster, which
>>> runs Scientific Linux 6.2 (basically Red Hat Enterprise Linux 6).
>>>
>>> I got mpich-3.1 from the download site, did the configure,
>>> make, and install (into ~/mpich-install).  This all
>>> seemed OK.  I made a machine file:
>>>
>>> theoryl14.phy.anl.gov:2   # 2 processes on l14
>>> theoryl18.phy.anl.gov:3   # 3 processes on l18
>>>
>>> If I am on theoryl14, I can run the hostname test on just l14:
>>>
>>> [theoryl14 mpich-test] mpiexec -f machfile_l14 -n 2 hostname
>>> theoryl14.phy.anl.gov
>>> theoryl14.phy.anl.gov
>>>
>>> Similarly, if I put the l18 line first, I can run 3 processes
>>> on l18 from l18.
>>>
>>> However I can't run from l14 to l18 or vice versa:
>>> Here I start on l14 and specify 3 processes so it has to do
>>> one on l18:
>>>
>>> [theoryl14 mpich-test] mpiexec -f machfile_l14 -n 3 hostname
>>> theoryl14.phy.anl.gov
>>> theoryl14.phy.anl.gov
>>> [proxy:0:1@theoryl18.phy.anl.gov] HYDU_sock_connect
>>> (/home/pieper/mpich-3.1/src/pm/hydra/utils/sock/sock.c:172): unable to
>>> connect from "theoryl18.phy.anl.gov" to "theoryl14.phy.anl.gov"
>>> (No route to host)
>>> [proxy:0:1@theoryl18.phy.anl.gov] main
>>> (/home/pieper/mpich-3.1/src/pm/hydra/pm/pmiserv/pmip.c:189): unable to
>>> connect to server theoryl14.phy.anl.gov at port 48451 (check for
>>> firewalls!)
>>> ---I hit return here
>>> [mpiexec@theoryl14.phy.anl.gov] HYDU_sock_write
>>> (/home/pieper/mpich-3.1/src/pm/hydra/utils/sock/sock.c:286): write
>>> error (Bad file descriptor)
>>> [mpiexec@theoryl14.phy.anl.gov] HYDU_sock_write
>>> (/home/pieper/mpich-3.1/src/pm/hydra/utils/sock/sock.c:286): write
>>> error (Bad file descriptor)
>>> ---I use ctrl-c.
>>>
>>> I can ssh without passwords:
>>>
>>> [theoryl14 mpich-test] ssh theoryl18.phy.anl.gov hostname
>>> theoryl18.phy.anl.gov
>>> [theoryl14 mpich-test]
>>>
>>> This ssh went through a tunnel that was already established.
>>>
>>> So I'm confused.  Can one of you help?
>>>
>>> BTW:  Scientific Linux will allow me to download openmpi but not mpich.
>>> I guess I'd like to try to use mpich.
>>>
>>> Thanks,
>>> Steve
>>>
>>> --
>>> Steven C. Pieper:spieper at anl.gov
>>> Argonne National Laboratory, Physics Division, Bldg. 203, Argonne, IL
>>> 60439
>>> Phone:  630-252-4232         Fax -6008
>>> Secretary, Debra Morrison, -4100
>>
>>
>>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


