[mpich-discuss] Help for installing mpich

Rusty Lusk lusk at mcs.anl.gov
Mon May 5 16:39:58 CDT 2014



Hi, can someone help Steve with this problem?

Steve, I am forwarding this to mpich-discuss, which is our problem-reporting list.  It will probably go first to the excellent Ken Raffenetti, who I predict will be able to help.

Rusty


Begin forwarded message:

> From: "Steven C. Pieper" <spieper at anl.gov>
> Subject: Help for installing mpich
> Date: May 5, 2014 at 4:25:13 PM CDT
> To: Ralph Butler <rbutler at mtsu.edu>, Rusty Lusk <lusk at mcs.anl.gov>
> Reply-To: <spieper at anl.gov>
> 
> I'm trying to build mpich on the theory Linux cluster, which
> runs Scientific Linux 6.2 (basically Red Hat Enterprise Linux 6).
> 
> I got mpich-3.1 from the download site, did the configure,
> make, and install (into ~/mpich-install).  This all
> seemed OK.  I made a machine file:
> 
> theoryl14.phy.anl.gov:2   # 2 processes on l14
> theoryl18.phy.anl.gov:3   # 3 processes on l18
> 
> If I am on theoryl14, I can run the hostname test on just l14:
> 
> [theoryl14 mpich-test] mpiexec -f machfile_l14 -n 2 hostname
> theoryl14.phy.anl.gov
> theoryl14.phy.anl.gov
> 
> Similarly, if I put the l18 line first, I can run 3 processes
> on l18 from l18.
> 
> However, I can't run from l14 to l18 or vice versa.
> Here I start on l14 and specify 3 processes, so it has to run
> one on l18:
> 
> [theoryl14 mpich-test] mpiexec -f machfile_l14 -n 3 hostname
> theoryl14.phy.anl.gov
> theoryl14.phy.anl.gov
> [proxy:0:1 at theoryl18.phy.anl.gov] HYDU_sock_connect (/home/pieper/mpich-3.1/src/pm/hydra/utils/sock/sock.c:172): unable to connect from "theoryl18.phy.anl.gov" to "theoryl14.phy.anl.gov" (No route to host)
> [proxy:0:1 at theoryl18.phy.anl.gov] main (/home/pieper/mpich-3.1/src/pm/hydra/pm/pmiserv/pmip.c:189): unable to connect to server theoryl14.phy.anl.gov at port 48451 (check for firewalls!)
> ---I hit return here
> [mpiexec at theoryl14.phy.anl.gov] HYDU_sock_write (/home/pieper/mpich-3.1/src/pm/hydra/utils/sock/sock.c:286): write error (Bad file descriptor)
> [mpiexec at theoryl14.phy.anl.gov] HYDU_sock_write (/home/pieper/mpich-3.1/src/pm/hydra/utils/sock/sock.c:286): write error (Bad file descriptor)
> ---I use ctrl-c.
> 
> I can ssh without passwords:
> 
> [theoryl14 mpich-test] ssh theoryl18.phy.anl.gov hostname
> theoryl18.phy.anl.gov
> [theoryl14 mpich-test] 
> 
> This ssh went through a tunnel that was already established.
> 
> So I'm confused.  Can one of you help?
> 
> BTW:  Scientific Linux will allow me to download openmpi but not mpich.
> I guess I'd like to try to use mpich.
> 
> Thanks,
> Steve
> 
> -- 
> Steven C. Pieper:  spieper at anl.gov 
> Argonne National Laboratory, Physics Division, Bldg. 203, Argonne, IL 60439
> Phone:  630-252-4232         Fax -6008
> Secretary, Debra Morrison, -4100
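
The "No route to host" error in the forwarded message is reported by the proxy on theoryl18 when it tries to connect back to mpiexec on theoryl14 at port 48451. A working ssh from l14 to l18 therefore does not by itself show that the reverse direction is open on arbitrary ports. Below is a minimal sketch in Python for checking that reverse path by hand, assuming Python is available on both nodes. The helper names open_listener and can_connect are made up for illustration and are not part of MPICH or Hydra.

```python
import socket


def open_listener(bind_addr="0.0.0.0"):
    """Listen on an OS-chosen ephemeral TCP port; return (socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_addr, 0))  # port 0 asks the OS to pick a free port
    srv.listen(1)
    return srv, srv.getsockname()[1]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and "No route to host".
        return False
```

One way to use it: on theoryl14, call open_listener() and note the returned port; then on theoryl18, call can_connect("theoryl14.phy.anl.gov", port) with that port. If this returns False while ssh between the nodes works, a firewall rule between the nodes is a likely cause, which matches the "check for firewalls!" hint in the error output.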

