[mpich-discuss] running parallel job issue!

Reuti reuti at staff.uni-marburg.de
Wed Oct 23 09:56:52 CDT 2013


On 23.10.2013 at 16:46, Alexandra Betouni wrote:

> Hey there, I am trying to set up a parallel environment with 14 machines running Xubuntu Linux, connected via Ethernet.
> They all have the same IPs and the same hostnames.

I assume you mean "same entries for TCP/IP address and hostnames" in /etc/hosts - if all have the same TCP/IP address it won't work.
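
A minimal sketch of how one might look for that problem in a hosts-format file, assuming each machine is listed on its own line with its own address. The function name duplicate_addresses is made up for this illustration; it prints every address that appears on more than one line:

```shell
# Hypothetical helper (not part of MPICH): list addresses that occur on
# more than one line of a hosts-format file.
duplicate_addresses() {
    # Drop comments, keep lines that have an address and at least one name,
    # then report each address that occurs more than once.
    sed 's/#.*//' "$1" | awk 'NF >= 2 { print $1 }' | sort | uniq -d
}

# Example call:
#   duplicate_addresses /etc/hosts
```

Empty output means no address is listed twice; it says nothing about whether the listed addresses match what the machines actually use.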


> Well, I started by installing mpich-3.0.4 on a single machine. I ran the cpi example on localhost with mpiexec -host localhost -n 4 ./examples/cpi and everything worked fine!
> So I continued by changing the hostnames of 2 PCs for a start and setting up ssh on those two; I also installed mpich-3.0.4 on the other machine.
> Running ssh <othermachine> date gives me the date of the other host without asking for a password,

Is it really the other host, or just the local machine accessed by a local `ssh` command?
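
One way to tell the two cases apart is to compare node names instead of dates. The following is only a sketch: it assumes the two machines already have distinct hostnames, and both the function name reaches_other_host and the SSH_CMD override are inventions for this example (the override only exists so the check can be tried with a stub):

```shell
# Hypothetical helper: succeed only if ssh to "$1" lands on a machine
# whose node name differs from the local one.
# SSH_CMD is an assumption of this sketch; it defaults to plain ssh.
reaches_other_host() {
    local_name=$(uname -n)
    # Return 2 if the remote command itself could not be run.
    remote_name=$(${SSH_CMD:-ssh} "$1" uname -n) || return 2
    echo "local: $local_name, remote: $remote_name"
    [ "$remote_name" != "$local_name" ]
}

# Example call (host2 is the name used in the machine file):
#   reaches_other_host host2 && echo "ssh really reaches another machine"
```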


> so I think I passed that step too.
> Next step was to check whether mpich-3.0.4 runs in parallel, so I created a machine file (a text file listing the hostnames of the two computers, host1 and host2) and saved it in my mpich-3.0.4 build directory. But when I try to run the cpi code in parallel with mpiexec -n 4 -f machinefile ./examples/cpi from my working directory, I get NO errors, but no parallel job either...

Do you share the /home directory, or did you transfer the MPI installation to all machines (like the ~/.ssh/authorized_keys file)?
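
Independent of the installation question, a quick way to see where the processes land is to launch a plain command such as hostname through mpiexec and count the lines per host. This is a sketch under the assumption that the machine file lists host1 and host2, one per line; count_per_host is a made-up name:

```shell
# Hypothetical helper: read one hostname per line (one line per started
# process) and print "<host> <number of processes>" for each host.
count_per_host() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Intended use (needs MPICH and the machine file from the message):
#   mpiexec -f machinefile -n 4 hostname | count_per_host
```

If only one host shows up in the result, all four processes were started on that machine.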


> All processes are still running on host1, which is my workstation.
> What am I doing wrong?

Whether it's some kind of loopback you can check with:

$ ps -e f

(f without a leading -) to see whether all processes of the MPI application are children of the local shell or were started via a local `ssh`.
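
Since the full process tree can be long, a small filter may help. The sketch below keeps only lines that mention the launcher, ssh, or the cpi example; the name mpi_related is hypothetical, and the pattern assumes the default Hydra process manager, whose proxy processes are called hydra_pmi_proxy:

```shell
# Hypothetical filter: keep only the process-tree lines that mention
# mpiexec, the Hydra proxy, ssh (this also matches sshd), or cpi.
mpi_related() {
    grep -E 'mpiexec|hydra_pmi_proxy|ssh|cpi'
}

# Intended use while the job is running:
#   ps -e f | mpi_related
```

Note that the filter drops the parent lines that do not match, so the indentation alone shows the nesting.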

-- Reuti


> Thanks
> 