[mpich-discuss] Running HPL on RPi cluster, seems like MPI is somehow not configured properly since it works with 1 node but not more

Junchao Zhang jczhang at mcs.anl.gov
Wed May 27 17:06:33 CDT 2015


Maybe it is a problem with your code and is unrelated to the MPI
implementation.
For example, you can add printfs to your code to print out the size of
MPI_COMM_WORLD. Then run your code with "mpirun -n 4 ..." on one node
without specifying a machinefile. If that works, re-run it with the
machinefile and see what happens.
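
HPL typically reports that it needs more processes when the size of
MPI_COMM_WORLD that each rank sees is smaller than the P x Q grid in
HPL.dat. A minimal sketch of such a check (the file name check_size.c
is just an example) could look like:

    #include <stdio.h>
    #include <mpi.h>

    /* Every rank reports its rank and the size of MPI_COMM_WORLD.
     * If each rank prints "of 1", the processes were started as
     * independent singletons rather than as one 4-process job. */
    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }

Compile it with the same mpicc you used to build xhpl, for example
"mpicc check_size.c -o check_size", then run it with and without the
machinefile as described above.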

--Junchao Zhang

On Wed, May 27, 2015 at 1:35 PM, Heerdt, Lanze M. <HeerdtLM1 at gcc.edu> wrote:

> I am sorry, but I am not sure how to go about doing that; can you give me
> a little more guidance on how to do it? I honestly only started working
> with mpich/python/mpi4py/HPL just before the weekend.
>
> Compiling with one and running with another might be possible now,
> because I have both openmpi and mpich installed, but when I first got
> the error I only had mpich, so that makes me skeptical.
> Also, is it possible my Make.rpi file is not correct for my system? I
> copied the lines over from the guides, but I really have no way of
> verifying them since I don't know the system very well at all.
>
> -----Original Message-----
> From: Rajeev Thakur [mailto:thakur at mcs.anl.gov]
> Sent: Wednesday, May 27, 2015 1:49 PM
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Running HPL on RPi cluster, seems like MPI is
> somehow not configured properly since it works with 1 node but not more
>
> This problem usually happens when you compile a program with the
> mpicc/mpif77 from one MPI implementation and run it with the mpiexec from
> another MPI implementation. Or if the mpif.h header file is picked up from
> some other implementation. Make sure all these are taken from the same
> implementation. Give full paths if necessary.
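>
> One quick way to see which installation each piece comes from (this
> assumes MPICH's compiler wrapper; Open MPI's equivalent is
> "mpicc --showme"):
>
>   which mpicc mpiexec
>   mpicc -show
>
> If the paths reported by "which" and the include/library paths printed
> by the wrapper point at different install prefixes, you are mixing
> implementations.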
>
> Rajeev
>
>
> On May 27, 2015, at 11:33 AM, "Heerdt, Lanze M." <HeerdtLM1 at GCC.EDU> wrote:
>
> > Yes, I am sorry, I should have added that in the original post. "-n 4
> > hostname" prints out the names of the 4 Pis in my cluster, and "... -n 16
> > helloworld.py" prints out as expected, as shown in the attached image. So I
> > know that those work correctly. It is just the strangest problem. If I ever
> > specify more than 1 process, it says "you need _ number of processes to
> > run the following tests" even though I have it set to run with said number
> > of processes.
> >
> > Thank you for getting back to me so quickly. This problem is giving me
> > grey hairs and I am only a college student.
> >
> > -Lanze
> >
> > -----Original Message-----
> > From: Kenneth Raffenetti [mailto:raffenet at mcs.anl.gov]
> > Sent: Wednesday, May 27, 2015 8:21 AM
> > To: discuss at mpich.org
> > Subject: Re: [mpich-discuss] Running HPL on RPi cluster, seems like
> > MPI is somehow not configured properly since it works with 1 node but
> > not more
> >
> > Please try a simple test to ensure your mpiexec is working correctly.
> > Something like:
> >
> >   mpiexec -machinefile ~/machinefile -n 4 hostname
> >
> > That should output the hostnames of the machines you are attempting to
> > run on.
> >
> > Second, you should ensure that the mpiexec and MPI library your program
> > is linked against come from the same distribution. A common error we see
> > is mixing MPICH's mpiexec with Open MPI's library, or vice versa.
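> >
> > One way to check this on the Pi (assuming a dynamically linked xhpl) is
> > to ask the runtime linker which MPI library the binary resolves to, and
> > compare that with the mpiexec on your PATH:
> >
> >   ldd ./xhpl | grep -i mpi
> >   which mpiexec
> >
> > If the library path and the mpiexec path come from different
> > installations (for example, one under an MPICH prefix and one under an
> > Open MPI prefix), that mismatch would explain the behavior.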
> >
> > Ken
> >
> > On 05/26/2015 03:32 PM, Heerdt, Lanze M. wrote:
> >> I realize this may be a bit off topic, but since what I am doing
> >> seems to be a pretty common thing, I am hoping to find someone who
> >> has done it before and can help. I've been at my wit's end for so
> >> long that they are calling me Mr. Whittaker.
> >>
> >> I am trying to run HPL on a Raspberry Pi cluster. I used the
> >> following guides to get to where I am now:
> >>
> >> http://www.tinkernut.com/2014/04/make-cluster-computer/
> >>
> >> http://www.tinkernut.com/2014/05/make-cluster-computer-part-2/
> >>
> >> https://www.howtoforge.com/tutorial/hpl-high-performance-linpack-benchmark-raspberry-pi/#comments
> >>
> >> and a bit of:
> >> https://www.raspberrypi.org/forums/viewtopic.php?p=301458#p301458
> >> when the above guide wasn't working
> >>
> >> Basically, when I run "mpiexec -machinefile ~/machinefile -n 1 xhpl"
> >> it works just fine, but when I run
> >> "mpiexec -machinefile ~/machinefile -n 4 xhpl" it fails with the error
> >> shown in the attached image. (If I use "mpirun ..." I get the exact
> >> same behavior.)
> >>
> >> [Note: I HAVE changed the HPL.dat to have "2    Ps" and "2    Qs" from 1
> >> and 1 for when I try to run it with 4 processes]
> >>
> >> This is for a project of mine that I need done by the end of the
> >> week, so if you see this after 5/29, thank you, but don't bother
> >> responding.
> >>
> >> I have hpl-2.1, mpi4py-1.3.1, mpich-3.1, and openmpi-1.8.5 at my
> >> disposal
> >>
> >> In the machinefile are the 4 IP addresses of my 4 RPi nodes
> >>
> >> 10.15.106.107
> >> 10.15.101.29
> >> 10.15.106.108
> >> 10.15.101.30
> >>
> >> Any other information you need I can easily get to you so please do
> >> not hesitate to ask. I have nothing else to do but try and get this
> >> to work :P
> >>
> > [Attachment: Zoop.PNG]
>
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

