[mpich-discuss] Running HPL on RPi cluster, seems like MPI is somehow not configured properly since it work with 1 node but not more

Jeff Squyres (jsquyres) jsquyres at cisco.com
Wed May 27 13:38:44 CDT 2015


This is pretty much the same conversation we're having on the OMPI list.  :-)

(e.g., http://www.open-mpi.org/community/lists/users/2015/05/26959.php)



> On May 27, 2015, at 2:35 PM, Heerdt, Lanze M. <HeerdtLM1 at GCC.EDU> wrote:
> 
> I am sorry, but I am not sure how to go about doing that. Can you give me a little more guidance on how to do it? I honestly only started working with mpich/python/mpi4py/HPL just before the weekend.
> 
> Compiling with one implementation and running with another might be possible now, because I have both openmpi and mpich installed, but when I first got the error I only had mpich, so that makes me skeptical.
> Also, is it possible that my Make.rpi file is not correct for my system? I copied the lines over from the guides, but I have no way of verifying them since I don't know the system very well at all.
> 
> -----Original Message-----
> From: Rajeev Thakur [mailto:thakur at mcs.anl.gov] 
> Sent: Wednesday, May 27, 2015 1:49 PM
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Running HPL on RPi cluster, seems like MPI is somehow not configured properly since it work with 1 node but not more
> 
> This problem usually happens when you compile a program with the mpicc/mpif77 from one MPI implementation and run it with the mpiexec from another MPI implementation, or when the mpif.h header file is picked up from some other implementation. Make sure all of these are taken from the same implementation. Give full paths if necessary.
> 
> Rajeev
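
One way to check which implementation each launcher on the PATH belongs to is sketched below in Python 3.5+. The names classify_mpi and launcher_report are made up for illustration, and the banner substrings it looks for ("HYDRA"/"MPICH" for MPICH, "Open MPI"/"OpenRTE" for Open MPI) are assumptions about what "mpiexec --version" prints, so an "unknown" result just means the output should be read by hand.

```python
import shutil
import subprocess


def classify_mpi(version_text):
    """Guess the MPI implementation from launcher version output.

    The substrings are assumptions about typical version banners,
    not an official interface of either implementation.
    """
    text = version_text.lower()
    if "open mpi" in text or "openrte" in text or "open-mpi" in text:
        return "Open MPI"
    if "hydra" in text or "mpich" in text:
        return "MPICH"
    return "unknown"


def launcher_report(names=("mpiexec", "mpirun")):
    """Map each launcher name to (full path, guessed implementation)."""
    report = {}
    for name in names:
        path = shutil.which(name)
        if path is None:
            report[name] = (None, "not found")
            continue
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        report[name] = (path, classify_mpi(result.stdout))
    return report


if __name__ == "__main__":
    for name, (path, impl) in launcher_report().items():
        print(name, path, impl)
```

Running the same script on every node, and comparing the reported paths with the directory of the mpicc that built xhpl, should show whether the compiler wrapper and the launcher come from the same installation.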
> 
> 
> On May 27, 2015, at 11:33 AM, "Heerdt, Lanze M." <HeerdtLM1 at GCC.EDU>
> wrote:
> 
>> Yes, I am sorry, I should have added that in the original post. "-n 4 hostname" prints out the names of the 4 Pis in my cluster, and "...-n 16 helloworld.py" prints what I expect, as shown in the attached image, so I know that those work correctly. It is just the strangest problem. Whenever I specify more than 1 process, it says "you need _ number of processes to run the following tests" even though I have it set to run with that number of processes.
>> 
>> Thank you for getting back to me so quickly. This problem is giving me grey hairs and I am only a college student.
>> 
>> -Lanze
>> 
>> -----Original Message-----
>> From: Kenneth Raffenetti [mailto:raffenet at mcs.anl.gov]
>> Sent: Wednesday, May 27, 2015 8:21 AM
>> To: discuss at mpich.org
>> Subject: Re: [mpich-discuss] Running HPL on RPi cluster, seems like 
>> MPI is somehow not configured properly since it work with 1 node but 
>> not more
>> 
>> Please try a simple test to ensure your mpiexec is working correctly. 
>> Something like:
>> 
>>  mpiexec -machinefile ~/machinefile -n 4 hostname
>> 
>> That should output the hostnames of the machines you are attempting to run on.
>> 
>> Second, you should ensure that the mpiexec and MPI library your program is linked against come from the same distribution. A common error we see is mixing MPICH's mpiexec with Open MPI's library, or vice versa.
>> 
>> Ken
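
The library side of that check can be sketched in the same way by looking at what the xhpl binary is dynamically linked against. This is a rough heuristic, not a definitive test: linked_mpi is a hypothetical helper, and the library names it matches (libopen-rte/libopen-pal for Open MPI, libmpich for MPICH) are assumptions about how the two are usually packaged. A statically linked xhpl will show no MPI library at all and comes back as "unknown".

```python
import subprocess
import sys


def linked_mpi(ldd_text):
    """Guess which MPI library a binary uses from `ldd` output.

    Library-name patterns are assumptions; a bare libmpi.so is
    ambiguous on its own, so it is reported as "unknown".
    """
    text = ldd_text.lower()
    if "libopen-rte" in text or "libopen-pal" in text:
        return "Open MPI"
    if "libmpich" in text:
        return "MPICH"
    return "unknown"


def inspect_binary(path):
    """Run ldd on the given binary and classify the result."""
    result = subprocess.run(
        ["ldd", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return linked_mpi(result.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python3 check_link.py ./xhpl
    print(sys.argv[1], "appears to link against:", inspect_binary(sys.argv[1]))
```

If the answer here differs from the implementation that owns the mpiexec being used, that is the mix-up described above.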
>> 
>> On 05/26/2015 03:32 PM, Heerdt, Lanze M. wrote:
>>> I realize this may be a bit off topic, but since what I am doing 
>>> seems to be a pretty common thing to do, I am hoping to find someone 
>>> who has done it before and can help, since I've been at my wits' end for so 
>>> long they are calling me Mr. Whittaker.
>>> 
>>> I am trying to run HPL on a Raspberry Pi cluster. I used the 
>>> following guides to get to where I am now:
>>> 
>>> http://www.tinkernut.com/2014/04/make-cluster-computer/
>>> 
>>> http://www.tinkernut.com/2014/05/make-cluster-computer-part-2/
>>> 
>>> https://www.howtoforge.com/tutorial/hpl-high-performance-linpack-benchmark-raspberry-pi/#comments
>>> 
>>> and a bit of:
>>> https://www.raspberrypi.org/forums/viewtopic.php?p=301458#p301458 
>>> when the above guide wasn't working
>>> 
>>> Basically, when I run "mpiexec -machinefile ~/machinefile -n 1 xhpl" 
>>> it works just fine,
>>> 
>>> but when I run "mpiexec -machinefile ~/machinefile -n 4 xhpl" it 
>>> fails with the error in the attached image. (If I use "mpirun..." I get the exact 
>>> same behavior.)
>>> 
>>> [Note: I HAVE changed the HPL.dat to have "2    Ps" and "2    Qs" from 1
>>> and 1 for when I try to run it with 4 processes]
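
For reference, the process count HPL asks for follows from the Ps and Qs lines of HPL.dat: each process grid needs P x Q processes. The sketch below is a simplified reader, not HPL's own parser. required_processes is a hypothetical helper name, and it assumes the values come before the "Ps" and "Qs" labels on their lines, as in the lines quoted above. It also ignores the "# of process grids" count and simply pairs every P with the Q in the same position.

```python
def leading_ints(line):
    """Split a line into its leading integers and the remaining label."""
    values = []
    tokens = line.split()
    for i, tok in enumerate(tokens):
        if not tok.isdigit():
            return values, " ".join(tokens[i:])
        values.append(int(tok))
    return values, ""


def required_processes(hpl_dat_text):
    """Return the largest P*Q over the grids listed in HPL.dat text."""
    ps, qs = [], []
    for line in hpl_dat_text.splitlines():
        values, label = leading_ints(line)
        if label == "Ps":
            ps = values
        elif label == "Qs":
            qs = values
    return max((p * q for p, q in zip(ps, qs)), default=0)


if __name__ == "__main__":
    sample = "2    Ps\n2    Qs\n"
    print("processes needed:", required_processes(sample))
```

With "2    Ps" and "2    Qs" the result is 4, so -n must be at least 4. If xhpl still complains about the process count when launched with -n 4, the count in HPL.dat is probably not the problem, and the number of processes each xhpl instance actually sees is worth checking.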
>>> 
>>> This is for a project of mine which I need done by the end of the 
>>> week, so if you see this after 5/29, thank you, but don't bother 
>>> responding.
>>> 
>>> I have hpl-2.1, mpi4py-1.3.1, mpich-3.1, and openmpi-1.8.5 at my 
>>> disposal
>>> 
>>> In the machinefile are the 4 IP addresses of my 4 RPi nodes:
>>> 
>>> 10.15.106.107
>>> 10.15.101.29
>>> 10.15.106.108
>>> 10.15.101.30
>>> 
>>> Any other information you need I can easily get to you so please do 
>>> not hesitate to ask. I have nothing else to do but try and get this 
>>> to work :P
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>> 
>> <Zoop.PNG>
> 


-- 
Jeff Squyres
jsquyres at cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/


