[mpich-discuss] Unexpected results

"Antonio J. Peña" apenya at mcs.anl.gov
Thu Apr 2 21:07:33 CDT 2015


That should be fine. Let's start from the basics: does it work without 
the "-f machinefile" parameter? If it does, that points to a connection 
problem between the nodes (e.g., is a firewall installed?); otherwise, 
something is wrong with the installation (e.g., multiple MPI 
implementations installed).
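
One rough way to look for the "multiple MPI implementations" case is to count how many executables named mpiexec are reachable through PATH. The sketch below is only an illustration: the function name count_on_path and its "count=N" output format are made up for this note and are not part of MPICH. A count above 1 is a hint rather than proof, since the same directory can appear on PATH twice or through a symlink.

```shell
# count_on_path NAME [PATHLIST]
# Prints every executable called NAME found in the colon-separated PATHLIST
# (default: $PATH), followed by a "count=N" summary line.
# Hypothetical helper for illustration; not an MPICH tool.
count_on_path() (
    name=$1
    pathlist=${2:-$PATH}
    n=0
    set -f      # no glob expansion while splitting the path list
    IFS=:
    for dir in $pathlist; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            n=$((n + 1))
        fi
    done
    echo "count=$n"
)
```

Running `count_on_path mpiexec` on each Pi and comparing the listed paths would show whether the mpiexec being picked up is the one built under /home/pi/mpich2.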

BTW, which MPICH version are you using?


On 04/02/2015 08:52 PM, Ken Miller wrote:
> Antonio,
>
> I can ssh without having to enter a password from:
> Pi01 to Pi02
> Pi01 to Pi03
> Pi01 to Pi04
>
> Also from each Pi back to Pi01
>
> However, I'm asked for a password from Pi04 to Pi03 and Pi02, etc.
>
> Does that matter since I am running the mpiexec command from Pi01 only?
>
> Thanks.
>
> Ken
>
> On Apr 2, 2015, at 9:29 PM, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
>
>>
>> Well, I meant to suggest actually checking whether that had the 
>> expected effect. To rephrase: can you check that you can ssh, without 
>> being asked for a password, to all nodes from all nodes, using 
>> exactly what you have in the host file (i.e., the IP addresses)?
>>
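The all-to-all check suggested above could be scripted roughly as follows. This is a sketch under assumptions, not a procedure taken from the thread: the function name check_mesh and the SSH override variable are hypothetical, and it assumes the machine running it can already reach every source host without a password (including its own address, if that is listed). BatchMode=yes and ConnectTimeout are standard OpenSSH client options; with BatchMode, ssh fails instead of prompting for a password or an unknown host key.

```shell
# check_mesh MACHINEFILE
# For every ordered pair of distinct hosts in MACHINEFILE, hop to the source
# host and attempt a non-interactive ssh to the destination.
# Set SSH to substitute another command for the outer hop (hypothetical knob).
check_mesh() (
    file=$1
    ssh_cmd=${SSH:-ssh}
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    opts="-o BatchMode=yes -o ConnectTimeout=5"
    # First field of each non-blank, non-comment line; drop any ":N" suffix.
    hosts=$(awk '!/^[ \t]*(#|$)/ { sub(/:.*/, "", $1); print $1 }' "$file")
    failed=0
    for src in $hosts; do
        for dst in $hosts; do
            [ "$src" = "$dst" ] && continue
            if $ssh_cmd $opts "$src" "ssh $opts $dst true" \
                    </dev/null >/dev/null 2>&1; then
                echo "ok   $src -> $dst"
            else
                echo "FAIL $src -> $dst"
                failed=1
            fi
        done
    done
    exit "$failed"
)
```

Called as `check_mesh machinefile` from Pi01, it prints one line per pair and returns a non-zero status if any hop would have required a password.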
>>
>> On 04/02/2015 08:20 PM, Ken Miller wrote:
>>> Antonio,
>>>
>>> Thanks for the quick response.
>>>
>>> I did do the following on each of the other three Pis:
>>>
>>> On Pi02:
>>> ssh-keygen
>>> cd .ssh
>>> cp id_rsa.pub pi02
>>> scp 192.168.2.121:/home/pi/.ssh/pi01 .
>>> cat pi01 >> authorized_keys
>>>
>>> Then I copied the public key from each of the other three Pis to 
>>> Pi01 and appended them to the authorized_keys on Pi01.
>>>
>>> Is that what you are referring to?
>>>
>>> If not, do you mind pointing me in the right direction?
>>>
>>> Thanks in advance.
>>>
>>> Ken
>>>
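The manual key exchange described in the message above (copy a public key over, then append it to authorized_keys) can be made repeatable with a small helper. The following is a minimal sketch, assuming the public keys have already been copied to the local node; the function name add_key is made up for illustration and is not an OpenSSH tool. It skips keys that are already present and tightens permissions, since sshd with its default StrictModes setting can refuse key files that other users are able to write.

```shell
# add_key PUBKEY_FILE [AUTH_FILE]
# Appends the public key to AUTH_FILE (default: ~/.ssh/authorized_keys) unless
# an identical line is already there, then tightens permissions.
# Hypothetical helper for illustration.
add_key() (
    pub=$1
    auth=${2:-$HOME/.ssh/authorized_keys}
    dir=$(dirname "$auth")
    mkdir -p "$dir" || exit 1
    touch "$auth" || exit 1
    # -x whole line, -F fixed string, -f read the pattern from the key file
    grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
    # Only the owner should be able to read or write these.
    chmod 700 "$dir"
    chmod 600 "$auth"
)
```

For a full mesh, each Pi would run something like `add_key pi01` once for the key of every other node. Where it is installed, the standard `ssh-copy-id` utility does the copy and append in one step.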
>>> On Apr 2, 2015, at 9:09 PM, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
>>>
>>>>
>>>> Hi Ken,
>>>>
>>>> Please check that you have password-less ssh connectivity between 
>>>> the nodes.
>>>>
>>>> Best,
>>>>   Antonio
>>>>
>>>>
>>>> On 04/02/2015 07:33 PM, Ken Miller wrote:
>>>>> Hello,
>>>>>
>>>>> I am hoping someone might be able to point out what I am doing 
>>>>> wrong. I have set up a Raspberry Pi cluster by going through a 
>>>>> couple of tutorials. The last step is to use MPI to test the cluster. 
>>>>> Everything was going great until I ran the following command:
>>>>>
>>>>> mpiexec -f machinefile -n 4 hostname
>>>>>
>>>>> My machinefile contains the following:
>>>>>
>>>>> pi@Pi01 ~/mpi_test $ cat machinefile
>>>>> 192.168.2.121
>>>>> 192.168.2.122
>>>>> 192.168.2.123
>>>>> 192.168.2.124
>>>>>
>>>>> But, when I run the mpiexec command, I get the following error:
>>>>>
>>>>> pi@Pi01 ~/mpi_test $ mpiexec -f machinefile -n 4 hostname
>>>>> Pi01
>>>>> [mpiexec@Pi01] control_cb (/home/pi/mpich2/mpich-3.1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:200): assert (!closed) failed
>>>>> [mpiexec@Pi01] HYDT_dmxu_poll_wait_for_event (/home/pi/mpich2/mpich-3.1/src/pm/hydra/tools/demux/demux_poll.c:76): callback returned error status
>>>>> [mpiexec@Pi01] HYD_pmci_wait_for_completion (/home/pi/mpich2/mpich-3.1/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
>>>>> [mpiexec@Pi01] main (/home/pi/mpich2/mpich-3.1/src/pm/hydra/ui/mpich/mpiexec.c:336): process manager error waiting for completion
>>>>> pi@Pi01 ~/mpi_test $
>>>>>
>>>>> Clearly, I am missing something. I would appreciate any help.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Ken
>>>>>
>>>>> P.S. I am connected to the other Pis, as shown by the following ping 
>>>>> results. I can also log into each of the other 3 Pis.
>>>>>
>>>>> ping 192.168.2.122
>>>>> PING 192.168.2.122 (192.168.2.122) 56(84) bytes of data.
>>>>> 64 bytes from 192.168.2.122: icmp_req=1 ttl=64 time=0.807 ms
>>>>> 64 bytes from 192.168.2.122: icmp_req=2 ttl=64 time=0.626 ms
>>>>> 64 bytes from 192.168.2.122: icmp_req=3 ttl=64 time=0.614 ms
>>>>> 64 bytes from 192.168.2.122: icmp_req=4 ttl=64 time=0.605 ms
>>>>> 64 bytes from 192.168.2.122: icmp_req=5 ttl=64 time=0.603 ms
>>>>> ^C
>>>>> --- 192.168.2.122 ping statistics ---
>>>>> 5 packets transmitted, 5 received, 0% packet loss, time 4002ms
>>>>> rtt min/avg/max/mdev = 0.603/0.651/0.807/0.078 ms
>>>>> pi@Pi01 ~/mpi_test $ ping 192.168.2.123
>>>>> PING 192.168.2.123 (192.168.2.123) 56(84) bytes of data.
>>>>> 64 bytes from 192.168.2.123: icmp_req=1 ttl=64 time=0.794 ms
>>>>> 64 bytes from 192.168.2.123: icmp_req=2 ttl=64 time=0.634 ms
>>>>> 64 bytes from 192.168.2.123: icmp_req=3 ttl=64 time=0.628 ms
>>>>> 64 bytes from 192.168.2.123: icmp_req=4 ttl=64 time=0.607 ms
>>>>> ^C
>>>>> --- 192.168.2.123 ping statistics ---
>>>>> 4 packets transmitted, 4 received, 0% packet loss, time 3003ms
>>>>> rtt min/avg/max/mdev = 0.607/0.665/0.794/0.081 ms
>>>>> pi@Pi01 ~/mpi_test $ ping 192.168.2.124
>>>>> PING 192.168.2.124 (192.168.2.124) 56(84) bytes of data.
>>>>> 64 bytes from 192.168.2.124: icmp_req=1 ttl=64 time=0.787 ms
>>>>> 64 bytes from 192.168.2.124: icmp_req=2 ttl=64 time=0.632 ms
>>>>> 64 bytes from 192.168.2.124: icmp_req=3 ttl=64 time=0.612 ms
>>>>> ^C
>>>>> --- 192.168.2.124 ping statistics ---
>>>>> 3 packets transmitted, 3 received, 0% packet loss, time 2002ms
>>>>> rtt min/avg/max/mdev = 0.612/0.677/0.787/0.078 ms
>>>>> pi@Pi01 ~/mpi_test $
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> discuss mailing list discuss at mpich.org
>>>>> To manage subscription options or unsubscribe:
>>>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>>
>>>>
>>>> -- 
>>>> Antonio J. Peña
>>>> Postdoctoral Appointee
>>>> Mathematics and Computer Science Division
>>>> Argonne National Laboratory
>>>> 9700 South Cass Avenue, Bldg. 240, Of. 3148
>>>> Argonne, IL 60439-4847
>>>> apenya at mcs.anl.gov
>>>> www.mcs.anl.gov/~apenya
>>>
>>
>


-- 
Antonio J. Peña
Postdoctoral Appointee
Mathematics and Computer Science Division
Argonne National Laboratory
9700 South Cass Avenue, Bldg. 240, Of. 3148
Argonne, IL 60439-4847
apenya at mcs.anl.gov
www.mcs.anl.gov/~apenya
