[mpich-discuss] Unexpected results

"Antonio J. Peña" apenya at mcs.anl.gov
Thu Apr 2 20:29:11 CDT 2015


Well, I meant to suggest that you actually check whether that had the 
expected effect. To rephrase: can you ssh from every node to every other 
node without being asked for a password, using exactly what you have in 
the host file (i.e., IP addresses)?
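
A minimal sketch of that check, run from one node, might look like the following. It assumes a POSIX shell and the OpenSSH client; `can_ssh`, `check_hosts`, and the `SSH_BIN` override are names made up for this sketch and are not part of MPICH or Hydra. `BatchMode=yes` makes ssh fail rather than prompt, so a host that would ask for a password (or for host-key confirmation) shows up as FAIL.

```shell
#!/bin/sh
# Sketch: report which hosts in a machinefile accept non-interactive ssh.
# SSH_BIN is a hypothetical override hook; it defaults to the real ssh client.
SSH_BIN="${SSH_BIN:-ssh}"

# Succeeds only if ssh can run a trivial command on host "$1" without prompting.
# BatchMode=yes disables password and host-key prompts; ConnectTimeout bounds
# the wait for unreachable hosts. stdin is redirected so that ssh does not
# swallow the rest of the machinefile when called inside the read loop.
can_ssh() {
    "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$1" true \
        </dev/null >/dev/null 2>&1
}

# Prints "OK   <host>" or "FAIL <host>" for each non-empty line of file "$1".
# Returns non-zero if at least one host failed.
check_hosts() {
    status=0
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue
        if can_ssh "$host"; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            status=1
        fi
    done < "$1"
    return "$status"
}
```

Calling `check_hosts machinefile` on Pi01 covers the connections from Pi01 only; to cover all pairs, the same check would have to be repeated on each of the other nodes.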


On 04/02/2015 08:20 PM, Ken Miller wrote:
> Antonio,
>
> Thanks for the quick response.
>
> I did do the following on each of the other three Pis:
>
> On Pi02:
> ssh-keygen
> cd .ssh
> cp id_rsa.pub pi02
> scp 192.168.2.121:/home/pi/.ssh/pi01 .
> cat pi01>>authorized_keys
>
> Then I copied the public key from each of the other three Pis to Pi01 
> and appended them to authorized_keys on Pi01.
>
> Is that what you are referring to?
>
> If not, do you mind pointing me in the right direction?
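
For reference, the key-appending step described above can be sketched as a small helper. This is an illustration only, assuming a POSIX shell; `add_key` and its arguments are hypothetical names. It appends each key from a public-key file to authorized_keys only if that exact line is not already present, and keeps the file readable by its owner alone, since sshd with its default StrictModes setting ignores key files that other users can write to.

```shell
#!/bin/sh
# Sketch: append the keys in a public-key file to authorized_keys, once each.
# Usage: add_key PUBKEY_FILE [AUTH_FILE]   (function name is illustrative)
add_key() {
    pub="$1"
    auth="${2:-$HOME/.ssh/authorized_keys}"
    dir=$(dirname "$auth")

    # Create the directory with mode 700 if it is missing; an existing
    # directory is left untouched.
    mkdir -p -m 700 "$dir" || return 1
    touch "$auth" || return 1
    chmod 600 "$auth" || return 1

    while IFS= read -r key || [ -n "$key" ]; do
        [ -n "$key" ] || continue
        # -F fixed string, -x whole-line match, -q quiet: skip known keys.
        grep -qxF -e "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
    done < "$pub"
}
```

With the file names used above, this would be called on Pi02 as `add_key pi01`. OpenSSH also ships `ssh-copy-id`, which installs a key on a remote host in one step.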
>
> Thanks in advance.
>
> Ken
>
> On Apr 2, 2015, at 9:09 PM, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
>
>>
>> Hi Ken,
>>
>> Please check that you have password-less ssh connectivity between the 
>> nodes.
>>
>> Best,
>>   Antonio
>>
>>
>> On 04/02/2015 07:33 PM, Ken Miller wrote:
>>> Hello,
>>>
>>> I am hoping someone might be able to point out what I am doing 
>>> wrong. I have set up a Raspberry Pi cluster by going through a couple 
>>> of tutorials. The last step is to use MPI to test the cluster. 
>>> Everything was going great until I ran the following command:
>>>
>>> mpiexec -f machinefile -n 4 hostname
>>>
>>> My machinefile contains the following:
>>>
>>> pi@Pi01 ~/mpi_test $ cat machinefile
>>> 192.168.2.121
>>> 192.168.2.122
>>> 192.168.2.123
>>> 192.168.2.124
>>>
>>> But, when I run the mpiexec command, I get the following error:
>>>
>>> pi@Pi01 ~/mpi_test $ mpiexec -f machinefile -n 4 hostname
>>> Pi01
>>> [mpiexec@Pi01] control_cb (/home/pi/mpich2/mpich-3.1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:200): assert (!closed) failed
>>> [mpiexec@Pi01] HYDT_dmxu_poll_wait_for_event (/home/pi/mpich2/mpich-3.1/src/pm/hydra/tools/demux/demux_poll.c:76): callback returned error status
>>> [mpiexec@Pi01] HYD_pmci_wait_for_completion (/home/pi/mpich2/mpich-3.1/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
>>> [mpiexec@Pi01] main (/home/pi/mpich2/mpich-3.1/src/pm/hydra/ui/mpich/mpiexec.c:336): process manager error waiting for completion
>>> pi@Pi01 ~/mpi_test $
>>>
>>> Clearly, I am missing something. Would appreciate any help in advance.
>>>
>>> Thanks.
>>>
>>> Ken
>>>
>>> P.S. I am connected to the other Pis, as the following ping results 
>>> show. I can also log into each of the other three Pis.
>>>
>>> ping 192.168.2.122
>>> PING 192.168.2.122 (192.168.2.122) 56(84) bytes of data.
>>> 64 bytes from 192.168.2.122: icmp_req=1 ttl=64 time=0.807 ms
>>> 64 bytes from 192.168.2.122: icmp_req=2 ttl=64 time=0.626 ms
>>> 64 bytes from 192.168.2.122: icmp_req=3 ttl=64 time=0.614 ms
>>> 64 bytes from 192.168.2.122: icmp_req=4 ttl=64 time=0.605 ms
>>> 64 bytes from 192.168.2.122: icmp_req=5 ttl=64 time=0.603 ms
>>> ^C
>>> --- 192.168.2.122 ping statistics ---
>>> 5 packets transmitted, 5 received, 0% packet loss, time 4002ms
>>> rtt min/avg/max/mdev = 0.603/0.651/0.807/0.078 ms
>>> pi@Pi01 ~/mpi_test $ ping 192.168.2.123
>>> PING 192.168.2.123 (192.168.2.123) 56(84) bytes of data.
>>> 64 bytes from 192.168.2.123: icmp_req=1 ttl=64 time=0.794 ms
>>> 64 bytes from 192.168.2.123: icmp_req=2 ttl=64 time=0.634 ms
>>> 64 bytes from 192.168.2.123: icmp_req=3 ttl=64 time=0.628 ms
>>> 64 bytes from 192.168.2.123: icmp_req=4 ttl=64 time=0.607 ms
>>> ^C
>>> --- 192.168.2.123 ping statistics ---
>>> 4 packets transmitted, 4 received, 0% packet loss, time 3003ms
>>> rtt min/avg/max/mdev = 0.607/0.665/0.794/0.081 ms
>>> pi@Pi01 ~/mpi_test $ ping 192.168.2.124
>>> PING 192.168.2.124 (192.168.2.124) 56(84) bytes of data.
>>> 64 bytes from 192.168.2.124: icmp_req=1 ttl=64 time=0.787 ms
>>> 64 bytes from 192.168.2.124: icmp_req=2 ttl=64 time=0.632 ms
>>> 64 bytes from 192.168.2.124: icmp_req=3 ttl=64 time=0.612 ms
>>> ^C
>>> --- 192.168.2.124 ping statistics ---
>>> 3 packets transmitted, 3 received, 0% packet loss, time 2002ms
>>> rtt min/avg/max/mdev = 0.612/0.677/0.787/0.078 ms
>>> pi@Pi01 ~/mpi_test $
>>>
>>>
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>>
>> -- 
>> Antonio J. Peña
>> Postdoctoral Appointee
>> Mathematics and Computer Science Division
>> Argonne National Laboratory
>> 9700 South Cass Avenue, Bldg. 240, Of. 3148
>> Argonne, IL 60439-4847
>> apenya at mcs.anl.gov
>> www.mcs.anl.gov/~apenya
>
>
>


