[mpich-discuss] mpich hangs

Syed. Jahanzeb Maqbool Hashmi jahanzeb.maqbool at gmail.com
Thu Jun 27 22:21:40 CDT 2013


My bad, I just found out that there was a duplicate entry like:
weiser1 127.0.1.1
weiser1 192.168.0.101
so I removed the 127.x.x.x entry and made the hostfile contents identical on
both nodes. Now the previous error is reduced to this one:

------ START OF OUTPUT -------

....some HPL startup string (no final result)
...skip.....

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 9
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:0@weiser1] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:886): assert (!closed) failed
[proxy:0:0@weiser1] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@weiser1] main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event
[mpiexec@weiser1] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
[mpiexec@weiser1] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec@weiser1] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:217): launcher returned error waiting for completion
[mpiexec@weiser1] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion

------ END OF OUTPUT -------
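
For anyone hitting the same duplicate-entry problem, the check described at
the top of this message can be scripted. This is only a minimal sketch: it
assumes the hosts file uses the usual "IP hostname [aliases]" layout, and the
function name find_loopback_conflicts and the sample text are hypothetical,
not part of MPICH or Hydra. It reports hostnames that are mapped to a
loopback address and to a non-loopback address at the same time.

```python
import ipaddress


def find_loopback_conflicts(hosts_text):
    """Return {hostname: [addresses]} for names mapped to both a loopback
    address (127.0.0.0/8 or ::1) and a non-loopback address."""
    seen = {}
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            # Not a plain IP address in the first column; skip the line.
            continue
        for name in fields[1:]:
            seen.setdefault(name, []).append(addr)

    conflicts = {}
    for name, addrs in seen.items():
        has_loopback = any(a.is_loopback for a in addrs)
        has_routable = any(not a.is_loopback for a in addrs)
        if has_loopback and has_routable:
            conflicts[name] = [str(a) for a in addrs]
    return conflicts


# Hypothetical sample in standard /etc/hosts order (address first).
sample = """\
127.0.0.1      localhost
127.0.1.1      weiser1
192.168.0.101  weiser1
192.168.0.102  weiser2
"""
print(find_loopback_conflicts(sample))
```

To run it against a real node, pass the contents of that node's hosts file
(for example, open("/etc/hosts").read()) instead of the sample string, and
repeat on every node in the job.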



On Fri, Jun 28, 2013 at 12:12 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:

>
> On 06/27/2013 10:08 PM, Syed. Jahanzeb Maqbool Hashmi wrote:
>
>> P4-businesscard=description#weiser2$port#57651$ifname#192.168.0.102$
>> P5-businesscard=description#weiser2$port#52622$ifname#192.168.0.102$
>> P6-businesscard=description#weiser2$port#55935$ifname#192.168.0.102$
>> P7-businesscard=description#weiser2$port#54952$ifname#192.168.0.102$
>> P0-businesscard=description#weiser1$port#41958$ifname#127.0.1.1$
>> P2-businesscard=description#weiser1$port#35049$ifname#127.0.1.1$
>> P1-businesscard=description#weiser1$port#39634$ifname#127.0.1.1$
>> P3-businesscard=description#weiser1$port#51802$ifname#127.0.1.1$
>>
>
> I have two concerns with your output.  Let's start with the first.
>
> Did you look at this question on the FAQ page?
>
> "Is your /etc/hosts file consistent across all nodes? Unless you are using
> an external DNS server, the /etc/hosts file on every machine should contain
> the correct IP information about all hosts in the system."
>
>
>  -- Pavan
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>