[mpich-discuss] Parallel test hanging with mpich on rhel7

Orion Poplawski orion at cora.nwra.com
Tue Feb 11 22:26:19 CST 2014


On 02/10/2014 09:56 PM, Balaji, Pavan wrote:
> That’s really weird.  Errno 1 is “permission denied”.  I don’t know how
> that’s happening with gethostbyname.
> 
> Can you send your mpiexec command line and a small program that reproduces
> this error?  E.g., if a program that just does MPI_INIT/MPI_FINALIZE
> reproduces this error, that’ll be best.
> 
>   — Pavan
> 

Reproduced with current nightly and an mpi hello world program:

Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(467)..............:
MPID_Init(177).....................: channel initialization failed
MPIDI_CH3_Init(70).................:
MPID_nem_init(319).................:
MPID_nem_tcp_init(171).............:
MPID_nem_tcp_get_business_card(418):
MPID_nem_tcp_init(377).............: gethostbyname failed, i-0000205b (errno 1)

For gethostbyname you need to check h_errno, not errno. An h_errno of 1
is HOST_NOT_FOUND, which is what happens on the Fedora builders: they
cannot resolve even their own hostname.
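
A rough sketch of what I mean by checking h_errno, in case it helps.
This is not the MPICH code; h_errno_name and resolve_or_report are
made-up helper names for illustration, and I am assuming the glibc
<netdb.h> definitions:

```c
#include <assert.h>
#include <netdb.h>   /* gethostbyname, h_errno, HOST_NOT_FOUND, ... */
#include <stdio.h>

/* Hypothetical helper: map an h_errno value to its symbolic name. */
const char *h_errno_name(int err)
{
    switch (err) {
    case HOST_NOT_FOUND: return "HOST_NOT_FOUND";
    case TRY_AGAIN:      return "TRY_AGAIN";
    case NO_RECOVERY:    return "NO_RECOVERY";
    case NO_DATA:        return "NO_DATA";
    default:             return "unknown h_errno value";
    }
}

/*
 * Hypothetical helper: look up hostname and report failures using
 * h_errno. gethostbyname does not document errno as its error channel,
 * so errno is deliberately not consulted here.
 * Returns 0 on success, otherwise the h_errno value.
 */
int resolve_or_report(const char *hostname)
{
    assert(hostname != NULL);

    struct hostent *he = gethostbyname(hostname);
    if (he == NULL) {
        int err = h_errno;   /* save it before any other resolver call */
        fprintf(stderr, "gethostbyname failed, %s (h_errno %d: %s)\n",
                hostname, err, h_errno_name(err));
        return err;
    }
    return 0;
}
```

Calling resolve_or_report with the machine's own hostname (from
gethostname, for example) on one of the builders should then report
HOST_NOT_FOUND instead of a bare number that looks like an errno.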

-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  orion at cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com


