[mpich-devel] Hydra fails to launch hello world on 1 proc

Jeff Hammond jhammond at alcf.anl.gov
Wed Apr 10 18:34:54 CDT 2013


I'm using the latest Git trunk build of MPICH with GCC and am unable
to run a 'hello, world' program using mpiexec.

Any clues what the problem is?  I have not seen this problem before,
but this is a newly refreshed laptop.  The firewall is active but I
would not have expected Hydra to need to go through the firewall to
launch a serial job.

If there's something wrong with my setup, it would be nice if Hydra
would issue a warning/error instead of hanging.



I compiled MPICH like this:
../configure CC=gcc CXX=g++ FC=gfortran F77=gfortran --enable-threads \
  --enable-f77 --enable-fc --enable-g --with-pm=hydra --enable-rpath \
  --disable-static --enable-shared --with-device=ch3:nemesis

jeff@goldstone:~/eclipse/OSPRI/mcs.svn/trunk/tests/devices/mpi-pt> mpicc -show
gcc -I/home/jeff/eclipse/MPICH/git/install-gcc/include
-L/home/jeff/eclipse/MPICH/git/install-gcc/lib64 -Wl,-rpath
-Wl,/home/jeff/eclipse/MPICH/git/install-gcc/lib64 -lmpich -lopa -lmpl
-lrt -lpthread

jeff@goldstone:~/eclipse/OSPRI/mcs.svn/trunk/tests/devices/mpi-pt> make
mpicc -g -O0 -Wall -std=gnu99 -DDEBUG -c hello.c -o hello.o
mpicc -g -O0 -Wall -std=gnu99 safemalloc.o hello.o -lm -o hello.x
rm hello.o

jeff@goldstone:~/eclipse/OSPRI/mcs.svn/trunk/tests/devices/mpi-pt> mpiexec -n 1 ./hello.x
^C[mpiexec@goldstone.mcs.anl.gov] Sending Ctrl-C to processes as requested
[mpiexec@goldstone.mcs.anl.gov] Press Ctrl-C again to force abort
[mpiexec@goldstone.mcs.anl.gov] HYDU_sock_write (../../../../src/pm/hydra/utils/sock/sock.c:291): write error (Bad file descriptor)
[mpiexec@goldstone.mcs.anl.gov] HYD_pmcd_pmiserv_send_signal (../../../../src/pm/hydra/pm/pmiserv/pmiserv_cb.c:170): unable to write data to proxy
[mpiexec@goldstone.mcs.anl.gov] ui_cmd_cb (../../../../src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:79): unable to send signal downstream
[mpiexec@goldstone.mcs.anl.gov] HYDT_dmxu_poll_wait_for_event (../../../../src/pm/hydra/tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@goldstone.mcs.anl.gov] HYD_pmci_wait_for_completion (../../../../src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
[mpiexec@goldstone.mcs.anl.gov] main (../../../../src/pm/hydra/ui/mpich/mpiexec.c:331): process manager error waiting for completion

jeff@goldstone:~/eclipse/OSPRI/mcs.svn/trunk/tests/devices/mpi-pt> ./hello.x
<no errors>

jeff@goldstone:~/eclipse/OSPRI/mcs.svn/trunk/tests/devices/mpi-pt> cat hello.c
#include <stdio.h>
#include <stdlib.h>

#include <mpi.h>

int main(int argc, char * argv[])
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided!=MPI_THREAD_MULTIPLE)
        MPI_Abort(MPI_COMM_WORLD, 1);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    return 0;
}
