[mpich-devel] --program-suffix misbehaving

Wesley Bland wbland at mcs.anl.gov
Wed Jul 10 14:21:21 CDT 2013


I'm seeing some bad behavior when configuring with --program-suffix (unless I'm misunderstanding how this is supposed to work). With this option, mpich no longer generates the mpiexec symlinks (or what should probably be mpiexec-[suffix]). Instead, it skips straight to mpiexec.hydra-[suffix]. This is a bit confusing, but not a big deal.

The bigger problem is that it doesn't correctly call hydra_pmi_proxy to launch the jobs. The proxy executable is also installed with the suffix, as hydra_pmi_proxy-[suffix], but mpiexec still looks for the unsuffixed name, which results in this nastiness:

wbland at bb20:tmp$ mpiexec.hydra-branch -n 2 hostname
[mpiexec at bb20] HYDU_create_process (/home/wbland/Repositories/mpich/src/pm/hydra/utils/launch/launch.c:75): execvp error on file /home/wbland/tools/mpich-branch/bin/hydra_pmi_proxy (No such file or directory)
[mpiexec at bb20] HYD_pmcd_pmiserv_proxy_init_cb (/home/wbland/Repositories/mpich/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:505): assert (!closed) failed
[mpiexec at bb20] HYDT_dmxu_poll_wait_for_event (/home/wbland/Repositories/mpich/src/pm/hydra/tools/demux/demux_poll.c:76): callback returned error status
[mpiexec at bb20] HYD_pmci_wait_for_completion (/home/wbland/Repositories/mpich/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
[mpiexec at bb20] main (/home/wbland/Repositories/mpich/src/pm/hydra/ui/mpich/mpiexec.c:331): process manager error waiting for completion
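
Until the lookup itself is sorted out, one possible stopgap is to give the suffixed binaries unsuffixed symlinks in the install's bin directory, so that the path in the execvp error above exists. The following is only a sketch: link_unsuffixed is a hypothetical helper, not part of mpich, and it assumes the only mismatch is the missing suffix in the name being looked up.

```shell
# Hypothetical helper (not part of mpich): for every file in BINDIR whose
# name ends in SUFFIX, create a symlink with the suffix stripped, unless
# something with the unsuffixed name is already there.
link_unsuffixed() {
    bindir=$1
    suffix=$2
    for path in "$bindir"/*"$suffix"; do
        # Skip the literal pattern when nothing matches.
        [ -e "$path" ] || continue
        file=${path##*/}            # e.g. hydra_pmi_proxy-branch
        plain=${file%"$suffix"}     # e.g. hydra_pmi_proxy
        [ -n "$plain" ] || continue
        # Never overwrite an existing file or symlink.
        if [ ! -e "$bindir/$plain" ] && [ ! -L "$bindir/$plain" ]; then
            ln -s "$file" "$bindir/$plain"
        fi
    done
}
```

For the install shown above this would be called as link_unsuffixed /home/wbland/tools/mpich-branch/bin -branch. Note that it also creates an unsuffixed mpiexec.hydra link, which defeats the purpose of the suffix if several builds share one prefix.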

Before I file a ticket, I want to make sure that this isn't expected behavior somehow (especially the first part).

Thanks,
Wesley

