[mpich-discuss] mpiexec error

Mccall, Kurt E. (MSFC-EV41) kurt.e.mccall at nasa.gov
Fri May 6 11:55:28 CDT 2022


Running MPICH 4.0.1 under Torque 5.1, I'm getting the mpiexec error "user specified host not in the PBS allocated list". My qsub command is:

qsub -V -j oe -e stdio -o stdio -f -X -l nodes=21:ppn=20  <bash_script>


My mpiexec command is:

mpiexec -print-all-exitcodes -enable-x -np 21  -wdir ${work_dir} -env DISPLAY localhost:10.0 --ppn 1  <more args> ...


Here is the full error message. Thanks for any help.

[mpiexec at n022.cluster.com] find_pbs_node_id (../../../../mpich-4.0.1/src/pm/hydra/tools/bootstrap/external/pbs_launch.c:27): user specified host not in the PBS allocated list
[mpiexec at n022.cluster.com] HYDT_bscd_pbs_launch_procs (../../../../mpich-4.0.1/src/pm/hydra/tools/bootstrap/external/pbs_launch.c:74): error finding PBS node ID for host n022
[mpiexec at n022.cluster.com] HYDT_bsci_launch_procs (../../../../mpich-4.0.1/src/pm/hydra/tools/bootstrap/src/bsci_launch.c:17): launcher returned error while launching processes
[mpiexec at n022.cluster.com] fn_spawn (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmiserv_pmi_v1.c:580): launcher cannot launch processes
[mpiexec at n022.cluster.com] handle_pmi_cmd (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:48): PMI handler returned error
[mpiexec at n022.cluster.com] control_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:284): unable to process PMI command
[mpiexec at n022.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demux/demux_poll.c:76): callback returned error status
[mpiexec at n022.cluster.com] HYD_pmci_wait_for_completion (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:160): error waiting for event
[mpiexec at n022.cluster.com] main (../../../../mpich-4.0.1/src/pm/hydra/ui/mpich/mpiexec.c:325): process manager error waiting for completion
[proxy:0:0 at n022.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
[proxy:0:0 at n022.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demux/[proxy:0:2 at n020.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
[proxy:0:2 at n020.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demux/[proxy:0:5 at n016.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
[proxy:0:5 at n016.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demux/[proxy:0:15 at n006.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
[proxy:0:15 at n006.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demu[proxy:0:16 at n005.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
[proxy:0:16 at n005.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demu[proxy:0:19 at n002.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
[proxy:0:19 at n002.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demu[proxy:0:20 at n001.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
[proxy:0:20 at n001.cluster.com] HYDT_dmxu_poll_wait_for_event (../../../../mpich-4.0.1/src/pm/hydra/tools/demudemux_poll.c:76): callback returned error status
[proxy:0:0 at n022.cluster.com] main (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip.c:169): demux engine error waiting for event
demux_poll.c:76): callback returned error status
[proxy:0:2 at n020.cluster.com] main (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip.c:169): demux engine error waiting for event
[proxy:0:1 at n021.cluster.com] HYD_pmcd_pmip_control_cmd_cb (../../../../mpich-4.0.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:899): assert (!closed) failed
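
One thing that may be worth checking is whether the host name in the error (n022) appears in the Torque node list in exactly that form, or only as a fully qualified name such as n022.cluster.com. The sketch below is a diagnostic only. It assumes the node list is in the file named by $PBS_NODEFILE (the usual Torque convention), and the function name match_in_nodefile is hypothetical, not part of MPICH or Torque. Whether a short-name versus FQDN mismatch is actually the cause of this error is an assumption, not something confirmed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of MPICH or Torque): report how a host name
# appears in a node file. Prints one of:
#   exact            the name appears verbatim on a line of its own
#   domain-mismatch  the name matches only after domain suffixes are stripped
#   missing          the name does not appear in either form
match_in_nodefile() {
    host=$1
    nodefile=$2
    short=${host%%.*}    # strip everything from the first dot onward

    if grep -qxF -- "$host" "$nodefile"; then
        echo exact
    elif sed 's/\..*//' "$nodefile" | grep -qxF -- "$short"; then
        echo domain-mismatch
    else
        echo missing
    fi
}

# Inside a Torque job, compare both forms of the local host name against
# the allocated list. Skipped when PBS_NODEFILE is not set.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "short name: $(match_in_nodefile "$(hostname -s)" "$PBS_NODEFILE")"
    echo "full name:  $(match_in_nodefile "$(hostname -f)" "$PBS_NODEFILE")"
fi
```

If the short name reports domain-mismatch while the full name reports exact (or the reverse), the host name handed to the launcher and the names in the allocation are spelled differently, which would be a candidate explanation to rule in or out.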

