[mpich-discuss] weird behavior with mpiexec (3.0.4)
Jeff Hammond
jhammond at alcf.anl.gov
Wed May 29 16:02:00 CDT 2013
> This is weird. If I chdir to /home/edscott/tmp and execute the following
> command I get:
>
> edscott@tauro ~/tmp $ mpiexec -n 2 -hosts tauro,velascoj /bin/hostname
> tauro
> velascoj
>
> But if I chdir to /home/edscott/temp and do the same I get:
>
> edscott@tauro ~/temp $ mpiexec -n 2 -hosts tauro,velascoj /bin/hostname
> tauro
> [mpiexec@tauro] control_cb (./pm/pmiserv/pmiserv_cb.c:202): assert (!closed) failed
> [mpiexec@tauro] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> [mpiexec@tauro] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
> [mpiexec@tauro] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion
>
>
> Wouldn't a message such as "`pwd` directory does not exist on node velascoj"
> be more illustrative?
Yes. However, the set of improper uses of MPI that could generate
helpful error messages is uncountable. Do you think it is a good
use of finite developer effort to implement an infinitesimal fraction
of such warnings? There has to be a minimum requirement placed upon
the user. I personally think that it should include running in a
directory that actually exists on every node.
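
For anyone who wants to check this before launching, a pre-flight test
along the following lines may help. It is only a sketch, not part of
MPICH: the function name missing_workdir_hosts and the CHECK_CMD variable
are hypothetical, and it assumes non-interactive ssh access to every host
and a directory path without spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of MPICH): print each host on which
# the given directory does not exist.
# Usage: missing_workdir_hosts DIR HOST [HOST...]
# CHECK_CMD is an assumption of this sketch; it defaults to ssh and
# can be overridden with any command that accepts "HOST test -d DIR".
missing_workdir_hosts() {
    dir=$1
    shift
    for host in "$@"; do
        # `test -d` exits non-zero when DIR is missing on that host.
        # stdin is redirected so the remote command cannot consume input.
        if ! ${CHECK_CMD:-ssh} "$host" test -d "$dir" </dev/null; then
            echo "$host"
        fi
    done
}
```

Running `missing_workdir_hosts "$PWD" tauro velascoj` from the launch
directory prints the hosts that lack it. An empty result means the
directory exists everywhere, so mpiexec should at least be able to
start the processes there.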
Jeff
--
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
ALCF docs: http://www.alcf.anl.gov/user-guides