[mpich-discuss] Increasing MPI ranks

Jeffrey Larson jmlarson at anl.gov
Wed Mar 12 17:24:39 CDT 2014


I am not calling the cpi.py script directly. The master is spawning those
processes. So I call

$ mpiexec -n 30 python master.py

Then each of the 30 ranks should spawn a cpi.py process. But with the
attached master.py and cpi.py (directly from the mpi4py tutorial), you can
see the errors I get:

[jlarson at mintthinkpad tutorial_example]$ mpiexec -n 30 python master.py
[mpiexec at mintthinkpad] control_cb (pm/pmiserv/pmiserv_cb.c:200): assert
(!closed) failed
[mpiexec at mintthinkpad] HYDT_dmxu_poll_wait_for_event
(tools/demux/demux_poll.c:76): callback returned error status
[mpiexec at mintthinkpad] HYD_pmci_wait_for_completion
(pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
[mpiexec at mintthinkpad] main (ui/mpich/mpiexec.c:336): process manager error
waiting for completion

As was previously stated, this appears to be an mpi4py problem and not an
MPICH question.

Since you are curious about the application: the motivating example is
numerical optimization of the output of an expensive simulation. I do not
have access to the simulation code, so my master tells the workers where to
evaluate the expensive simulation, and the simulation may itself depend
heavily on MPI.

But I welcome your input on a design paradigm that avoids the "sharp edges".

Thank you again,
Jeff


On Wed, Mar 12, 2014 at 5:15 PM, Jed Brown <jed at jedbrown.org> wrote:

> Jeffrey Larson <jmlarson at anl.gov> writes:
>
> > I am trying to have a single master, with a group of workers who are
> > themselves calculating function values. The evaluation of the function
> may
> > itself involve spawning MPI tasks.
>
> How are you running the cpi test then?  I ran it with many spawned
> processes
>
>   MPI.COMM_SELF.Spawn(cmd, None, 30)
>
> and with one spawned process on each of many masters.  Neither crashed
> with current MPICH or Open MPI.  What exactly is needed to reproduce the
> failure you see?
>
>
> I am curious why you want all this process spawning (there are lots of
> sharp edges to deploying this approach), but mpi4py should work.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cpi.py
Type: text/x-python
Size: 409 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20140312/4bdf5934/attachment.py>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: master.py
Type: text/x-python
Size: 327 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20140312/4bdf5934/attachment-0001.py>

