[mpich-discuss] Increasing MPI ranks

Lu, Huiwei huiweilu at mcs.anl.gov
Tue Mar 11 16:41:48 CDT 2014


A quick question before we dig into the script: why are you using different versions of MPICH2 and mpiexec?
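
If it helps, here is a rough sketch for putting the two versions side by side. It assumes mpi4py is importable, that the mpiexec on your PATH is Hydra, and that `mpiexec --version` prints a "Version:" line. The helper names (hydra_version, same_release, report) are made up for illustration and are not part of MPICH or mpi4py.

```python
import re
import subprocess


def hydra_version(mpiexec_output):
    """Pull the 'Version:' field out of `mpiexec --version` text as a tuple of ints."""
    # Assumption: the launcher prints a line such as "Version:   3.1".
    match = re.search(r"Version:\s*(\d+(?:\.\d+)*)", mpiexec_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def same_release(launcher, library):
    """Compare two version tuples on the leading components they share."""
    if not launcher or not library:
        return False
    n = min(len(launcher), len(library))
    return launcher[:n] == library[:n]


def report():
    """Print the MPI library mpi4py was built against next to the launcher version."""
    # Imported here so the helpers above remain usable without mpi4py installed.
    from mpi4py import MPI

    name, library = MPI.get_vendor()  # e.g. a vendor name and a version tuple
    result = subprocess.run(
        ["mpiexec", "--version"], capture_output=True, text=True, check=False
    )
    launcher = hydra_version(result.stdout)
    print("library :", name, library)
    print("mpiexec :", launcher)
    print("match   :", same_release(launcher, library))
```

Calling report() from a plain `python` session (no mpiexec needed) should be enough to show whether the launcher and the library come from the same release.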

--
Huiwei

On Mar 11, 2014, at 4:37 PM, Jeffrey Larson <jmlarson at anl.gov> wrote:

> Thank you for your response. Attached are the script and the simple function that it calls. 
> 
> The command:
> $ mpiexec -n 3 python script.py 
> works great, but with -n 30 it crashes.
> 
> I am using MPICH2 version 1.4.1 and mpiexec version 3.1.
> 
> 
> 
> On Tue, Mar 11, 2014 at 4:31 PM, Lu, Huiwei <huiweilu at mcs.anl.gov> wrote:
> Hi, Jeffrey,
> 
> Can you send us a minimum example of your script.py?
> Simple python scripts seem to work fine for me on both Mac and Ubuntu machines.
> 
> Also, what version of MPICH are you using?
> --
> Huiwei Lu
> Postdoc Appointee
> Mathematics and Computer Science Division, Argonne National Laboratory
> http://www.mcs.anl.gov/~huiweilu/
> 
> On Mar 11, 2014, at 4:08 PM, Jeffrey Larson <jmlarson at anl.gov> wrote:
> 
> > Hello,
> >
> > I am using mpi4py to call MPI from Python, and when I increase the number of ranks at the command line to something like:
> >
> > [jlarson at mintthinkpad test]$ mpiexec -n 30 python script.py
> >
> > I receive the following error messages:
> >
> > [mpiexec at mintthinkpad] handle_pmi_cmd (pm/pmiserv/pmiserv_cb.c:52): Unrecognized PMI command:  | cleaning up processes
> > [mpiexec at mintthinkpad] control_cb (pm/pmiserv/pmiserv_cb.c:280): unable to process PMI command
> > [mpiexec at mintthinkpad] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
> > [mpiexec at mintthinkpad] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
> > [mpiexec at mintthinkpad] main (ui/mpich/mpiexec.c:336): process manager error waiting for completion
> >
> > When I search online for help with these error messages, I am only pointed towards the repository containing the files that are themselves throwing the error. For example: Googling for "handle_pmi_cmd Unrecognized PMI command" only returns:
> >
> > https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.3a1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c
> >
> >
> > Note that my script is using very few resources; each rank is spawning a simple external function. The function takes a vector and then sums all the entries, squares them, and returns the number to each rank. The script works as expected when the number of initial ranks is small.
> >
> > Thank you for your help.
> > Jeffrey Larson
> > _______________________________________________
> > discuss mailing list     discuss at mpich.org
> > To manage subscription options or unsubscribe:
> > https://lists.mpich.org/mailman/listinfo/discuss
> 
> <function.py><script.py>
