[mpich-discuss] Increasing MPI ranks
Jeffrey Larson
jmlarson at anl.gov
Tue Mar 11 16:47:50 CDT 2014
I updated mpiexec only last week, because an early version of my script was
pausing for a considerable length of time after spawning 8, 16, 24, ...
processes. I sent that earlier code to someone else, who did not observe
the pause and who had mpiexec version 3.1 installed. So, to work around that
earlier problem, I installed the latest mpiexec. I mentioned the MPICH2
version only because I saw that there is an "mpich2version" command; it
reports version 1.4.1.
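
Since two different version numbers are reported here (mpiexec 3.1 and
MPICH2 1.4.1), a small check like the following may help show which
installation each command on the PATH belongs to, and which MPI library
mpi4py reports. This is only a sketch: the helper name describe_tool is
made up for illustration, and the script assumes that "mpiexec --version"
prints build details, as Hydra's mpiexec does.

```python
import shutil
import subprocess


def describe_tool(name, args=()):
    """Return (path, output) for a command, or (None, None) if it is not on PATH.

    describe_tool is a hypothetical helper for this sketch, not part of MPICH.
    """
    path = shutil.which(name)
    if path is None:
        return None, None
    try:
        result = subprocess.run(
            [path] + list(args),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            errors="replace",
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The command exists but could not be run to completion.
        return path, None
    return path, result.stdout.strip()


if __name__ == "__main__":
    # "mpich2version" is the command named in the message above.
    for name, args in (("mpiexec", ("--version",)), ("mpich2version", ())):
        path, output = describe_tool(name, args)
        print("%s -> %s" % (name, path))
        if output:
            print(output)

    try:
        from mpi4py import MPI  # only available if mpi4py is installed
    except ImportError:
        print("mpi4py is not importable from this interpreter")
    else:
        print("mpi4py vendor:", MPI.get_vendor())
        print("MPI standard version:", MPI.Get_version())
```

If the path printed for mpiexec and the library reported by mpi4py come
from different installations, that would at least be worth mentioning when
reporting the problem.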
On Tue, Mar 11, 2014 at 4:37 PM, Jeffrey Larson <jmlarson at anl.gov> wrote:
> Thank you for your response. Attached are the script and the simple
> function that it calls.
>
> The command:
> $ mpiexec -n 3 python script.py
> works fine, but the same command with -n 30 crashes.
>
> I am using MPICH2 version 1.4.1 and mpiexec version 3.1.
>
>
>
> On Tue, Mar 11, 2014 at 4:31 PM, Lu, Huiwei <huiweilu at mcs.anl.gov> wrote:
>
>> Hi, Jeffrey,
>>
>> Can you send us a minimal example of your script.py?
>> Simple python scripts seem to work fine for me on both Mac and Ubuntu
>> machines.
>>
>> Also, what version of MPICH are you using?
>> --
>> Huiwei Lu
>> Postdoc Appointee
>> Mathematics and Computer Science Division, Argonne National Laboratory
>> http://www.mcs.anl.gov/~huiweilu/
>>
>> On Mar 11, 2014, at 4:08 PM, Jeffrey Larson <jmlarson at anl.gov> wrote:
>>
>> > Hello,
>> >
>> > I am using mpi4py to call MPI from Python, and when I increase the
>> > number of ranks at the command line to something like:
>> >
>> > [jlarson at mintthinkpad test]$ mpiexec -n 30 python script.py
>> >
>> > I receive the following error messages:
>> >
>> > [mpiexec@mintthinkpad] handle_pmi_cmd (pm/pmiserv/pmiserv_cb.c:52): Unrecognized PMI command: | cleaning up processes
>> > [mpiexec@mintthinkpad] control_cb (pm/pmiserv/pmiserv_cb.c:280): unable to process PMI command
>> > [mpiexec@mintthinkpad] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
>> > [mpiexec@mintthinkpad] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
>> > [mpiexec@mintthinkpad] main (ui/mpich/mpiexec.c:336): process manager error waiting for completion
>> >
>> > When I search online for help with these error messages, I am only
>> > pointed towards the repository containing the files that are themselves
>> > throwing the error. For example, Googling for "handle_pmi_cmd Unrecognized
>> > PMI command" only returns:
>> >
>> > https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.3a1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c
>> >
>> > Note that my script is using very few resources; each rank is spawning
>> > a simple external function. The function takes a vector and then sums all
>> > the entries, squares them, and returns the number to each rank. The script
>> > works as expected when the number of initial ranks is small.
>> >
>> > Thank you for your help.
>> > Jeffrey Larson
>> > _______________________________________________
>> > discuss mailing list discuss at mpich.org
>> > To manage subscription options or unsubscribe:
>> > https://lists.mpich.org/mailman/listinfo/discuss
>>
>
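
The attached script is not reproduced in the archive. For readers who want
something to experiment with, here is a minimal sketch of the kind of
per-rank computation described in the first message of the thread. It is
not the original script. Two assumptions are made: "sums all the entries,
squares them" is read as the square of the sum, and the function is called
in-process rather than spawned. The names sum_then_square and rank_and_size
and the input vector are hypothetical.

```python
def sum_then_square(vec):
    """Sum the entries of vec and square the total (assumed reading)."""
    total = sum(vec)
    return total * total


def rank_and_size():
    """Return (rank, size) from mpi4py, or (0, 1) if mpi4py is not installed."""
    try:
        from mpi4py import MPI
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


if __name__ == "__main__":
    rank, size = rank_and_size()
    # Hypothetical per-rank input; the real script's data is not shown.
    vec = [float(rank + i) for i in range(4)]
    print("rank %d of %d: %s" % (rank, size, sum_then_square(vec)))
```

Saved as script.py, it can be launched the same way as in the thread, for
example with "mpiexec -n 3 python script.py".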