[mpich-discuss] Increasing MPI ranks

Jeffrey Larson jmlarson at anl.gov
Tue Mar 11 20:40:41 CDT 2014


Thank you both for your time. I'll respond to each question below:

Huiwei Lu, the outputs from the two commands are below:

[jlarson at mintthinkpad test]$ mpiexec --version
HYDRA build details:
    Version:                                 3.1
    Release Date:                            Thu Feb 20 11:41:13 CST 2014
    CC:                              gcc
    CXX:                             g++
    F77:                             gfortran
    F90:                             gfortran
    Configure options:                       '--disable-option-checking'
'--prefix=/home/jlarson/software/mpich-install' '--cache-file=/dev/null'
'--srcdir=.' 'CC=gcc' 'CFLAGS= -O2' 'LDFLAGS= ' 'LIBS=-lrt -lpthread '
'CPPFLAGS= -I/home/jlarson/software/mpich-3.1/src/mpl/include
-I/home/jlarson/software/mpich-3.1/src/mpl/include
-I/home/jlarson/software/mpich-3.1/src/openpa/src
-I/home/jlarson/software/mpich-3.1/src/openpa/src
-I/home/jlarson/software/mpich-3.1/src/mpi/romio/include'
    Process Manager:                         pmi
    Launchers available:                     ssh rsh fork slurm ll lsf sge manual persist
    Topology libraries available:            hwloc
    Resource management kernels available:   user slurm ll lsf sge pbs cobalt
    Checkpointing libraries available:       blcr
    Demux engines available:                 poll select
[jlarson at mintthinkpad test]$ mpicc -v
mpicc for MPICH version 3.1
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.6 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--enable-gnu-unique-object --enable-plugin --enable-objc-gc
--disable-werror --with-arch-32=i686 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

Jed Brown,

I don't know how else to have the function communicate with the worker that
spawned it. If I replace the lines:
----
comm_to_function_wrapper = MPI.Comm.Get_parent()
rank = comm_to_function_wrapper.Get_rank()
----
with:
----
comm_to_function_wrapper = MPI.COMM_WORLD.Get_parent()
rank = comm_to_function_wrapper.Get_rank()
----
then I still get valid output for 3 ranks, but not for 30. Is there a valid way
for a spawned task to communicate with the worker that spawned it? I was
using the example from the mpi4py manual:
http://mpi4py.scipy.org/docs/mpi4py.pdf
(See the lines directly below "Worker (or child, or server) side:" in
Section 4.3: Dynamic Process Management)

Thank you again!
Jeff

On Tue, Mar 11, 2014 at 5:52 PM, Jed Brown <jed at jedbrown.org> wrote:

> Jeffrey Larson <jmlarson at anl.gov> writes:
>
> > Thank you for your response. Attached are the script and the simple
> > function that it calls.
> >
> > The command:
> > $ mpiexec -n 3 python script.py
> > works great, but 30 crashes.
>
> I don't think this is valid:
>
> |  comm_to_function_wrapper = MPI.Comm.Get_parent()
> |  rank = comm_to_function_wrapper.Get_rank()
>
> $ make intercomm CC=/opt/mpich/bin/mpicc && /opt/mpich/bin/mpiexec -n 2
> ./intercomm
> /opt/mpich/bin/mpicc     intercomm.c   -o intercomm
> Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
> PMPI_Comm_rank(108): MPI_Comm_rank(MPI_COMM_NULL, rank=0x7fff9fe6a118)
> failed
> PMPI_Comm_rank(66).: Null communicator
> Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
> PMPI_Comm_rank(108): MPI_Comm_rank(MPI_COMM_NULL, rank=0x7fff5e484d48)
> failed
> PMPI_Comm_rank(66).: Null communicator
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 8474 RUNNING AT batura
> =   EXIT CODE: 1
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
> $ /opt/mpich/bin/mpichversion
> MPICH Version:          3.1
> MPICH Release date:     Thu Feb 20 11:41:13 CST 2014
> MPICH Device:           ch3:nemesis
> MPICH configure:        --prefix=/opt/mpich --enable-shared
> --enable-sharedlibs=gcc --enable-error-checking=runtime
> --enable-error-messages=all --enable-timer-type=clock_gettime
> --with-python=python2
> MPICH CC:       gcc  -march=x86-64 -mtune=generic -O2 -pipe
> -fstack-protector --param=ssp-buffer-size=4  -O2
> MPICH CXX:      g++  -march=x86-64 -mtune=generic -O2 -pipe
> -fstack-protector --param=ssp-buffer-size=4
> MPICH F77:      gfortran   -O2
> MPICH FC:       gfortran
>
>
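
Following up on Jed's point: MPI.Comm.Get_parent() returns MPI.COMM_NULL in
a process that was started directly by mpiexec rather than by Spawn, which
is exactly the "Null communicator" failure in the error stack above. A
minimal guard, sketched in mpi4py (the print messages are placeholders):
----
from mpi4py import MPI

comm_to_parent = MPI.Comm.Get_parent()
if comm_to_parent == MPI.COMM_NULL:
    # Launched directly by mpiexec, not by Spawn: there is no parent
    # intercommunicator, so calling Get_rank() on it fails (this is the
    # abort shown in the error stack above).
    print("no parent communicator; running standalone")
else:
    rank = comm_to_parent.Get_rank()  # rank within the local (child) group
    print("spawned worker, local rank %d" % rank)
    comm_to_parent.Disconnect()
----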