[mpich-discuss] Code works with -ppn, fails without using MPICH 3.2
Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
matthew.thompson at nasa.gov
Tue Sep 5 09:34:51 CDT 2017
All,
I've been evaluating different MPI stacks on our cluster and found that
MPICH 3.2 does really well on some simple little benchmarks. It also
runs Hello World just fine, so I decided to apply it to our climate
model (GEOS).
However, the first time I did that, things went a bit nuts. Essentially:
> (1065) $ mpirun -np 96 ./GEOSgcm.x |& tee withoutppn.log
> srun.slurm: cluster configuration lacks support for cpu binding
> Fatal error in PMPI_Comm_create: Unknown error class, error stack:
> PMPI_Comm_create(564).................: MPI_Comm_create(MPI_COMM_WORLD, group=0x88000000, new_comm=0x106d1740) failed
> PMPI_Comm_create(541).................:
> MPIR_Comm_create_intra(215)...........:
> MPIR_Get_contextid_sparse_group(500)..:
> MPIR_Allreduce_impl(764)..............:
> MPIR_Allreduce_intra(257).............:
> allreduce_intra_or_coll_fn(163).......:
> MPIR_Allreduce_intra(417).............:
> MPIDU_Complete_posted_with_error(1137): Process failed
> MPIR_Allreduce_intra(417).............:
> MPIDU_Complete_posted_with_error(1137): Process failed
> MPIR_Allreduce_intra(268).............:
> MPIR_Bcast_impl(1452).................:
> MPIR_Bcast(1476)......................:
> MPIR_Bcast_intra(1287)................:
> MPIR_Bcast_binomial(310)..............: Failure during collective
(NOTE: The srun.slurm message is just a warning we always see; it shows up
with MPT, Open MPI, MVAPICH2, and Intel MPI as well.)
The odd thing is that the model runs just fine at an (NX-by-NY) layout of 1x6
or 2x12, but once I go to 3x18, boom, collapse. Since I am on 28-core nodes,
my first thought was that the failure came from crossing node boundaries, but
the benchmarks I ran earlier did just fine on 192 nodes, so that doesn't seem
to be it.
Out of desperation, I finally wondered whether the problem was that 28 doesn't
divide 96 evenly, so I passed in -ppn:
> (1068) $ mpirun -ppn 12 -np 96 ./GEOSgcm.x |& tee withppn.log
> srun.slurm: cluster configuration lacks support for cpu binding
>
> In MAPL_Shmem:
> NumCores per Node = 12
> NumNodes in use = 8
> Total PEs = 96
> ...
Starts up just fine! Note that every other MPI stack I've tried (MPT, Intel
MPI, MVAPICH2, and Open MPI) handles the non-ppn style job without trouble,
though it's possible they are evenly distributing the processes on their own.
The "MAPL_Shmem" lines above just report what the process layout looks like.
I've added some print statements, including this:
   if (present(CommIn)) then
      CommCap = CommIn
   else
      CommCap = MPI_COMM_WORLD
   end if

   if (.not. present(CommIn)) then
      call mpi_init(status)
      VERIFY_(STATUS)
   end if
   write (*,*) "MPI Initialized."
So, boring: CommIn is *not* present, so we are using MPI_COMM_WORLD, and
mpi_init is called as one would expect. Now if I run:
mpirun -np 96 ./GEOSgcm.x | grep 'MPI Init' | wc -l
to count how many processes report initializing, and repeat it multiple times,
I get results like 40, 56, 56, 45, 68. Never 96, and never consistent.
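For what it's worth, that check is just a quick-and-dirty loop along these
lines (a rough sketch; the grep string matches the debug write shown above):

for i in 1 2 3 4 5; do
    # Count how many ranks print the "MPI Initialized." debug line.
    # With -np 96 I'd expect 96 every time; instead the count varies.
    mpirun -np 96 ./GEOSgcm.x | grep -c 'MPI Initialized'
done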
So, I'm a bit at a loss. I freely admit I might have built MPICH 3.2
incorrectly; this was essentially my first time building it. I configured with:
> ./configure --prefix=$SWDEV/MPI/mpich/3.2/intel_17.0.4.196 \
> --disable-wrapper-rpath CC=icc CXX=icpc FC=ifort F77=ifort \
> --enable-fortran=all --enable-cxx |& tee configure.intel_17.0.4.196.log
which might be too vanilla for a SLURM/InfiniBand cluster, and yet it works
with -ppn. Maybe I need extra options for it to work in all cases?
--with-ibverbs? --with-slurm?
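For instance, I was wondering whether something along these lines is closer to
what a SLURM/InfiniBand cluster needs. This is only a sketch based on my
reading of the MPICH documentation; the mxm netmod choice, the SLURM PMI
options, and the $MXM_DIR/$SLURM_DIR paths are all guesses on my part:

# Guess at a fuller configure line for a Mellanox InfiniBand cluster,
# using the ch3:nemesis:mxm netmod. $MXM_DIR is a placeholder for wherever
# the MXM library lives on our system.
./configure --prefix=$SWDEV/MPI/mpich/3.2/intel_17.0.4.196 \
    --disable-wrapper-rpath CC=icc CXX=icpc FC=ifort F77=ifort \
    --enable-fortran=all --enable-cxx \
    --with-device=ch3:nemesis:mxm --with-mxm=$MXM_DIR \
    |& tee configure.intel_17.0.4.196.log

# Alternative guess: build against SLURM's PMI and launch with srun instead
# of Hydra's mpirun ($SLURM_DIR is again a placeholder):
#   ./configure ... --with-pmi=slurm --with-pm=none --with-slurm=$SLURM_DIR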
Any ideas on what's happening and what I might have done wrong?
Thanks,
Matt
--
Matt Thompson, SSAI, Sr Scientific Programmer/Analyst
NASA GSFC, Global Modeling and Assimilation Office
Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
Phone: 301-614-6712 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson