[mpich-discuss] first attempts with neighbor collectives

Rajeev Thakur thakur at mcs.anl.gov
Fri Nov 22 23:07:42 CST 2013


In MPI_Neighbor_alltoallw, sdispls and rdispls need to be declared as integer(kind=MPI_ADDRESS_KIND) rather than default integer.

The call to MPI_Comm_free(cart) needs an ierr argument, as in MPI_Comm_free(cart, ierr).

In Fortran you should use MPI_INTEGER instead of MPI_INT as the datatype.
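
The three points above can be put together in a minimal sketch. This is not the attached reproducer: apart from sdispls, rdispls, cart and ierr, every name here (the program name, nnbr, counts, types, sendbuf, recvbuf) is a hypothetical chosen for illustration, and the 2D periodic grid with one integer exchanged per neighbor is an assumption.

```fortran
program neighb_alltoallw_sketch
  use mpi
  implicit none
  integer, parameter :: ndims = 2, nnbr = 2*ndims   ! 4 neighbors on a 2D grid
  integer :: ierr, cart, nprocs, rank, i
  integer :: dims(ndims)
  logical :: periods(ndims)
  integer :: sendbuf(nnbr), recvbuf(nnbr)
  integer :: counts(nnbr), types(nnbr)
  ! Displacements for the "w" variant are byte offsets of address kind,
  ! not default integers.
  integer(kind=MPI_ADDRESS_KIND) :: sdispls(nnbr), rdispls(nnbr)
  integer(kind=MPI_ADDRESS_KIND) :: lb, extent

  call MPI_Init(ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Assumed topology: periodic 2D Cartesian grid over all ranks.
  dims = 0
  periods = .true.
  call MPI_Dims_create(nprocs, ndims, dims, ierr)
  call MPI_Cart_create(MPI_COMM_WORLD, ndims, dims, periods, .false., &
                       cart, ierr)
  call MPI_Comm_rank(cart, rank, ierr)

  ! MPI_INTEGER is the Fortran datatype handle; MPI_INT is the C one.
  call MPI_Type_get_extent(MPI_INTEGER, lb, extent, ierr)
  do i = 1, nnbr
    counts(i)  = 1
    types(i)   = MPI_INTEGER
    sdispls(i) = (i - 1) * extent   ! byte offset into sendbuf
    rdispls(i) = (i - 1) * extent   ! byte offset into recvbuf
  end do

  sendbuf = rank
  recvbuf = -1
  call MPI_Neighbor_alltoallw(sendbuf, counts, sdispls, types, &
                              recvbuf, counts, rdispls, types, &
                              cart, ierr)

  if (rank == 0) print *, 'values received by rank 0:', recvbuf

  ! Every Fortran MPI call takes the trailing ierr argument.
  call MPI_Comm_free(cart, ierr)
  call MPI_Finalize(ierr)
end program neighb_alltoallw_sketch
```

Passing default-integer displacements hands the library an array of 4-byte values where it reads 8-byte ones, which would be consistent with the memcpy fault in the trace below.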

Rajeev


On Nov 22, 2013, at 2:20 PM, "Kokron, Daniel S. (GSFC-610.1)[Computer Sciences Corporation]" <daniel.s.kokron at nasa.gov> wrote:

> I've started playing with the neighbor collectives in mpich-3.0.4.  My first attempt was to convert the existing ~/test/mpi/topo/neighb_coll.c from 1D to 2D.  That went fine.  Now I want to convert that 2D C code to Fortran.  Everything works except the call to MPI_Neighbor_alltoallw, which fails with a SEGV.  Any ideas what I'm doing wrong?
> 
> My 3.0.4 was configured and compiled with the Intel 13.1.3.192 compiler suite under Linux kernel 2.6.32.54 x86_64.
> ./configure CC=icc CXX=icpc FC=ifort F77=ifort --prefix=~/install/intel-13.1.3.192 --enable-f77 --enable-fc --enable-g=all --enable-debuginfo --enable-shared
> 
> The attached reproducer was compiled with the same suite
> mpif90 -g -O0 -traceback -debug -check -o neighb_coll2Df neighb_coll2Df.f90
> 
> and run with
> mpirun -np 12 neighb_coll2Df
> 
> 
> valgrind-3.8.1 had this to say
> ==55631== Conditional jump or move depends on uninitialised value(s)
> ==55631==    at 0x62A9F42: vfprintf (in /lib64/libc-2.11.1.so)
> ==55631==    by 0x62CD288: vsprintf (in /lib64/libc-2.11.1.so)
> ==55631==    by 0x62B2BD7: sprintf (in /lib64/libc-2.11.1.so)
> ==55631==    by 0x47288B: stackwalk_cb (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x473B94: tbk_trace_stack (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x4725E5: tbk_string_stack_signal (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x42ADE1: tbk_stack_trace (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x40B89A: for__issue_diagnostic (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x40F034: for__signal_handler (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x5BFD5CF: ??? (in /lib64/libpthread-2.11.1.so)
> ==55631==    by 0x4C2B01F: _intel_fast_memcpy (mc_replace_strmem.c:889)
> ==55631==    by 0x5138DD5: MPIUI_Memcpy (mpiimpl.h:162)
> ==55631== 
> ==55631== Warning: bad signal number 0 in sigaction()
> ==55631== Conditional jump or move depends on uninitialised value(s)
> ==55631==    at 0x47263A: tbk_string_stack_signal (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x42ADE1: tbk_stack_trace (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x40B89A: for__issue_diagnostic (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x40F034: for__signal_handler (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631==    by 0x5BFD5CF: ??? (in /lib64/libpthread-2.11.1.so)
> ==55631==    by 0x4C2B01F: _intel_fast_memcpy (mc_replace_strmem.c:889)
> ==55631==    by 0x5138DD5: MPIUI_Memcpy (mpiimpl.h:162)
> ==55631==    by 0x5139960: MPID_nem_mpich_sendv_header (mpid_nem_inline.h:307)
> ==55631==    by 0x5137847: MPIDI_CH3_iSendv (ch3_isendv.c:74)
> ==55631==    by 0x5119825: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:550)
> ==55631==    by 0x5123A47: MPID_Isend (mpid_isend.c:131)
> ==55631==    by 0x518A974: MPID_Sched_start (mpid_sched.c:155)
> 
> Daniel Kokron
> NASA Ames (ARC-TN)
> SciCon group
> 301-286-3959
> <neighb_coll2Df.f90>



