[mpich-discuss] first attempts with neighbor collectives
Kokron, Daniel S. (GSFC-610.1)[Computer Sciences Corporation]
daniel.s.kokron at nasa.gov
Sat Nov 23 12:03:17 CST 2013
Declaring sdispls and rdispls as integer(kind=MPI_ADDRESS_KIND) resolves the failure, and the output from MPI_Neighbor_alltoallw now agrees with the other neighbor routines.
Thank you
Daniel Kokron
NASA Ames (ARC-TN)
SciCon group
301-286-3959
________________________________________
From: discuss-bounces at mpich.org [discuss-bounces at mpich.org] on behalf of Rajeev Thakur [thakur at mcs.anl.gov]
Sent: Saturday, November 23, 2013 12:07 AM
To: discuss at mpich.org
Subject: Re: [mpich-discuss] first attempts with neighbor collectives
In alltoallw, sdispls and rdispls need to be defined as integer (kind=MPI_ADDRESS_KIND).
The call to MPI_Comm_free(cart) needs an ierr argument.
In Fortran you should use MPI_INTEGER instead of MPI_INT as the datatype.
Rajeev
On Nov 22, 2013, at 2:20 PM, "Kokron, Daniel S. (GSFC-610.1)[Computer Sciences Corporation]" <daniel.s.kokron at nasa.gov> wrote:
> I've started playing with the neighbor collectives in mpich-3.0.4. My first attempt was to convert the existing ~/test/mpi/topo/neighb_coll.c from 1D to 2D. That went fine. Now I want to convert that 2D C code to Fortran. Everything works except the call to MPI_Neighbor_alltoallw, which fails with a SEGV. Any ideas what I'm doing wrong?
>
> My 3.0.4 was configured and compiled with the Intel 13.1.3.192 compiler suite under Linux kernel 2.6.32.54 x86_64.
> ./configure CC=icc CXX=icpc FC=ifort F77=ifort --prefix=~/install/intel-13.1.3.192 --enable-f77 --enable-fc --enable-g=all --enable-debuginfo --enable-shared
>
> The attached reproducer was compiled with the same suite
> mpif90 -g -O0 -traceback -debug -check -o neighb_coll2Df neighb_coll2Df.f90
>
> and run with
> mpirun -np 12 neighb_coll2Df
>
>
> valgrind-3.8.1 had this to say
> ==55631== Conditional jump or move depends on uninitialised value(s)
> ==55631== at 0x62A9F42: vfprintf (in /lib64/libc-2.11.1.so)
> ==55631== by 0x62CD288: vsprintf (in /lib64/libc-2.11.1.so)
> ==55631== by 0x62B2BD7: sprintf (in /lib64/libc-2.11.1.so)
> ==55631== by 0x47288B: stackwalk_cb (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x473B94: tbk_trace_stack (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x4725E5: tbk_string_stack_signal (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x42ADE1: tbk_stack_trace (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x40B89A: for__issue_diagnostic (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x40F034: for__signal_handler (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x5BFD5CF: ??? (in /lib64/libpthread-2.11.1.so)
> ==55631== by 0x4C2B01F: _intel_fast_memcpy (mc_replace_strmem.c:889)
> ==55631== by 0x5138DD5: MPIUI_Memcpy (mpiimpl.h:162)
> ==55631==
> ==55631== Warning: bad signal number 0 in sigaction()
> ==55631== Conditional jump or move depends on uninitialised value(s)
> ==55631== at 0x47263A: tbk_string_stack_signal (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x42ADE1: tbk_stack_trace (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x40B89A: for__issue_diagnostic (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x40F034: for__signal_handler (in /home1/dkokron/play/MPICH3/mpich-3.0.4/test/mpi/topo/neighb_coll2Df)
> ==55631== by 0x5BFD5CF: ??? (in /lib64/libpthread-2.11.1.so)
> ==55631== by 0x4C2B01F: _intel_fast_memcpy (mc_replace_strmem.c:889)
> ==55631== by 0x5138DD5: MPIUI_Memcpy (mpiimpl.h:162)
> ==55631== by 0x5139960: MPID_nem_mpich_sendv_header (mpid_nem_inline.h:307)
> ==55631== by 0x5137847: MPIDI_CH3_iSendv (ch3_isendv.c:74)
> ==55631== by 0x5119825: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:550)
> ==55631== by 0x5123A47: MPID_Isend (mpid_isend.c:131)
> ==55631== by 0x518A974: MPID_Sched_start (mpid_sched.c:155)
>
> Daniel Kokron
> NASA Ames (ARC-TN)
> SciCon group
> 301-286-3959
> <neighb_coll2Df.f90>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss