[mpich-discuss] double free error after MPI run finished
Antonio J. Peña
apenya at mcs.anl.gov
Tue Nov 19 09:44:24 CST 2013
Thank you very much for reporting. We're glad you have confirmed the fix.
Best,
Antonio
On Tuesday, November 19, 2013 09:36:37 AM John Bray wrote:
> I can confirm that 3.1rc2 fixes my problem.
>
> Thanks for your help.
>
> John
>
> On 9 November 2013 08:19, John Bray <jbray at allinea.com> wrote:
> > Happy to validate any fix when it's ready, not only with this test case
> > but with our full DDT test suite.
> >
> > John
> >
> > On 8 November 2013 19:44, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
> >> Hi John,
> >>
> >> You seem to be describing a known bug:
> >>
> >> https://trac.mpich.org/projects/mpich/ticket/1932
> >>
> >> We've addressed it and the fix is waiting for review before going to the
> >> master branch. It would be nice if you could validate it once it's in
> >> master. May I contact you then to see if that addresses your problem?
> >>
> >> Antonio
> >>
> >> On Friday, November 08, 2013 05:43:29 PM John Bray wrote:
> >>> I've got some sample code that I'm using to test our debugger, DDT. It
> >>> works with Open MPI and Intel MPI, but with every MPICH release from
> >>> MPICH2 onwards to 3.1rc1 I get an error after MPI_FINALIZE. Our
> >>> memory debugging library claims it's at
> >>>
> >>> SendqFreePool (dbginit.c:386) trying to free previously freed pointer
> >>>
> >>> with a stack trace of
> >>>
> >>> auditmpi
> >>> pmpi_finalize__
> >>> PMPI__finalize
> >>> MPIR_Call_finalize_callbacks
> >>> sendqFreePool (dbginit.c)
> >>>
> >>> When run from a raw mpirun command outside DDT, I get:
> >>>
> >>> *** glibc detected *** ./auditmpibad_f_debug.exe: double free or corruption (fasttop): 0x1032a5c8 ***
> >>> *** glibc detected *** ./auditmpibad_f_debug.exe: double free or corruption (fasttop): 0x104da5c8 ***
> >>> ======= Backtrace: =========
> >>> ======= Backtrace: =========
> >>> /lib/libc.so.6(+0xfe750e4)[0xf78e50e4]
> >>> /home/jbray/prog/mpich/mpich-3.1rc1/rhel-6-ppc64_ibm/install/lib/libmpich.so.11(+0x1a8b1c)[0xfec8b1c]
> >>> /lib/libc.so.6(+0xfe750e4)[0xf76c50e4]
> >>> /home/jbray/prog/mpich/mpich-3.1rc1/rhel-6-ppc64_ibm/install/lib/libmpich.so.11(+0x1a8b1c)[0xfec8b1c]
> >>> ... and so on
> >>>
> >>> I've configured with
> >>>
> >>> ./configure --prefix=$PWD/install --enable-shared --enable-fast=all
> >>> --enable-debuginfo
> >>>
> >>> and compile with
> >>>
> >>> mpif90 -o auditmpibad_f_debug.exe auditmpibad.F90 -O0 -g
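> >>>
> >>> and run with something along the lines of (the exact mpirun command
> >>> isn't given here, so the two-rank count is only an assumption; any even
> >>> number of ranks pairs up the odd and even ranks):
> >>>
> >>> mpirun -np 2 ./auditmpibad_f_debug.exe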
> >>>
> >>> It works if the ALLTOALLV is called only once.
> >>>
> >>> Is this a bug, or a misunderstanding on my part? The code is:
> >>>
> >>> program auditmpi
> >>>
> >>>   use mpi, only : &
> >>>     MPI_COMM_WORLD, &
> >>>     MPI_REAL, &
> >>>     MPI_STATUS_SIZE, &
> >>>     MPI_BSEND_OVERHEAD, &
> >>>     MPI_SUM, &
> >>>     MPI_UNDEFINED
> >>>
> >>>   implicit none
> >>>
> >>>   integer, parameter :: repeats = 2
> >>>
> >>>   integer :: rank, nproc, ierr, rpt
> >>>   real    :: input(100000), output(100000)
> >>>   integer :: status(MPI_STATUS_SIZE)
> >>>   integer :: statuses(MPI_STATUS_SIZE,repeats)
> >>>   integer :: request
> >>>   logical :: odd
> >>>   integer, allocatable :: recvcounts(:), sendcounts(:), displs(:)
> >>>
> >>>   call MPI_INIT(ierr)
> >>>
> >>>   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
> >>>   call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
> >>>
> >>>   if (rank == 0) print *, "auditmpi start"
> >>>
> >>>   ! even ranks receive, odd ranks send
> >>>   if (mod(rank,2) == 0) then
> >>>     odd = .false.
> >>>   else
> >>>     odd = .true.
> >>>   end if
> >>>
> >>>   allocate (recvcounts(nproc))
> >>>   recvcounts(:) = 100000/nproc
> >>>
> >>>   allocate (sendcounts(nproc))
> >>>   sendcounts(:) = 100000/nproc
> >>>
> >>>   allocate (displs(nproc))
> >>>   displs(:) = 0   ! don't care if we overlap
> >>>
> >>>   if (.not. odd) then
> >>>     call MPI_IRECV(output, size(output), MPI_REAL, rank+1, 1, &
> >>>                    MPI_COMM_WORLD, request, ierr)
> >>>   end if
> >>>   call MPI_BARRIER(MPI_COMM_WORLD, ierr)
> >>>   if (odd) then
> >>>     ! ready send: the barrier guarantees the matching receive is posted
> >>>     call MPI_IRSEND(input, size(input), MPI_REAL, rank-1, 1, &
> >>>                     MPI_COMM_WORLD, request, ierr)
> >>>   end if
> >>>
> >>>   call MPI_WAIT(request, status, ierr)
> >>>
> >>>   do rpt = 1, 2
> >>>     call MPI_ALLTOALLV(input, sendcounts, displs, MPI_REAL, &
> >>>                        output, recvcounts, displs, MPI_REAL, &
> >>>                        MPI_COMM_WORLD, ierr)
> >>>   end do
> >>>
> >>>   call MPI_BARRIER(MPI_COMM_WORLD, ierr)
> >>>
> >>>   deallocate (recvcounts)
> >>>   deallocate (sendcounts)
> >>>   deallocate (displs)
> >>>
> >>>   if (rank == 0) print *, "auditmpi finished"
> >>>
> >>>   call MPI_FINALIZE(ierr)
> >>>
> >>> end program auditmpi
> >>>
> >>> John
>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss