[mpich-discuss] double free error after MPI run finished

Antonio J. Peña apenya at mcs.anl.gov
Tue Nov 19 09:44:24 CST 2013


Thank you very much for reporting back. We're glad you've confirmed the fix.

Best,
  Antonio


On Tuesday, November 19, 2013 09:36:37 AM John Bray wrote:
> I can confirm that 3.1rc2 fixes my problem
> 
> Thanks for your help.
> 
> John
> 
> On 9 November 2013 08:19, John Bray <jbray at allinea.com> wrote:
> > > Happy to validate any fix when it's ready, not only with this test case
> > > but also with our full DDT test suite.
> > 
> > John
> > 
> > On 8 November 2013 19:44, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
> >> Hi John,
> >> 
> >> You seem to describe a known bug:
> >> 
> >> https://trac.mpich.org/projects/mpich/ticket/1932
> >> 
> >> We've addressed it and the fix is waiting for review before going to the
> >> master branch. It would be nice if you could validate it once it's in
> >> master. May I contact you then to see if that addresses your problem?
> >> 
> >>   Antonio
> >> 
> >> On Friday, November 08, 2013 05:43:29 PM John Bray wrote:
> >>> I've got some sample code that I'm using to test our debugger DDT. It
> >>> works on openmpi and intelmpi, but with all the MPICH series from
> >>> MPICH2 onwards to 3.1rc1, I get an error after MPI_FINALIZE. Our
> >>> memory debugging library claims it's at
> >>> 
> >>> SendqFreePool (dbginit.c:386) trying to free a previously freed pointer
> >>> 
> >>> with a stack trace of
> >>> auditmpi
> >>>   pmpi_finalize__
> >>>     PMPI__finalize
> >>>       MPIR_Call_finalize_callbacks
> >>>         sendqFreePool (dbginit.c)
> >>> 
> >>> Run from a raw mpirun command outside DDT, I get
> >>> 
> >>> *** glibc detected *** ./auditmpibad_f_debug.exe: double free or corruption (fasttop): 0x1032a5c8 ***
> >>> *** glibc detected *** ./auditmpibad_f_debug.exe: double free or corruption (fasttop): 0x104da5c8 ***
> >>> ======= Backtrace: =========
> >>> ======= Backtrace: =========
> >>> /lib/libc.so.6(+0xfe750e4)[0xf78e50e4]
> >>> /home/jbray/prog/mpich/mpich-3.1rc1/rhel-6-ppc64_ibm/install/lib/libmpich.so.11(+0x1a8b1c)[0xfec8b1c]
> >>> /lib/libc.so.6(+0xfe750e4)[0xf76c50e4]
> >>> /home/jbray/prog/mpich/mpich-3.1rc1/rhel-6-ppc64_ibm/install/lib/libmpich.so.11(+0x1a8b1c)[0xfec8b1c]
> >>> ... and so on
> >>> 
> >>> I've configured with
> >>> 
> >>> ./configure --prefix=$PWD/install --enable-shared --enable-fast=all --enable-debuginfo
> >>> 
> >>> and compile with
> >>> 
> >>> mpif90 -o auditmpibad_f_debug.exe auditmpibad.F90 -O0 -g
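> >>> 
> >>> and run with an even number of ranks (each even rank posts a receive
> >>> from rank+1); the exact launch line isn't critical, e.g. something like
> >>> 
> >>> mpirun -np 2 ./auditmpibad_f_debug.exe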
> >>> 
> >>> It works with the ALLTOALLV called only once.
> >>> 
> >>> Is this a bug or a misunderstanding on my part? The code is:
> >>> 
> >>> program auditmpi
> >>> 
> >>>   use mpi, only :       &
> >>>     MPI_COMM_WORLD,     &
> >>>     MPI_REAL,           &
> >>>     MPI_STATUS_SIZE,    &
> >>>     MPI_BSEND_OVERHEAD, &
> >>>     MPI_SUM,            &
> >>>     MPI_UNDEFINED
> >>> 
> >>>   implicit none
> >>> 
> >>>   integer, parameter :: repeats=2
> >>> 
> >>>   integer :: rank, nproc, ierr, rpt
> >>>   real    :: input(100000), output(100000)
> >>>   integer :: status(MPI_STATUS_SIZE)
> >>>   integer :: statuses(MPI_STATUS_SIZE,repeats)
> >>>   integer :: request
> >>>   logical :: odd
> >>>   integer, allocatable :: recvcounts(:), sendcounts(:), displs(:)
> >>> 
> >>>   call MPI_INIT(ierr)
> >>> 
> >>>   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
> >>>   call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
> >>> 
> >>>   if (rank == 0) print *,"auditmpi start"
> >>> 
> >>>   if (mod(rank,2) == 0) then
> >>>     odd=.false.
> >>>   else
> >>>     odd=.true.
> >>>   end if
> >>> 
> >>>   allocate (recvcounts(nproc))
> >>>   recvcounts(:) = 100000/nproc
> >>> 
> >>>   allocate (sendcounts(nproc))
> >>>   sendcounts(:) = 100000/nproc
> >>> 
> >>>   allocate (displs(nproc))
> >>>   displs(:)=0 ! don't care if we overlap
> >>> 
> >>>   if (.NOT. odd) then
> >>>     call MPI_IRECV(output,size(output),MPI_REAL,rank+1,1,MPI_COMM_WORLD, &
> >>>                    request,ierr)
> >>>   end if
> >>>   call MPI_BARRIER(MPI_COMM_WORLD,ierr)
> >>>   if (odd) then
> >>>     call MPI_IRSEND(input,size(input),MPI_REAL,rank-1,1,MPI_COMM_WORLD, &
> >>>                     request,ierr)
> >>>   end if
> >>> 
> >>>   call MPI_WAIT(request,status,ierr)
> >>> 
> >>>   do rpt=1,2
> >>>     call MPI_ALLTOALLV(input,sendcounts,displs,MPI_REAL, &
> >>>                        output,recvcounts,displs,MPI_REAL,MPI_COMM_WORLD,ierr)
> >>>   end do
> >>> 
> >>>   call MPI_BARRIER(MPI_COMM_WORLD,ierr)
> >>> 
> >>>   deallocate (recvcounts)
> >>>   deallocate (sendcounts)
> >>>   deallocate (displs)
> >>> 
> >>>   if (rank == 0) print *,"auditmpi finished"
> >>> 
> >>>   call MPI_FINALIZE(ierr)
> >>> 
> >>> end program auditmpi
> >>> 
> >>> John
> 
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
