[mpich-discuss] double free error after MPI run finished
John Bray
jbray at allinea.com
Tue Nov 19 03:36:37 CST 2013
I can confirm that 3.1rc2 fixes my problem.
Thanks for your help.
John
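
(For reference, a sketch of one way to repeat the check against 3.1rc2;
the tarball name, build steps and rank count here are illustrative, the
configure and compile flags are the same ones quoted in my original
report below:)

# illustrative sequence -- not the exact commands I ran
tar xzf mpich-3.1rc2.tar.gz && cd mpich-3.1rc2
./configure --prefix=$PWD/install --enable-shared --enable-fast=all --enable-debuginfo
make && make install
export PATH=$PWD/install/bin:$PATH
mpif90 -o auditmpibad_f_debug.exe auditmpibad.F90 -O0 -g
mpirun -np 2 ./auditmpibad_f_debug.exe   # no glibc double-free report with 3.1rc2
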
On 9 November 2013 08:19, John Bray <jbray at allinea.com> wrote:
> Happy to validate any fix when it's ready, not only for this test case,
> but with our full DDT test suite.
>
> John
>
> On 8 November 2013 19:44, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
>>
>> Hi John,
>>
>> You seem to describe a known bug:
>>
>> https://trac.mpich.org/projects/mpich/ticket/1932
>>
>> We've addressed it and the fix is waiting for review before going to the master
>> branch. It would be nice if you could validate it once it's in master. May I
>> contact you then to see if that addresses your problem?
>>
>> Antonio
>>
>>
>> On Friday, November 08, 2013 05:43:29 PM John Bray wrote:
>>> I've got some sample code that I'm using to test our debugger DDT. It
>>> works on Open MPI and Intel MPI, but with every MPICH release from
>>> MPICH2 onwards to 3.1rc1 I get an error after MPI_FINALIZE. Our
>>> memory debugging library claims it's at
>>>
>>> SendqFreePool (dbginit.c:386) trying to free previously freed pointer
>>>
>>> with a stack trace of
>>> auditmpi
>>> pmpi_finalize__
>>> PMPI__finalize
>>> MPIR_Call_finalize_callbacks
>>> sendqFreePool (dbginit.c)
>>>
>>> Run from a raw mpirun command outside DDT
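>>> (something like the sketch below; -np 2 is illustrative, any even rank
>>> count works since the test pairs even and odd ranks)
>>>
>>> mpirun -np 2 ./auditmpibad_f_debug.exe   # illustrative rank count
>>>
>>> I get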
>>>
>>> *** glibc detected *** ./auditmpibad_f_debug.exe: double free or
>>> corruption (fasttop): 0x1032a5c8 ***
>>> *** glibc detected *** ./auditmpibad_f_debug.exe: double free or
>>> corruption (fasttop): 0x104da5c8 ***
>>> ======= Backtrace: =========
>>> ======= Backtrace: =========
>>> /lib/libc.so.6(+0xfe750e4)[0xf78e50e4]
>>> /home/jbray/prog/mpich/mpich-3.1rc1/rhel-6-ppc64_ibm/install/lib/libmpich.so.11(+0x1a8b1c)[0xfec8b1c]
>>> /lib/libc.so.6(+0xfe750e4)[0xf76c50e4]
>>> /home/jbray/prog/mpich/mpich-3.1rc1/rhel-6-ppc64_ibm/install/lib/libmpich.so.11(+0x1a8b1c)[0xfec8b1c]
>>> ... and so on
>>>
>>> I've configured with
>>>
>>> ./configure --prefix=$PWD/install --enable-shared --enable-fast=all --enable-debuginfo
>>>
>>> and compile with
>>>
>>> mpif90 -o auditmpibad_f_debug.exe auditmpibad.F90 -O0 -g
>>>
>>> It works if the ALLTOALLV is only called once.
>>>
>>> Is this a bug or a misunderstanding on my part? The code is:
>>>
>>> program auditmpi
>>>
>>> use mpi, only : &
>>> MPI_COMM_WORLD, &
>>> MPI_REAL, &
>>> MPI_STATUS_SIZE, &
>>> MPI_BSEND_OVERHEAD, &
>>> MPI_SUM, &
>>> MPI_UNDEFINED
>>>
>>> implicit none
>>>
>>> integer, parameter :: repeats=2
>>>
>>> integer :: rank, nproc,ierr,rpt
>>> real :: input(100000),output(100000)
>>> integer :: status (MPI_STATUS_SIZE)
>>> integer :: statuses (MPI_STATUS_SIZE,repeats)
>>> integer :: request
>>> logical :: odd
>>> integer, allocatable :: recvcounts(:),sendcounts(:),displs(:)
>>>
>>> call MPI_INIT(ierr)
>>>
>>> call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
>>> call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
>>>
>>> if (rank == 0) print *,"auditmpi start"
>>>
>>> if (mod(rank,2) == 0) then
>>> odd=.false.
>>> else
>>> odd=.true.
>>> end if
>>>
>>> allocate (recvcounts(nproc))
>>> recvcounts(:) = 100000/nproc
>>>
>>> allocate (sendcounts(nproc))
>>> sendcounts(:) = 100000/nproc
>>>
>>> allocate (displs(nproc))
>>> displs(:)=0 ! don't care if we overlap
>>>
>>> if ( .NOT. odd) then
>>> call MPI_IRECV(output,size(output),MPI_REAL,rank+1,1,MPI_COMM_WORLD, &
>>> request, ierr)
>>> end if
>>> call MPI_BARRIER(MPI_COMM_WORLD,ierr)
>>> if (odd) then
>>> call MPI_IRSEND(input,size(input),MPI_REAL,rank-1,1,MPI_COMM_WORLD, &
>>> request,ierr)
>>> end if
>>> call MPI_WAIT(request,status,ierr)
>>>
>>> do rpt=1,2
>>> call MPI_ALLTOALLV(input,sendcounts,displs,MPI_REAL, &
>>> output,recvcounts,displs,MPI_REAL,MPI_COMM_WORLD,ierr)
>>> end do
>>>
>>> call MPI_BARRIER(MPI_COMM_WORLD,ierr)
>>>
>>> deallocate (recvcounts)
>>> deallocate (sendcounts)
>>> deallocate (displs)
>>>
>>> if (rank == 0) print *,"auditmpi finished"
>>>
>>> call MPI_FINALIZE(ierr)
>>>
>>> end program auditmpi
>>>
>>> John