[mpich-discuss] MPI_REDUCE with MPI_IN_PLACE does not always work
Michael.Rachner at dlr.de
Fri Sep 13 03:31:35 CDT 2013
Dear MPI community,
I have found a problem when calling MPI_REDUCE with the keyword MPI_IN_PLACE in my Fortran 95 code (compiled with the Intel Fortran 12 compiler).
Whether the problem occurs depends on the MPI implementation and the operating system.
This is my experience so far:
  MPICH2 v1.4.1p1 on Win7 (64-bit) PC:              it works
  Microsoft MPI (version of 12/11/2012) on Win7 (64-bit) PC:
       it fails (either the root's own contribution is silently taken as zero, i.e. a wrong result,
       or MPI_REDUCE aborts on the root with an access violation or floating-point exception)
  Open MPI v1.6.2-2 on Win7 (64-bit) PC:            it fails (in the same manner as with MS-MPI)
  Open MPI v1.4.3 and v1.6.3 on 2 Linux clusters:   it works
  Intel MPI v4.0.3 and v4.1.0 on 2 Linux clusters:  it works
My question is: is the failure possibly caused by erroneous ('dangerous') Fortran coding on my side, which makes some MPI implementations fail while others do not?
Or is the problem actually caused by a bug in the affected MPI implementations?
This is my Fortran 95 code:
!
subroutine mpiw_reduce_sumfast_real8( rbuffarr, nelem )
!
!===============================================================================
!
! sbr mpiw_reduce_sumfast_real8 is a wrapper for the MPI-routine MPI_REDUCE
! applied for summing element-wise a real(REAL8) 1d-array rbuffarr(nelem)
! from all processes of communicator commSPRAY
! and store the sums on master in the same array rbuffarr(nelem) ,
! i.e. on the master we overwrite the original contribution of the master by:
!
! for i=1..nelem: rbuffarr(i) = SUM_over_iproc (rbuffarr(i) ) , with iproc=1,numprocs
!
!
! mpiw_reduce_sumfast_real8 calls : MPI_REDUCE
!
! last update: 03.09.2013
!===============================================================================
!
   use MPIHEADER   , only: MPI_SUM, MPI_IN_PLACE
   use NUMBER_MODEL, only: INT4, REAL8
   use MPARAL      , only: lmaster, commSPRAY, ierr_mpi, mpiusertype_REAL8
   !
   implicit none
   !
   integer (INT4), intent(IN)                      :: nelem
   real (REAL8),   intent(INOUT), dimension(nelem) :: rbuffarr   ! input on master & slaves, result only on master
   !
   real (REAL8) :: rdummyarr(1)
   !
   if(lmaster) then
      call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, mpiusertype_REAL8, MPI_SUM &
                     , 0_INT4, commSPRAY, ierr_mpi )
   else   ! slaves
      call MPI_REDUCE( rbuffarr, rdummyarr, nelem, mpiusertype_REAL8, MPI_SUM &
                     , 0_INT4, commSPRAY, ierr_mpi )
   endif
   !
   return
   end subroutine mpiw_reduce_sumfast_real8
Note that the problem does not depend on the reduction operator chosen (here MPI_SUM).
The problem does not occur when I call MPI_REDUCE without the MPI_IN_PLACE option.
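For comparison, a non-in-place variant of the wrapper would look roughly like this (only a sketch: the subroutine name and the temporary array rtmparr are chosen here just for illustration):

 subroutine mpiw_reduce_sumslow_real8( rbuffarr, nelem )
   !
   ! non-in-place variant: every process, including the master, passes rbuffarr
   ! as the send buffer; the sums arrive in rtmparr and are copied back into
   ! rbuffarr on the master
   !
   use MPIHEADER   , only: MPI_SUM
   use NUMBER_MODEL, only: INT4, REAL8
   use MPARAL      , only: lmaster, commSPRAY, ierr_mpi, mpiusertype_REAL8
   !
   implicit none
   !
   integer (INT4), intent(IN)                      :: nelem
   real (REAL8),   intent(INOUT), dimension(nelem) :: rbuffarr
   real (REAL8),   dimension(nelem)                :: rtmparr
   !
   call MPI_REDUCE( rbuffarr, rtmparr, nelem, mpiusertype_REAL8, MPI_SUM &
                  , 0_INT4, commSPRAY, ierr_mpi )
   if(lmaster) rbuffarr(1:nelem) = rtmparr(1:nelem)
   !
   return
 end subroutine mpiw_reduce_sumslow_real8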
As a reference I cite here the MPI 2.2 standard (of Sept 4, 2009, p. 164):
The "in place" option for intracommunicators is specified by passing the value
MPI_IN_PLACE to the argument sendbuf at the root. In such a case, the input data is taken
at the root from the receive buffer, where it will be replaced by the output data.
Does my coding strictly conform to this?
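To make this easier to test, here is a minimal standalone program that uses MPI_IN_PLACE exactly as described above (a sketch; it uses the predefined MPI_DOUBLE_PRECISION datatype and MPI_COMM_WORLD instead of my mpiusertype_REAL8 and commSPRAY):

 program test_inplace_reduce
   use mpi
   implicit none
   integer          :: ierr, myrank, i
   double precision :: rbuffarr(3), rdummyarr(1)
   !
   call MPI_INIT( ierr )
   call MPI_COMM_RANK( MPI_COMM_WORLD, myrank, ierr )
   rbuffarr = (/ ( dble(myrank + i), i = 1, 3 ) /)   ! each rank contributes something
   !
   if(myrank == 0) then
      ! on the root the send buffer is MPI_IN_PLACE; the input is taken from rbuffarr
      call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, 3, MPI_DOUBLE_PRECISION, MPI_SUM &
                     , 0, MPI_COMM_WORLD, ierr )
      print *, 'sums on rank 0: ', rbuffarr
   else
      ! on the other ranks the receive buffer is not significant; a dummy suffices
      call MPI_REDUCE( rbuffarr, rdummyarr, 3, MPI_DOUBLE_PRECISION, MPI_SUM &
                     , 0, MPI_COMM_WORLD, ierr )
   endif
   !
   call MPI_FINALIZE( ierr )
 end program test_inplace_reduce

If this coding conforms to the standard, a run with e.g. mpiexec -n 4 should print the element-wise sums 10., 14., 18. on rank 0.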
Greetings
Michael Rachner