[mpich-discuss] Race condition in MPI3 (Fortran, shared memory)
C.Friedrich
c.friedrich at fz-juelich.de
Wed Apr 13 10:28:58 CDT 2016
Hello all,
I have a problem with race conditions when several processes on the same
node calculate an array a(:) held in shared memory. I understand that in
the case of Fortran one has to make sure that the call to MPI_WIN_FENCE
is accompanied with a call to MPI_F_SYNC_REG(a) to make sure that the
memory is updated. However, as the routine MPI_F_SYNC_REG does not seem
to be contained in the library I have (MPICH 3.2), I have used a
different approach, namely that I move MPI_WIN_FENCE into an external
subroutine MPE_WIN_FENCE that contains the array a(:) as an additional
argument. In this way, the Fortran compiler should be forced to
guarantee memory consistency, to my understanding.
See the following test case. The program allocates the integer array
a(100) in shared memory. Then, each process does
a = a + 1
10000 times. When running the program with two processes, the final
result of the array should thus be 20000, 20000, ...
In order to protect "a=a+1" I have added calls to MPE_WIN_FENCE on both
sides. However, race conflicts seem to persist, because the result is
something like 18442, 18883, 19126, ...
I could not find anything appropriate in the archives.
Thanks in advance
Christoph
---
program test
use iso_c_binding
implicit none
interface mpe_win_fence
subroutine mpe_win_fence(a,assert,win,err)
implicit none
integer, intent(inout) :: a(*)
integer, intent(in) :: assert,win
integer, intent(out) :: err
end subroutine mpe_win_fence
end interface
include '../include/mpif.h'
integer :: i,j
integer :: err,disp
integer :: Nsize,Nrank,Ncomm,Nwin
integer(kind=mpi_address_kind) :: size
type(c_ptr) :: aptr
integer, pointer :: a(:)
call mpi_init(err)
call mpi_comm_split_type(mpi_comm_world,mpi_comm_type_shared,0,
& mpi_info_null,Ncomm,err)
call mpi_comm_size(Ncomm,Nsize,err)
call mpi_comm_rank(Ncomm,Nrank,err)
size = 0
if(Nrank.eq.0) size = 100 * 4
disp = 1
call mpi_win_allocate_shared(size,disp,mpi_info_null,Ncomm,aptr,
& Nwin,err)
if(Nrank.ne.0) then
call mpi_win_shared_query(Nwin,0,size,disp,aptr,err)
endif
call c_f_pointer(aptr,a,(/100/))
if(Nrank.eq.0) a = 0
call mpe_win_fence(a,0,Nwin,err)
do i = 1,10000
call mpe_win_fence(a,0,Nwin,err)
a = a + 1
call mpe_win_fence(a,0,Nwin,err)
enddo
if(Nrank.eq.0) then
write(*,*) a
endif
call mpi_win_free(Nwin,err)
call mpi_comm_free(Ncomm,err)
call mpi_finalize(err)
end
subroutine mpe_win_fence(a,assert,win,err)
implicit none
real(8), intent(inout) :: a(*)
integer, intent(in) :: assert,win
integer, intent(out) :: err
call mpi_win_fence(assert,win,err)
end
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
Prof. Dr. Sebastian M. Schmidt
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss
More information about the discuss
mailing list