[mpich-discuss] Issues with shared memory window

rreddypsc at gmail.com
Sat Jul 26 09:38:39 CDT 2014


Do you think the printf and Barrier in the non-zero ranks should be swapped?

Looks like the window has been freed before the ranks get to it?
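
For comparison, here is a minimal sketch of how I would expect those calls to
look. It's untested and just my reading of the MPI-3 interface, so treat it as
a sketch rather than a drop-in fix: the baseptr argument of
MPI_Win_allocate_shared and MPI_Win_shared_query takes the address of your
pointer (&arr), the window is a plain MPI_Win handle released with
MPI_Win_free rather than MPI_Free_mem, the fill loop stays within size bytes
rather than size ints, and I've used MPI_Win_fence for the synchronization in
place of the bare barriers.

[begin sketch shared_win.c]
/* Minimal sketch (untested): rank 0 owns the shared segment, the other
 * ranks attach to it with MPI_Win_shared_query. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, disp_unit;
    int *arr = NULL;                  /* set by MPI; no malloc needed */
    MPI_Aint size, reportedSize;
    MPI_Win win;                      /* plain handle, not a malloc'ed pointer */
    MPI_Comm comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_split(MPI_COMM_WORLD, 1, rank, &comm);

    size = (rank == 0) ? 2048 : 0;    /* size is in bytes */
    MPI_Win_allocate_shared(size, (int) sizeof(int), MPI_INFO_NULL, comm,
                            &arr, &win);          /* note: &arr and &win */

    if (rank != 0)                    /* point arr at rank 0's segment */
        MPI_Win_shared_query(win, 0, &reportedSize, &disp_unit, &arr);

    MPI_Win_fence(0, win);            /* everyone is attached before use */
    if (rank == 0) {
        for (MPI_Aint i = 0; i < (MPI_Aint)(2048 / sizeof(int)); i++)
            arr[i] = (int) i;         /* stay inside the 2048-byte segment */
    }
    MPI_Win_fence(0, win);            /* rank 0's stores are visible now */

    printf("Rank %d: arr[0] = %d, arr[1] = %d\n", rank, arr[0], arr[1]);

    MPI_Win_fence(0, win);            /* all reads done before the free */
    MPI_Win_free(&win);               /* not MPI_Free_mem */
    MPI_Finalize();
    return 0;
}
[end sketch shared_win.c]

With that ordering the non-zero ranks only read after rank 0's stores are
visible, and the window isn't freed until everyone is done with it.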

Thanks,
--rr

--On Friday, July 25, 2014 11:07 PM -0500 Jonathan Blair <qbit at utexas.edu> 
wrote:

> Hi MPICH users,
>
> I've been having issues with MPI_Win_allocate_shared(). I believe my use
> case is compliant with the standard, but I am not ruling out ignorance on
> my part as the fault.
>
> In my project, one task allocates the memory to be shared, and the other
> tasks attach to the shared memory. The allocation function returns
> MPI_SUCCESS, as do all calls of MPI_Win_shared_query(). The size and
> displacement unit match the expected values. However, the ranks do not
> see a consistent picture of the shared memory.
>
> I'm running this on a shared memory system (the communicator is
> intra-node, currently being tested on a desktop), with MPICH 3.1.2
> installed, which passed all of its internal tests during installation.
>
> I notice that MPI_Free_mem() reports errors and I believe MPI_Finalize()
> causes a segfault, but I'm not sure whether these are specifically related to
> the issue at hand.
>
> I have included a minimal test case below. Does anyone have any insight
> into my problem?
>
> Thanks for your input,
> Jonathan
>
>
> [begin file test.cpp]
> #include <stdlib.h>
> #include <stdio.h>
> #include <mpi.h>
>
> using namespace std;
>
> int main(int argc, char *argv[]){
>
>    int rank;
>    int color = 1;
>    int ierr;
>    int *arr = (int *) malloc( 0 );
>    int disp_unit;
>
>    MPI_Aint size = 2048;
>    MPI_Aint reportedSize = 0;
>    MPI_Comm comm;
>
>    ierr = MPI_Init( &argc, &argv );
>    ierr = MPI_Comm_rank( MPI_COMM_WORLD, &rank );
>    ierr = MPI_Comm_split( MPI_COMM_WORLD, color, rank, &comm );
>
>    MPI_Win *win = (MPI_Win *) malloc( 0 );
>
>    if (rank == 0){
>      ierr = MPI_Win_allocate_shared( \
>        size, \
>        (int) sizeof(int), \
>        MPI_INFO_NULL, \
>        comm, \
>        (void *) arr, \
>        win );
>
>      printf( "Rank: %i ierr from MPI_Win_allocate_shared = %i\n", rank,
> ierr);
>
>      ierr = MPI_Barrier( comm );
>
>      for (int i=0; i < size; i++){
>        arr[i] = i;
>      }
>
>      printf( "Rank: %i arr[0] = %i, arr[1] = %i\n", rank, arr[0], arr[1]
> );
>
>      ierr = MPI_Barrier( comm );
>    }
>    else{
>      ierr = MPI_Win_allocate_shared( \
>        (MPI_Aint) 0, \
>        (int) sizeof(int), \
>        MPI_INFO_NULL, \
>        comm, \
>        (void *) arr, \
>        win );
>
>      printf( "Rank: %i ierr from MPI_Win_allocate_shared = %i\n", rank,
> ierr);
>
>      ierr = MPI_Win_shared_query( \
>        *win, \
>        (int) 0, \
>        &reportedSize, \
>        &disp_unit, \
>        (void *) arr );
>
>      printf( "Rank: %i ierr from MPI_Win_shared_query = %i\n", rank,
> ierr);
>      printf( "Rank: %i reportedSize = %i\n", rank, (int) reportedSize);
>      printf( "Rank: %i disp_unit = %i\n", rank, disp_unit);
>
>      ierr = MPI_Barrier( comm );
>
>      ierr = MPI_Barrier( comm );
>      printf( "Rank: %i arr[0] = %i, arr[1] = %i\n", rank, arr[0], arr[1]
> );
>    }
>
>    MPI_Free_mem((void *) win);
>    ierr = MPI_Finalize();
>    return 0;
> }
> [end file test.cpp]
>
>
>
> [begin shell output]
> $ mpirun -n 2 ./test
> Rank: 0 ierr from MPI_Win_allocate_shared = 0
> Rank: 1 ierr from MPI_Win_allocate_shared = 0
> Rank: 1 ierr from MPI_Win_shared_query = 0
> Rank: 1 reportedSize = 2048
> Rank: 1 disp_unit = 4
> Rank: 0 arr[0] = 0, arr[1] = 1
> Rank: 1 arr[0] = 1996775424, arr[1] = 32592
> [0] Block at address 0x000000000093f190 is corrupted; cannot free;
> may be block not allocated with MPL_trmalloc or MALLOC
> called in /path/to/mpich-3.1.2/src/mpid/ch3/src/ch3u_rma_ops.c at line 493
> [1] Block at address 0x0000000000ec8190 is corrupted; cannot free;
> may be block not allocated with MPL_trmalloc or MALLOC
> called in /path/to/mpich-3.1.2/src/mpid/ch3/src/ch3u_rma_ops.c at line 493
> [1] 56 at [0x0000000000eccb78],
> ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[188]
> [1] 24 at [0x0000000000eccab8],
> ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[217]
> [1] 56 at [0x0000000000ecc9d8],
> ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[188]
> [1] 24 at [0x0000000000ecc918],
> ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[217]
> [1] 8 at [0x0000000000ecc788],
> src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[131]
> [1] 8 at [0x0000000000eca3a8],
> src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[127]
> [1] 8 at [0x0000000000ec74e8],
> src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[123]
> [1] 16 at [0x0000000000ecc6c8],
> src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[120]
> [1] 16 at [0x0000000000ecc608],
> src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[117]
> [1] 16 at [0x0000000000ecc548],
> src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[113]
> [1] 48 at [0x0000000000ecbe98],
> h/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_rma.c[301]
> [1] 32 at [0x0000000000ecc478],
> ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
> [1] 8 at [0x0000000000ecc2f8],
> mpich/mpich-3.1.2/src/util/procmap/local_proc.c[93]
> [1] 8 at [0x0000000000ecc248],
> mpich/mpich-3.1.2/src/util/procmap/local_proc.c[92]
> [1] 32 at [0x0000000000ecc3a8],
> ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
> [1] 8 at [0x0000000000ecc198],
> mpich/mpich-3.1.2/src/util/procmap/local_proc.c[93]
> [1] 8 at [0x0000000000ecc0e8],
> mpich/mpich-3.1.2/src/util/procmap/local_proc.c[92]
> [1] 32 at [0x0000000000ecc018],
> ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
> [1] 504 at [0x0000000000ecafc8],
> earch/mpich/mpich-3.1.2/src/mpi/comm/commutil.c[281]
> [1] 504 at [0x0000000000ecaa88],
> earch/mpich/mpich-3.1.2/src/mpi/comm/commutil.c[281]
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 8822 RUNNING AT Machine
> =   EXIT CODE: 139
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
> [end shell output]
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss




