[mpich-discuss] Issues with shared memory window

Jonathan Blair qbit at utexas.edu
Fri Jul 25 23:07:16 CDT 2014


Hi MPICH users,

I've been having issues with MPI_Win_allocate_shared(). I believe my use 
case is compliant with the standard, but I am not ruling out ignorance 
on my part as the fault.

In my project, one task allocates the memory to be shared, and the other 
tasks attach to the shared memory. The allocation function returns 
MPI_SUCCESS, as do all calls of MPI_Win_shared_query(). The size and 
displacement unit match expected values. However, the tasks do not see 
the same memory contents: the values rank 0 writes are not the values 
rank 1 reads back.
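
For reference, the size given to MPI_Win_allocate_shared() and reported 
by MPI_Win_shared_query() is a byte count, while disp_unit is the size of 
one element in bytes. A small sketch of that arithmetic follows; the 
helper name is hypothetical, long stands in for MPI_Aint, and no MPI 
calls are made.

```cpp
#include <cassert>

// Hypothetical helper (not part of MPI): number of elements that fit in a
// window of size_bytes when each element occupies disp_unit bytes.
// long is used here as a stand-in for MPI_Aint.
long elements_in_window(long size_bytes, int disp_unit) {
    assert(disp_unit > 0);          // a displacement unit must be positive
    return size_bytes / disp_unit;  // whole elements only
}
```

With the values from the test case (2048 bytes, disp_unit of 4), this 
gives 512 ints, so a loop that writes ints should stop at 512, not at 
2048.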

I'm running this on a shared memory system (the communicator is 
intra-node, currently being tested on a desktop), with MPICH 3.1.2 
installed, passing all internal tests during installation.

I notice that MPI_Free_mem() reports errors and I believe MPI_Finalize() 
causes a segfault, but I'm not sure if this is specifically related to 
the issue at hand.

I have included a minimal test case below. Does anyone have any insight 
into my problem?

Thanks for your input,
Jonathan


[begin file test.cpp]
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>

using namespace std;

int main(int argc, char *argv[]){

   int rank;
   int color = 1;
   int ierr;
   int *arr = (int *) malloc( 0 );
   int disp_unit;

   MPI_Aint size = 2048;
   MPI_Aint reportedSize = 0;
   MPI_Comm comm;

   ierr = MPI_Init( &argc, &argv );
   ierr = MPI_Comm_rank( MPI_COMM_WORLD, &rank );
   ierr = MPI_Comm_split( MPI_COMM_WORLD, color, rank, &comm );

   MPI_Win *win = (MPI_Win *) malloc( 0 );

   if (rank == 0){
     ierr = MPI_Win_allocate_shared( \
       size, \
       (int) sizeof(int), \
       MPI_INFO_NULL, \
       comm, \
       (void *) arr, \
       win );

     printf( "Rank: %i ierr from MPI_Win_allocate_shared = %i\n", rank, ierr);

     ierr = MPI_Barrier( comm );

     for (int i=0; i < size; i++){
       arr[i] = i;
     }

     printf( "Rank: %i arr[0] = %i, arr[1] = %i\n", rank, arr[0], arr[1] );

     ierr = MPI_Barrier( comm );
   }
   else{
     ierr = MPI_Win_allocate_shared( \
       (MPI_Aint) 0, \
       (int) sizeof(int), \
       MPI_INFO_NULL, \
       comm, \
       (void *) arr, \
       win );

     printf( "Rank: %i ierr from MPI_Win_allocate_shared = %i\n", rank, ierr);

     ierr = MPI_Win_shared_query( \
       *win, \
       (int) 0, \
       &reportedSize, \
       &disp_unit, \
       (void *) arr );

     printf( "Rank: %i ierr from MPI_Win_shared_query = %i\n", rank, ierr);
     printf( "Rank: %i reportedSize = %i\n", rank, (int) reportedSize);
     printf( "Rank: %i disp_unit = %i\n", rank, disp_unit);

     ierr = MPI_Barrier( comm );

     ierr = MPI_Barrier( comm );
     printf( "Rank: %i arr[0] = %i, arr[1] = %i\n", rank, arr[0], arr[1] );
   }

   MPI_Free_mem((void *) win);
   ierr = MPI_Finalize();
   return 0;
}
[end file test.cpp]
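
A note on the baseptr argument, since the test case passes (void *) arr: 
in the C binding the parameter is declared void *, but it is an output, 
so what the call expects is the address of the caller's pointer variable 
(the same convention as MPI_Alloc_mem()). The MPI-free sketch below only 
illustrates that convention. sketch_allocate and allocate_ints are 
hypothetical names, and malloc stands in for the shared segment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for an MPI-style allocator. As with the baseptr
// argument of MPI_Win_allocate_shared(), the parameter is typed void* but
// is treated as the address of the caller's pointer variable.
int sketch_allocate(std::size_t size_bytes, void *baseptr) {
    assert(baseptr != nullptr);
    void *mem = std::malloc(size_bytes);
    if (mem == nullptr) return 1;
    // Copy the new address into the caller's pointer variable.
    std::memcpy(baseptr, &mem, sizeof mem);
    return 0;
}

// Call pattern: pass &arr so the callee can overwrite arr itself.
// Passing arr by value would hand over whatever arr currently points at,
// and arr would still hold its old value after the call.
int *allocate_ints(std::size_t count) {
    int *arr = nullptr;
    if (sketch_allocate(count * sizeof(int), &arr) != 0) return nullptr;
    return arr;
}
```

Under that reading, the calls in test.cpp would take &arr rather than 
arr, and the window handle would be a plain MPI_Win variable passed as 
&win and released with MPI_Win_free(). This is a reading of the 
interface, not something confirmed by the output below.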



[begin shell output]
$ mpirun -n 2 ./test
Rank: 0 ierr from MPI_Win_allocate_shared = 0
Rank: 1 ierr from MPI_Win_allocate_shared = 0
Rank: 1 ierr from MPI_Win_shared_query = 0
Rank: 1 reportedSize = 2048
Rank: 1 disp_unit = 4
Rank: 0 arr[0] = 0, arr[1] = 1
Rank: 1 arr[0] = 1996775424, arr[1] = 32592
[0] Block at address 0x000000000093f190 is corrupted; cannot free;
may be block not allocated with MPL_trmalloc or MALLOC
called in /path/to/mpich-3.1.2/src/mpid/ch3/src/ch3u_rma_ops.c at line 493
[1] Block at address 0x0000000000ec8190 is corrupted; cannot free;
may be block not allocated with MPL_trmalloc or MALLOC
called in /path/to/mpich-3.1.2/src/mpid/ch3/src/ch3u_rma_ops.c at line 493
[1] 56 at [0x0000000000eccb78], ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[188]
[1] 24 at [0x0000000000eccab8], ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[217]
[1] 56 at [0x0000000000ecc9d8], ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[188]
[1] 24 at [0x0000000000ecc918], ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[217]
[1] 8 at [0x0000000000ecc788], src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[131]
[1] 8 at [0x0000000000eca3a8], src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[127]
[1] 8 at [0x0000000000ec74e8], src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[123]
[1] 16 at [0x0000000000ecc6c8], src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[120]
[1] 16 at [0x0000000000ecc608], src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[117]
[1] 16 at [0x0000000000ecc548], src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[113]
[1] 48 at [0x0000000000ecbe98], h/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_rma.c[301]
[1] 32 at [0x0000000000ecc478], ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
[1] 8 at [0x0000000000ecc2f8], mpich/mpich-3.1.2/src/util/procmap/local_proc.c[93]
[1] 8 at [0x0000000000ecc248], mpich/mpich-3.1.2/src/util/procmap/local_proc.c[92]
[1] 32 at [0x0000000000ecc3a8], ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
[1] 8 at [0x0000000000ecc198], mpich/mpich-3.1.2/src/util/procmap/local_proc.c[93]
[1] 8 at [0x0000000000ecc0e8], mpich/mpich-3.1.2/src/util/procmap/local_proc.c[92]
[1] 32 at [0x0000000000ecc018], ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
[1] 504 at [0x0000000000ecafc8], earch/mpich/mpich-3.1.2/src/mpi/comm/commutil.c[281]
[1] 504 at [0x0000000000ecaa88], earch/mpich/mpich-3.1.2/src/mpi/comm/commutil.c[281]

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 8822 RUNNING AT Machine
=   EXIT CODE: 139
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
[end shell output]


