[mpich-discuss] Issues with shared memory window
Jonathan Blair
qbit at utexas.edu
Fri Jul 25 23:07:16 CDT 2014
Hi MPICH users,
I've been having issues with MPI_Win_allocate_shared(). I believe my use
case is compliant with the standard, but I am not ruling out ignorance
on my part as the fault.
In my project, one task allocates the memory to be shared, and the other
tasks attach to the shared memory. The allocation function returns
MPI_SUCCESS, as do all calls of MPI_Win_shared_query(). The size and
displacement unit match expected values. However, the tasks do not see
the same memory contents: rank 0 reads back the values it wrote, while
rank 1 reads unrelated values.
I'm running this on a shared memory system (the communicator is
intra-node, currently being tested on a desktop) with MPICH 3.1.2
installed, which passed all internal tests during installation.
I notice that MPI_Free_mem() reports errors and I believe MPI_Finalize()
causes a segfault, but I'm not sure if this is specifically related to
the issue at hand.
I have included a minimal test case below. Does anyone have any insight
into my problem?
Thanks for your input,
Jonathan
[begin file test.cpp]
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>

using namespace std;

int main(int argc, char *argv[]){
    int rank;
    int color = 1;
    int ierr;
    int *arr = (int *) malloc( 0 );
    int disp_unit;
    MPI_Aint size = 2048;
    MPI_Aint reportedSize = 0;
    MPI_Comm comm;

    ierr = MPI_Init( &argc, &argv );
    ierr = MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    ierr = MPI_Comm_split( MPI_COMM_WORLD, color, rank, &comm );

    MPI_Win *win = (MPI_Win *) malloc( 0 );

    if (rank == 0){
        ierr = MPI_Win_allocate_shared(
            size,
            (int) sizeof(int),
            MPI_INFO_NULL,
            comm,
            (void *) arr,
            win );
        printf( "Rank: %i ierr from MPI_Win_allocate_shared = %i\n", rank, ierr);
        ierr = MPI_Barrier( comm );
        for (int i=0; i < size; i++){
            arr[i] = i;
        }
        printf( "Rank: %i arr[0] = %i, arr[1] = %i\n", rank, arr[0], arr[1] );
        ierr = MPI_Barrier( comm );
    }
    else{
        ierr = MPI_Win_allocate_shared(
            (MPI_Aint) 0,
            (int) sizeof(int),
            MPI_INFO_NULL,
            comm,
            (void *) arr,
            win );
        printf( "Rank: %i ierr from MPI_Win_allocate_shared = %i\n", rank, ierr);
        ierr = MPI_Win_shared_query(
            *win,
            (int) 0,
            &reportedSize,
            &disp_unit,
            (void *) arr );
        printf( "Rank: %i ierr from MPI_Win_shared_query = %i\n", rank, ierr);
        printf( "Rank: %i reportedSize = %i\n", rank, (int) reportedSize);
        printf( "Rank: %i disp_unit = %i\n", rank, disp_unit);
        ierr = MPI_Barrier( comm );
        ierr = MPI_Barrier( comm );
        printf( "Rank: %i arr[0] = %i, arr[1] = %i\n", rank, arr[0], arr[1] );
    }

    MPI_Free_mem((void *) win);
    ierr = MPI_Finalize();
    return 0;
}
[end file test.cpp]
[begin shell output]
$ mpirun -n 2 ./test
Rank: 0 ierr from MPI_Win_allocate_shared = 0
Rank: 1 ierr from MPI_Win_allocate_shared = 0
Rank: 1 ierr from MPI_Win_shared_query = 0
Rank: 1 reportedSize = 2048
Rank: 1 disp_unit = 4
Rank: 0 arr[0] = 0, arr[1] = 1
Rank: 1 arr[0] = 1996775424, arr[1] = 32592
[0] Block at address 0x000000000093f190 is corrupted; cannot free;
may be block not allocated with MPL_trmalloc or MALLOC
called in /path/to/mpich-3.1.2/src/mpid/ch3/src/ch3u_rma_ops.c at line 493
[1] Block at address 0x0000000000ec8190 is corrupted; cannot free;
may be block not allocated with MPL_trmalloc or MALLOC
called in /path/to/mpich-3.1.2/src/mpid/ch3/src/ch3u_rma_ops.c at line 493
[1] 56 at [0x0000000000eccb78],
ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[188]
[1] 24 at [0x0000000000eccab8],
ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[217]
[1] 56 at [0x0000000000ecc9d8],
ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[188]
[1] 24 at [0x0000000000ecc918],
ich-3.1.2/src/util/wrappers/mpiu_shm_wrappers.h[217]
[1] 8 at [0x0000000000ecc788],
src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[131]
[1] 8 at [0x0000000000eca3a8],
src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[127]
[1] 8 at [0x0000000000ec74e8],
src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[123]
[1] 16 at [0x0000000000ecc6c8],
src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[120]
[1] 16 at [0x0000000000ecc608],
src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[117]
[1] 16 at [0x0000000000ecc548],
src/mpid/ch3/channels/nemesis/src/ch3_win_fns.c[113]
[1] 48 at [0x0000000000ecbe98],
h/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_rma.c[301]
[1] 32 at [0x0000000000ecc478],
ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
[1] 8 at [0x0000000000ecc2f8],
mpich/mpich-3.1.2/src/util/procmap/local_proc.c[93]
[1] 8 at [0x0000000000ecc248],
mpich/mpich-3.1.2/src/util/procmap/local_proc.c[92]
[1] 32 at [0x0000000000ecc3a8],
ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
[1] 8 at [0x0000000000ecc198],
mpich/mpich-3.1.2/src/util/procmap/local_proc.c[93]
[1] 8 at [0x0000000000ecc0e8],
mpich/mpich-3.1.2/src/util/procmap/local_proc.c[92]
[1] 32 at [0x0000000000ecc018],
ch/mpich/mpich-3.1.2/src/mpid/ch3/src/mpid_vc.c[122]
[1] 504 at [0x0000000000ecafc8],
earch/mpich/mpich-3.1.2/src/mpi/comm/commutil.c[281]
[1] 504 at [0x0000000000ecaa88],
earch/mpich/mpich-3.1.2/src/mpi/comm/commutil.c[281]
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 8822 RUNNING AT Machine
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
[end shell output]