[mpich-devel] MPICH hangs in MPI_Waitall when MPI_Cancel is used
Jeff Hammond
jeff.science at gmail.com
Thu Jun 4 09:21:53 CDT 2015
I can't tell for sure if this is a correct program, but multiple
members of the MPI Forum suggested it is.
If it is a correct program, it appears to expose a bug in MPICH,
because the MPI_Waitall hangs.
Thanks,
Jeff
$ mpicc -g -Wall -std=c99 cancel-sucks.c && mpiexec -n 4 ./a.out
$ mpichversion
MPICH Version: 3.2b1
MPICH Release date: unreleased development copy
MPICH Device: ch3:nemesis
MPICH configure: CC=gcc-4.9 CXX=g++-4.9 FC=gfortran-4.9
F77=gfortran-4.9 --enable-cxx --enable-fortran
--enable-threads=runtime --enable-g=dbg --with-pm=hydra
--prefix=/opt/mpich/dev/gcc/default --enable-wrapper-rpath
--enable-static --enable-shared
MPICH CC: gcc-4.9 -g -O2
MPICH CXX: g++-4.9 -g -O2
MPICH F77: gfortran-4.9 -g -O2
MPICH FC: gfortran-4.9 -g -O2
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

const int n=1000;

int main(void)
{
    MPI_Init(NULL,NULL);

    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (size<2) {
        printf("You must use 2 or more processes!\n");
        MPI_Finalize();
        exit(1);
    }

    /* reqs[0..n-1] hold the sends, reqs[n..2n-1] hold the receives. */
    MPI_Request reqs[2*n];

    /* Post n zero-byte synchronous sends to the next rank in the ring. */
    int target = (rank+1)%size;
    for (int i=0; i<n; i++) {
        MPI_Issend(NULL,0,MPI_BYTE,target,0,MPI_COMM_WORLD,&(reqs[i]));
    }

    /* Cancel one randomly chosen send before any receive is posted. */
    srand((unsigned)(rank+MPI_Wtime()));
    int slot = rand()%n;
    printf("Cancelling send %d.\n", slot);
    MPI_Cancel(&reqs[slot]);

    /* Barrier between the cancel and the receives; set to 0 to remove it. */
#if 1
    MPI_Barrier(MPI_COMM_WORLD);
#endif

    /* Post n matching receives from the previous rank in the ring. */
    int origin = (rank==0) ? (size-1) : (rank-1);
    for (int i=0; i<n; i++) {
        MPI_Irecv(NULL,0,MPI_BYTE,origin,0,MPI_COMM_WORLD,&(reqs[n+i]));
    }

    /* This is the call that hangs. */
    MPI_Status stats[2*n];
    MPI_Waitall(2*n,reqs,stats);

    /* Report which send requests completed as cancelled. */
    for (int i=0; i<n; i++) {
        int flag;
        MPI_Test_cancelled(&(stats[i]),&flag);
        if (flag) {
            printf("Status %d indicates cancel was successful.\n", i);
        }
    }

    MPI_Finalize();
    return 0;
}
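
For reference when reading the program above, here is a small bookkeeping sketch in plain C (no MPI calls) of which requests on one rank can complete under the two possible outcomes of MPI_Cancel. It assumes the MPI rule that a successfully cancelled send delivers nothing to the destination, so a matching receive has to be satisfied by some other send, and it assumes every rank's cancel has the same outcome. The names request_tally and tally_requests are hypothetical and exist only for this illustration; the sketch does not say which outcome MPICH actually produces.

```c
#include <assert.h>

/* Hypothetical model, not MPI code: per-rank request counts for a ring in
 * which each rank posts n synchronous sends to its successor, n receives
 * from its predecessor, and `cancelled` of its sends are successfully
 * cancelled (0 if the cancel fails, 1 if it succeeds). */
typedef struct {
    int sends_done;     /* send requests that complete, matched or cancelled */
    int recvs_done;     /* receives satisfied by a surviving send */
    int recvs_pending;  /* receives with no send left to match them */
} request_tally;

static request_tally tally_requests(int n, int cancelled)
{
    assert(n >= 0 && cancelled >= 0 && cancelled <= n);

    request_tally t;
    /* Assumption: a cancelled send request still completes locally, and
     * every surviving send finds one of the n receives at its target. */
    t.sends_done = n;
    /* Assumption: the predecessor cancelled the same number of sends, so
     * only n - cancelled messages ever arrive for the n posted receives. */
    t.recvs_done = n - cancelled;
    t.recvs_pending = cancelled;
    return t;
}

/* A wait over all 2*n requests can return only if nothing is left pending. */
static int waitall_can_return(request_tally t)
{
    return t.recvs_pending == 0;
}
```

With n=1000, tally_requests(1000, 0) leaves no receive pending, while tally_requests(1000, 1) leaves one receive per rank with nothing to match. In the second case a wait over all 2*n requests could not return unless that receive were also cancelled, which may be worth ruling out before treating the hang as a library bug.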
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/