[mpich-discuss] MPI_Cancel + MPI_Wait stalls when using ch4, not when using ch3
Edric Ellis
eellis at mathworks.com
Wed Jan 19 11:21:24 CST 2022
Hi,
Running one of our test programs with MPICH 3.4.3 and ch4:ofi, I notice that MPI_Wait on an MPI_Request that has been cancelled with MPI_Cancel never completes (it does complete when using ch3). The documentation for MPI_Cancel states: "If a communication is marked for cancellation, then a MPI_WAIT call for that communication is guaranteed to return, irrespective of the activities of other processes (i.e., MPI_WAIT behaves as a local function)."
Here's a simple example:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

void check(int const value) {
    if (value != MPI_SUCCESS) {
        fprintf(stderr, "Failed.\n");
        exit(1);
    }
}

int main(int argc, char** argv) {
    MPI_Request r1;
    int payload = 42;
    int result;

    check(MPI_Init(0,0));
    check(MPI_Issend(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &r1));
    check(MPI_Test(&r1, &result, MPI_STATUS_IGNORE));
    fprintf(stdout, "MPI_Test result: %d\n", result);
    check(MPI_Cancel(&r1));
    check(MPI_Wait(&r1, MPI_STATUS_IGNORE));
    MPI_Finalize();
    return 0;
}
This stalls in MPI_Wait when executed using "mpiexec -n 1 ./a.out".
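As a diagnostic aid (not a fix), the blocking MPI_Wait in the example could be replaced by a bounded polling loop, so the program reports the stall instead of hanging. The following is only a minimal sketch under stated assumptions: poll_fn, poll_with_deadline, test_request, countdown and the USE_MPI build flag are hypothetical names introduced here and are not part of MPICH or of our test program.

```c
#include <time.h>

/* Hypothetical callback type: sets *done to nonzero once the operation has
   completed; returns 0 on success, nonzero on error. */
typedef int (*poll_fn)(void *ctx, int *done);

/* Wall-clock time in seconds, from C11 timespec_get. */
static double now_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Calls fn repeatedly until it reports completion, returns an error, or
   timeout_s seconds have elapsed. fn is always called at least once.
   Returns fn's last return code; *done says whether the operation completed. */
int poll_with_deadline(poll_fn fn, void *ctx, double timeout_s, int *done) {
    double deadline = now_seconds() + timeout_s;
    int rc;
    *done = 0;
    do {
        rc = fn(ctx, done);
    } while (rc == 0 && !*done && now_seconds() < deadline);
    return rc;
}

/* Stand-in callback for trying the helper without MPI:
   completes once the counter that ctx points at reaches zero. */
int countdown(void *ctx, int *done) {
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
    }
    *done = (*remaining == 0);
    return 0;
}

#ifdef USE_MPI /* hypothetical flag: define it when building with mpicc */
#include <mpi.h>

/* Adapter: ctx points at the MPI_Request to be tested. */
int test_request(void *ctx, int *done) {
    int rc = MPI_Test((MPI_Request *)ctx, done, MPI_STATUS_IGNORE);
    return rc == MPI_SUCCESS ? 0 : 1;
}
#endif
```

When built with -DUSE_MPI, the MPI_Wait line in the example could become a call such as poll_with_deadline(test_request, &r1, 5.0, &done), followed by printing whether done was set. The five-second limit is an arbitrary choice for illustration.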
Cheers,
Edric.