[mpich-devel] MPICH hangs in MPI_Waitall when MPI_Cancel is used
Jeff Hammond
jeff.science at gmail.com
Thu Jun 4 12:19:32 CDT 2015
Thanks for pointing that out. It runs correctly now. Sorry for the
stupid question.
On Thu, Jun 4, 2015 at 11:49 AM, Halim Amer <aamer at anl.gov> wrote:
> Hi Jeff,
>
> I don't think this is a correct program. If the send is successfully
> cancelled, then the origin has to satisfy the destination's matching
> receive with another send. The hang is therefore expected.
>
> This is what the MPI standard says (p. 102):
>
> "...or that the send is successfully cancelled, in which case no part of the
> message was received at the destination. Then, any matching receive has to
> be satisfied by another send."
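>
> A minimal sketch of that fix, written in terms of the variables in
> Jeff's program (reqs, slot, target); placing it after the receives are
> posted is my assumption, not something the standard prescribes:
>
>     /* After posting the receives: complete the cancel-marked request
>        first (a cancelled request must still be completed). */
>     MPI_Status st;
>     int cancelled;
>     MPI_Wait(&reqs[slot], &st);
>     MPI_Test_cancelled(&st, &cancelled);
>     if (cancelled) {
>         /* No part of the message was received, so satisfy the
>            destination's matching receive with another send. */
>         MPI_Send(NULL, 0, MPI_BYTE, target, 0, MPI_COMM_WORLD);
>     }
>     /* reqs[slot] is now MPI_REQUEST_NULL, which MPI_Waitall accepts,
>        so the MPI_Waitall over all 2*n requests can now complete. */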
>
> --Halim
>
> Abdelhalim Amer (Halim)
> Postdoctoral Appointee
> MCS Division
> Argonne National Laboratory
>
>
> On 6/4/15 9:21 AM, Jeff Hammond wrote:
>>
>> I can't tell for sure if this is a correct program, but multiple
>> members of the MPI Forum suggested it is.
>>
>> If it is a correct program, it appears to expose a bug in MPICH,
>> because the MPI_Waitall hangs.
>>
>> Thanks,
>>
>> Jeff
>>
>> $ mpicc -g -Wall -std=c99 cancel-sucks.c && mpiexec -n 4 ./a.out
>>
>> $ mpichversion
>> MPICH Version: 3.2b1
>> MPICH Release date: unreleased development copy
>> MPICH Device: ch3:nemesis
>> MPICH configure: CC=gcc-4.9 CXX=g++-4.9 FC=gfortran-4.9
>> F77=gfortran-4.9 --enable-cxx --enable-fortran
>> --enable-threads=runtime --enable-g=dbg --with-pm=hydra
>> --prefix=/opt/mpich/dev/gcc/default --enable-wrapper-rpath
>> --enable-static --enable-shared
>> MPICH CC: gcc-4.9 -g -O2
>> MPICH CXX: g++-4.9 -g -O2
>> MPICH F77: gfortran-4.9 -g -O2
>> MPICH FC: gfortran-4.9 -g -O2
>>
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <mpi.h>
>>
>> const int n=1000;
>>
>> int main(void)
>> {
>>     MPI_Init(NULL,NULL);
>>
>>     int size, rank;
>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>     if (size<2) {
>>         printf("You must use 2 or more processes!\n");
>>         MPI_Finalize();
>>         exit(1);
>>     }
>>
>>     MPI_Request reqs[2*n];
>>
>>     /* post n zero-byte synchronous sends to the next rank */
>>     int target = (rank+1)%size;
>>     for (int i=0; i<n; i++) {
>>         MPI_Issend(NULL,0,MPI_BYTE,target,0,MPI_COMM_WORLD,&(reqs[i]));
>>     }
>>
>>     /* mark one randomly chosen send for cancellation */
>>     srand((unsigned)(rank+MPI_Wtime()));
>>     int slot = rand()%n;
>>     printf("Cancelling send %d.\n", slot);
>>     MPI_Cancel(&reqs[slot]);
>>
>> #if 1
>>     MPI_Barrier(MPI_COMM_WORLD);
>> #endif
>>
>>     /* post n matching receives for the sends from the previous rank */
>>     int origin = (rank==0) ? (size-1) : (rank-1);
>>     for (int i=0; i<n; i++) {
>>         MPI_Irecv(NULL,0,MPI_BYTE,origin,0,MPI_COMM_WORLD,&(reqs[n+i]));
>>     }
>>
>>     MPI_Status stats[2*n];
>>     MPI_Waitall(2*n,reqs,stats);
>>
>>     /* report which sends were actually cancelled */
>>     for (int i=0; i<n; i++) {
>>         int flag;
>>         MPI_Test_cancelled(&(stats[i]),&flag);
>>         if (flag) {
>>             printf("Status %d indicates cancel was successful.\n", i);
>>         }
>>     }
>>
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>>
> _______________________________________________
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/devel
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/