[mpich-discuss] Non-blocking Collectives

Jiri Simsa jsimsa at cs.cmu.edu
Wed Jun 26 14:54:14 CDT 2013


Hi,

I have a question about the semantics of non-blocking collective
communication. For example, let's consider MPI_Ibarrier. The MPI 3.0
standard specifies that:

"MPI_IBARRIER is a nonblocking version of MPI_BARRIER. By calling
MPI_IBARRIER, a process notifies that it has reached the barrier. The call
returns immediately, independent of whether other processes have called
MPI_IBARRIER. The usual barrier semantics are enforced at the corresponding
completion operation (test or wait), which in the intracommunicator case
will complete only after all other processes in the communicator have
called MPI_IBARRIER. In the intercommunicator case, it will complete when
all processes in the remote group have called MPI_IBARRIER."

My understanding of the standard is that MPI_Wait(&request, &status),
where request was previously passed to MPI_Ibarrier, returns once
all processes in the respective intra-communicator have called MPI_Ibarrier.
However, the mpich-3.0.4 library seems in some cases to wait for all
processes in the respective intra-communicator to call MPI_Wait. Here is an
example that demonstrates this behavior:

#include <mpi.h>
#include <stdio.h>   /* printf */
#include <unistd.h>  /* sleep */

int main(int argc, char *argv[]) {
  MPI_Request request;
  MPI_Status status;
  int myrank;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
  if (myrank == 0) {
    /* Rank 0 enters the barrier and waits for completion right away. */
    MPI_Ibarrier(MPI_COMM_WORLD, &request);
    MPI_Wait(&request, &status);
    printf("%d, Completed barrier.\n", myrank);
  } else {
    /* All other ranks enter the barrier, then delay MPI_Wait by 1 second. */
    MPI_Ibarrier(MPI_COMM_WORLD, &request);
    sleep(1);
    MPI_Wait(&request, &status);
    printf("%d, Completed barrier.\n", myrank);
  }
  MPI_Finalize();
  return 0;
}

When executed with "mpiexec -n 2 ./example", I see the expected output and
timing. However, when executed with "mpiexec -n 3 ./example", the call to
MPI_Wait in process 0 returns only after the other processes wake up from
sleep() and call MPI_Wait.

Isn't this a violation of the standard?

Best,

--Jiri Simsa