[mpich-discuss] Non-blocking Collectives
Pavan Balaji
balaji at mcs.anl.gov
Wed Jun 26 15:00:20 CDT 2013
Hi Jiri,
The completion of MPI_IBARRIER indicates that all processes have called
MPI_IBARRIER. This part is correct.
However, the specification does not say that MPI_WAIT on one process has
to complete before others have called MPI_WAIT. That's related to
asynchronous progress and is a quality of implementation issue.
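To make the progress point concrete, here is a rough toy model in plain C.
It is only a sketch under stated assumptions, not MPICH's code: it assumes
the barrier uses a dissemination pattern (in the round with distance
"mask", rank r sends to r+mask and receives from r-mask, modulo the number
of ranks), that zero-byte messages arrive instantly, and that a rank can
start any round after the first only while it is inside MPI_Wait, because
no background progress thread exists. The names ibarrier_model_done and
MAX_RANKS are made up for illustration.

```c
#include <assert.h>

#define MAX_RANKS 64  /* arbitrary cap for this sketch */

static double max2(double a, double b) { return a > b ? a : b; }

/*
 * Toy model (an assumption, not MPICH source): returns the time at which
 * rank `me` would see its MPI_Ibarrier request complete.
 *   t_call[r] = time rank r calls MPI_Ibarrier
 *   t_wait[r] = time rank r enters MPI_Wait (assumed >= t_call[r])
 */
double ibarrier_model_done(int n, const double *t_call,
                           const double *t_wait, int me)
{
    double sent[MAX_RANKS];   /* when each rank issued this round's send */
    double ready[MAX_RANKS];  /* when each rank finished this round */
    int first = 1;

    assert(n >= 1 && n <= MAX_RANKS && me >= 0 && me < n);

    for (int r = 0; r < n; r++)
        ready[r] = t_call[r];

    for (int mask = 1; mask < n; mask <<= 1) {
        /* The first round is issued by the MPI_Ibarrier call itself;
         * later rounds need the rank to be inside MPI_Wait. */
        for (int r = 0; r < n; r++)
            sent[r] = first ? t_call[r] : max2(ready[r], t_wait[r]);

        /* A round finishes once the matching message has arrived. */
        for (int r = 0; r < n; r++) {
            int src = (r - mask + n) % n;
            ready[r] = max2(sent[r], sent[src]);
        }
        first = 0;
    }

    /* Completion is only observed from inside MPI_Wait. */
    return max2(ready[me], t_wait[me]);
}
```

With every rank calling MPI_Ibarrier at time 0 and only rank 0 entering
MPI_Wait at once (the others at time 1), this model returns 0 for rank 0
with two ranks and 1 for rank 0 with three ranks, because the three-rank
case needs a second round that rank 1 cannot start until it is inside
MPI_Wait. Under the same assumptions, calling MPI_Test periodically on the
delayed ranks instead of sleeping would give the library a chance to
advance those later rounds sooner.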
-- Pavan
On 06/26/2013 02:54 PM, Jiri Simsa wrote:
> Hi,
>
> I have a question about the semantics of non-blocking collective
> communication. For example, let's consider MPI_Ibarrier. The MPI 3.0
> standard specifies that:
>
> "MPI_IBARRIER is a nonblocking version of MPI_BARRIER. By calling
> MPI_IBARRIER, a process notifies that it has reached the barrier. The
> call returns immediately, independent of whether other processes have
> called MPI_IBARRIER. The usual barrier semantics are enforced at the
> corresponding completion operation (test or wait), which in the
> intracommunicator case will complete only after all other processes in the
> communicator have called MPI_IBARRIER. In the intercommunicator case, it
> will complete when all processes in the remote group have called
> MPI_IBARRIER."
>
> My understanding of the standard is that MPI_Wait(&request,
> &status), where request has been previously passed into MPI_Ibarrier,
> returns after all processes in the respective intra-communicator have
> called MPI_Ibarrier. However, the mpich-3.0.4 library seems, in some
> cases, to wait for all processes in the respective intra-communicator to
> call MPI_Wait. Here is an example that demonstrates this behavior:
>
> #include <mpi.h>
> #include <stdio.h>   /* printf */
> #include <unistd.h>  /* sleep */
>
> int main(int argc, char *argv[]) {
>     MPI_Request request;
>     MPI_Status status;
>     int myrank;
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>     if (myrank == 0) {
>         /* Rank 0 waits on the barrier immediately. */
>         MPI_Ibarrier(MPI_COMM_WORLD, &request);
>         MPI_Wait(&request, &status);
>         printf("%d, Completed barrier.\n", myrank);
>     } else {
>         /* Other ranks enter the barrier, then delay their MPI_Wait. */
>         MPI_Ibarrier(MPI_COMM_WORLD, &request);
>         sleep(1);
>         MPI_Wait(&request, &status);
>         printf("%d, Completed barrier.\n", myrank);
>     }
>     MPI_Finalize();
>     return 0;
> }
>
> When executed with "mpiexec -n 2 ./example", I see the expected output
> and timing. However, when executed with "mpiexec -n 3 ./example", the
> call to MPI_Wait in process 0 returns only after the other processes
> wake up from sleep() and call MPI_Wait.
>
> Isn't this a violation of the standard?
>
> Best,
>
> --Jiri Simsa
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji