[mpich-discuss] Questions about MPICH multi-thread support

Guilherme Valarini guilherme.a.valarini at gmail.com
Wed Jan 27 07:40:50 CST 2021


Thanks for the answer. I would like to ask a few follow-up questions,
though.

*Question 1.* In our application, two modules run concurrently and rely on
MPI (using two different communicators for isolation purposes). During
development we observed the following behavior: when one of the modules
sends a large buffer (say 1 GB or more), the other one ran into a
contention problem where its messages were held back for the duration of
the large data transfer, even though the second module's messages were
very small. Is this a known behavior? Does MPICH implement any kind of
message fragmentation that would allow concurrent message progression? If
not, are there any plans to implement such a feature?

*Question 2.* We have already used one of the scenarios described in the
previous email, where one thread of a process A sends a message to one of
multiple threads on a process B, all of which wait on the same message
triple through non-blocking receives. The expected behavior would be that
exactly one of the threads on process B receives the message, but during
our tests we hit a failure where a segfault was raised from inside the
*MPI_Recv* function on one of the receiving threads. Is the expected
behavior correct? Is there any known issue with this use case that would
trigger the described problem? This may well be the application's fault; I
just want to check whether there is any known issue on the matter.
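
For reference, the failing pattern reduces to roughly the sketch below
(simplified; it assumes MPI was initialized with *MPI_THREAD_MULTIPLE* and
that buf1/buf2 are distinct per-thread buffers; note that in this reduced
form the unmatched thread simply blocks):

  if (rank == 0) {
    char data = 'x';
    MPI_Send(&data, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
  } else if (rank == 1) {
    // Both threads post a receive on the same (source, tag, communicator)
    // triple; only one of them should match the single incoming message.
    auto worker = [](char *buf) {
      MPI_Request req;
      MPI_Irecv(buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &req);
      MPI_Wait(&req, MPI_STATUS_IGNORE);
    };
    std::thread t1(worker, buf1), t2(worker, buf2);
    t1.join(); t2.join();  // the unmatched thread blocks in this sketch
  }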

Thanks,
Guilherme Valarini

On Tue, Jan 26, 2021 at 18:55, Zhou, Hui <zhouh at anl.gov> wrote:

>
>
> > *Question 1*. I am aware of the message envelope tuples (source,
> destination, tag, communicator) that may be used to identify and filter
> different messages. While using MPICH, can I rely on such information in
> order to guarantee that the correct messages will reach/be matched with
> their destinations in a multiprocess multithreaded program?
>
>
>
> Yes, you can rely on the correctness of the MPI implementation. That is,
> if your `rank, tag, communicator` triple matches messages according to
> your intention, then you can trust that a correct implementation will do
> so even in a multi-threaded setting.
>
>
>
> Care must be taken when your `rank, tag, communicator` triple does not
> match uniquely: if you send or receive in different threads, the order of
> matching is not deterministic, and it may surprise you. (For instance, in
> the third scenario below, the two sends from rank 0's threads may match
> rank 1's two receives in either order.) But that is just one of the
> perils of concurrent programming.
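>
> (For concreteness, a common way to sidestep this ambiguity, as the last
> scenario below illustrates, is to encode a per-thread identifier in the
> message tag so that each triple matches uniquely. A minimal sketch, where
> BASE_TAG and thread_id are hypothetical names:)
>
>   // Derive a distinct tag per sender thread so that each (source, tag,
>   // communicator) triple matches exactly one receive.
>   int tag = BASE_TAG + thread_id;
>   MPI_Send(data, 1, MPI_CHAR, dest, tag, comm);
>   // ... the receiver posts its receive with the same derived tag.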
>
>
>
> > *Question 2*. Is there a problem when mixing blocking and non-blocking
> calls on opposite sides of a message? Even in the multithreaded scenarios
> described previously? (e.g. matching an *MPI_Isend* with an *MPI_Recv*,
> and vice-versa.)
>
>
>
> No, there is no problem with that.
>
>
>
> --
> Hui Zhou
>
>
>
>
>
> *From: *Guilherme Valarini via discuss <discuss at mpich.org>
> *Date: *Tuesday, January 26, 2021 at 3:24 PM
> *To: *discuss at mpich.org <discuss at mpich.org>
> *Cc: *Guilherme Valarini <guilherme.a.valarini at gmail.com>
> *Subject: *[mpich-discuss] Questions about MPICH multi-thread support
>
> Dear MPICH community,
>
>
>
> I am currently developing an event system built upon MPI and I have a few
> questions about the current state of multi-thread support from MPICH in the
> *MPI_THREAD_MULTIPLE* mode.
>
>
>
> *Question 1*. I am aware of the message envelope tuples (source,
> destination, tag, communicator) that may be used to identify and filter
> different messages. While using MPICH, can I rely on such information in
> order to guarantee that the correct messages will reach/be matched with
> their destinations in a multiprocess multithreaded program? I know this is
> quite a broad question, but I am more interested in the analysis of the
> following scenarios.
>
>
>
>    - Two threads in a process A want to send different messages to two
>    different processes, B and C (each with one thread), while using the
>    *same communicator and tag*. Code example:
>
>   // One process with N threads to N processes with one thread (same tag)
>   if (rank == 0) {
>     std::thread t1([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
>     });
>     std::thread t2([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 2, 0, MPI_COMM_WORLD);
>     });
>     t1.join(); t2.join();
>   } else {
>     MPI_Recv(data, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>   }
>
>
>    - The same scenario as above, but now each thread uses a *different
>    tag* to communicate with its respective process while sharing the
>    *same communicator*. Code example:
>
>   // One process with N threads to N processes with one thread (different tags)
>   if (rank == 0) {
>     std::thread t1([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
>     });
>     std::thread t2([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 2, 2, MPI_COMM_WORLD);
>     });
>     t1.join(); t2.join();
>   } else {
>     MPI_Recv(data, 1, MPI_CHAR, 0, rank, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>   }
>
>
>    - Multiple threads from a process A each want to send a message to the
>    same thread of another process B using the *same communicator and
>    tag*. Code example:
>
>   // One process with N threads to one process with one thread (same tag)
>   if (rank == 0) {
>     std::thread t1([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
>     });
>     std::thread t2([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
>     });
>     t1.join(); t2.join();
>   } else if (rank == 1) {
>     MPI_Recv(data, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>     // Process data ...
>     MPI_Recv(data, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>     // Process data ...
>   }
>
>
>    - One thread from a process A wants to send a message to one of
>    multiple threads from a process B using the *same communicator and tag*.
>    Code example:
>
>   // One process with one thread to one process with N threads (same tag)
>   if (rank == 0) {
>     MPI_Request requests[2];
>     // Generate data into two separate buffers: each send buffer must stay
>     // untouched until MPI_Waitall completes.
>     MPI_Isend(data1, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &requests[0]);
>     MPI_Isend(data2, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &requests[1]);
>
>     MPI_Waitall(2, requests, MPI_STATUSES_IGNORE);
>   } else if (rank == 1) {
>     std::thread t1([&]() {
>       MPI_Recv(buf1, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>       // Process data ...
>     });
>     std::thread t2([&]() {
>       MPI_Recv(buf2, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>       // Process data ...
>     });
>     t1.join(); t2.join();
>   }
>
>
>    - Two threads from a process A each want to send a message to two
>    threads from a process B using a different tag for each pair of threads
>    (e.g. the pair A.1/B.1 uses a different tag from pair A.2/B.2). Code
>    example:
>
>   // One process with N threads to one process with N threads (different tags)
>   if (rank == 0) {
>     std::thread t1([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
>     });
>     std::thread t2([&]() {
>       // Generate data ...
>       MPI_Send(data, 1, MPI_CHAR, 1, 2, MPI_COMM_WORLD);
>     });
>     t1.join(); t2.join();
>   } else if (rank == 1) {
>     std::thread t1([&]() {
>       MPI_Recv(buf1, 1, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>       // Process data ...
>     });
>     std::thread t2([&]() {
>       MPI_Recv(buf2, 1, MPI_CHAR, 0, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>       // Process data ...
>     });
>     t1.join(); t2.join();
>   }
>
>
>
> *Question 2*. Is there a problem when mixing blocking and non-blocking
> calls on opposite sides of a message? Even in the multithreaded scenarios
> described previously? (e.g. matching an *MPI_Isend* with an *MPI_Recv*,
> and vice-versa.)
>
>
>
> Thank you.
>
>
>
> Regards,
>
> Guilherme Valarini
>
>
>
>
>