[mpich-discuss] In-order messages

Thakur, Rajeev thakur at anl.gov
Mon Mar 23 15:09:15 CDT 2020


The MPI implementation has to ensure that the probe will not return with message B first. It could use sequence numbers internally, for example.
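
As a rough illustration of the sequence-number idea, here is a minimal Python sketch of one sender-to-receiver channel. It is a toy model and not how MPICH is actually implemented; `ReceiverQueue`, `Message`, `network_deliver`, and the `ANY_TAG` stand-in are hypothetical names made up for this example. A message that arrives early is held back until all of its predecessors have arrived, so a probe with any tag cannot see message B before message A.

```python
from collections import namedtuple

ANY_TAG = -1  # hypothetical stand-in for MPI_ANY_TAG in this toy model

Message = namedtuple("Message", ["seq", "tag", "payload"])


class ReceiverQueue:
    """Toy model of the receive side for a single sender and communicator."""

    def __init__(self):
        self._arrived = {}   # seq -> Message, in whatever order the network delivered
        self._visible = []   # messages eligible for matching, kept in send order
        self._next_seq = 0   # sequence number of the next message to release

    def network_deliver(self, msg):
        """Called when a message arrives from the network, possibly out of order."""
        self._arrived[msg.seq] = msg
        # Release every message whose predecessors have all arrived.
        while self._next_seq in self._arrived:
            self._visible.append(self._arrived.pop(self._next_seq))
            self._next_seq += 1

    def iprobe(self, tag=ANY_TAG):
        """Return the earliest-sent visible message that matches, or None."""
        for msg in self._visible:
            if tag == ANY_TAG or msg.tag == tag:
                return msg
        return None

    def recv(self, tag=ANY_TAG):
        """Remove and return the message that iprobe would report, or None."""
        msg = self.iprobe(tag)
        if msg is not None:
            self._visible.remove(msg)
        return msg


q = ReceiverQueue()
a = Message(seq=0, tag=0, payload="data chunk")
b = Message(seq=1, tag=1, payload="ALL FINISHED")

q.network_deliver(b)       # B wins the race through the network
first_probe = q.iprobe()   # B is held back because A (seq 0) has not arrived
q.network_deliver(a)       # A arrives; both messages become visible in send order
order = [q.recv().payload, q.recv().payload]
```

In this model the early probe finds nothing, and the two receives then return A followed by B, regardless of the order in which the network delivered them.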

From: "Hudson, Stephen Tobias P" <shudson at anl.gov>
Date: Monday, March 23, 2020 at 3:04 PM
To: "Thakur, Rajeev" <thakur at anl.gov>, "Larson, Jeffrey M." <jmlarson at anl.gov>, "discuss at mpich.org" <discuss at mpich.org>
Cc: "Navarro, John-Luke Nicolas" <jnavarro at anl.gov>
Subject: Re: [mpich-discuss] In-order messages

Is it undefined where the matching happens?

So the question is:

Process 1 sends message A and then message B, both to process 0. Message B travels faster through the network and is received first by process 0. At this point an iprobe on process 0 occurs with MPI_ANY_TAG. My thought is that it finds message B (which happens to say - ALL FINISHED). A follow-up MPI_Recv will say all is finished. Is this possible?

I've assumed here that the probe is local and so the matching with message A has not yet happened.

The given snippet tells me that a probe will return if there is a matching send, but in this case there are two matching sends.
________________________________
From: Thakur, Rajeev <thakur at anl.gov>
Sent: Monday, March 23, 2020 2:41 PM
To: Hudson, Stephen Tobias P <shudson at anl.gov>; Larson, Jeffrey M. <jmlarson at anl.gov>; discuss at mpich.org <discuss at mpich.org>
Cc: Navarro, John-Luke Nicolas <jnavarro at anl.gov>
Subject: Re: [mpich-discuss] In-order messages


Pending means not already matched with another message.



This is what the standard says about probe:



“The MPI implementation of MPI_PROBE and MPI_IPROBE needs to guarantee progress: if a call to MPI_PROBE has been issued by a process, and a send that matches the probe has been initiated by some process, then the call to MPI_PROBE will return, unless the message is received by another concurrent receive operation (that is executed by another thread at the probing process). Similarly, if a process busy waits with MPI_IPROBE and a matching message has been issued, then the call to MPI_IPROBE will eventually return flag = true unless the message is received by another concurrent receive operation or matched by a concurrent matched probe.”



The eager protocol, or any other protocol, has to support the semantics specified by the standard.







From: "Hudson, Stephen Tobias P" <shudson at anl.gov>
Date: Monday, March 23, 2020 at 2:27 PM
To: "Thakur, Rajeev" <thakur at anl.gov>, "Larson, Jeffrey M." <jmlarson at anl.gov>, "discuss at mpich.org" <discuss at mpich.org>
Cc: "Navarro, John-Luke Nicolas" <jnavarro at anl.gov>
Subject: Re: [mpich-discuss] In-order messages



The point where I am still a bit unclear is whether "pending" refers to something at the sender's end or in some buffer at the receiver's end. Perhaps an answer to this will help me. Is MPI_Probe/MPI_Iprobe a local operation or does it actually communicate with the sender? Also, what effect does the 'eager' protocol have on this?

________________________________

From: Thakur, Rajeev <thakur at anl.gov>
Sent: Thursday, March 12, 2020 6:30 PM
To: Hudson, Stephen Tobias P <shudson at anl.gov>; Larson, Jeffrey M. <jmlarson at anl.gov>; discuss at mpich.org <discuss at mpich.org>
Cc: Navarro, John-Luke Nicolas <jnavarro at anl.gov>
Subject: Re: [mpich-discuss] In-order messages



The paragraph in Sec 3.5 (reproduced below) talks about the case when two sends match one receive and another case where two receives match one send.  In the first case, the receive can get data only from the first send. In the second case, only the first receive can get the data from the send. That is what is meant by non-overtaking.



“Messages are non-overtaking: If a sender sends two messages in succession to the same destination, and both match the same receive, then this operation cannot receive the second message if the first one is still pending. If a receiver posts two receives in succession, and both match the same message, then the second receive operation cannot be satisfied by this message, if the first one is still pending. This requirement facilitates matching of sends to receives.”







From: "Hudson, Stephen Tobias P" <shudson at anl.gov>
Date: Thursday, March 12, 2020 at 4:09 PM
To: "Thakur, Rajeev" <thakur at anl.gov>, "Larson, Jeffrey M." <jmlarson at anl.gov>, "discuss at mpich.org" <discuss at mpich.org>
Cc: "Navarro, John-Luke Nicolas" <jnavarro at anl.gov>
Subject: Re: [mpich-discuss] In-order messages



Paraphrasing the discussion so far, my understanding is:



Note: I am only thinking of the case where both messages have the same tag and go to the same destination processor.



Section 3.5 of the MPI standard mentions: "Messages are non-overtaking".



I think this must mean that once the messages are in the receive buffer, one cannot overtake the one in front. It is trying to make the point that your MPI_Recv cannot reach over message 1 and say "I want message 2" (the buffer is FIFO). But this is only in the receive buffer.



However, the order the messages are received into the buffer (from the network) cannot be guaranteed to be the same as the order they were sent.



Is this correct?



Many Thanks,



Steve



________________________________

From: Thakur, Rajeev <thakur at anl.gov>
Sent: Thursday, March 12, 2020 1:18 PM
To: Larson, Jeffrey M. <jmlarson at anl.gov>; discuss at mpich.org <discuss at mpich.org>
Cc: Navarro, John-Luke Nicolas <jnavarro at anl.gov>; Hudson, Stephen Tobias P <shudson at anl.gov>
Subject: Re: [mpich-discuss] In-order messages



Matching specifies which buffer the incoming message goes into. Two sends issued in order with the same tag and destination will match two receives issued in order with the same tag or MPI_ANY_TAG at the destination. The two sends could take different paths through the network, one more congested than the other, and hence reach the destination out of order. But they will get placed in the intended buffers.
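
The distinction between matching order and completion order can be sketched with a small Python model. This is only an illustration under simplifying assumptions (one sender, one receiver, receives posted before the data arrives); `match`, `complete`, and the `ANY_TAG` stand-in are hypothetical helpers, not MPI calls. Matching pairs sends and receives by the order in which they were issued, while the data may land in the buffers in any order.

```python
ANY_TAG = -1  # hypothetical stand-in for MPI_ANY_TAG in this toy model


def match(sends, recv_tags):
    """Pair each send, in issue order, with the earliest free matching receive.

    sends      -- list of (tag, payload) in the order the sender issued them
    recv_tags  -- list of tags in the order the receiver posted its receives
    Returns a dict mapping send index -> receive index.
    """
    matched = {}
    free = list(range(len(recv_tags)))
    for s_idx, (tag, _payload) in enumerate(sends):
        for r_idx in free:
            if recv_tags[r_idx] in (ANY_TAG, tag):
                matched[s_idx] = r_idx
                free.remove(r_idx)
                break
    return matched


def complete(sends, matched, arrival_order):
    """Fill receive buffers as data arrives; arrival order may differ from send order."""
    buffers = {}
    completion_log = []
    for s_idx in arrival_order:
        r_idx = matched[s_idx]
        buffers[r_idx] = sends[s_idx][1]   # data lands in the buffer chosen by matching
        completion_log.append(r_idx)
    return buffers, completion_log


sends = [(0, "A"), (0, "B")]       # two sends, same tag, issued in this order
recv_tags = [0, ANY_TAG]           # two receives posted in this order

matched = match(sends, recv_tags)  # send 0 -> receive 0, send 1 -> receive 1

in_order_buffers, in_order_log = complete(sends, matched, arrival_order=[0, 1])
swapped_buffers, swapped_log = complete(sends, matched, arrival_order=[1, 0])
```

With either arrival order the first receive buffer ends up holding "A" and the second "B"; only the order in which the two receives complete changes.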



Rajeev





From: "Larson, Jeffrey M." <jmlarson at anl.gov>
Date: Thursday, March 12, 2020 at 1:12 PM
To: "Thakur, Rajeev" <thakur at anl.gov>, "discuss at mpich.org" <discuss at mpich.org>
Cc: "Navarro, John-Luke Nicolas" <jnavarro at anl.gov>, "Hudson, Stephen Tobias P" <shudson at anl.gov>
Subject: Re: [mpich-discuss] In-order messages



On 3/12/20 1:04 PM, Thakur, Rajeev wrote:

The standard specifies the order in which messages match, not complete. They can complete in any order.



Rajeev





From: "Larson, Jeffrey M. via discuss" <discuss at mpich.org>
Reply-To: "discuss at mpich.org" <discuss at mpich.org>
Date: Thursday, March 12, 2020 at 12:46 PM
To: "discuss at mpich.org" <discuss at mpich.org>
Cc: "Larson, Jeffrey M." <jmlarson at anl.gov>, "Navarro, John-Luke Nicolas" <jnavarro at anl.gov>, "Hudson, Stephen Tobias P" <shudson at anl.gov>
Subject: [mpich-discuss] In-order messages



Hello MPICH friends,



Consider the simple two-rank MPI scenario:

  *   Rank 1 is doing calculations and giving chunks of data to rank 0 using nonblocking sends and tag=0.
  *   When rank 1 is finished, it will send its last data (or no data) with tag=1.
  *   Rank 0 is using probes to see when data is ready to be received. It receives with any tag, knowing when to stop receiving, or give new data when a tag=0 is received.

Is it possible that rank 0 receives a tag=1 message when there are outstanding tag=0 messages?



Looking at section 3.5 of the MPI standard lets me know that



"Messages are non-overtaking: If a sender sends two messages in succession to the same destination, and both match the same receive, then this operation cannot receive the second message if the first one is still pending."



But I'm not sure if this applies to the above case. Is anytag "the same receive"?

If rank 1 puts data in its buffer, doesn't the network have to be used to communicate that to the buffer of rank 0?

While rank 1 is putting data into its buffer in order, is it possible that a tiny tag=1 message is registered in the rank 0 buffer before a massive tag=0 message?



Thank you for your help,

Jeff

Hi Rajeev,

What is the difference between matching and completing?

Does this mean that the quote from the standard that I gave means the messages between two fixed ranks could overtake each other? That is, they "match" but don't "complete" in the same order?