[mpich-discuss] Isend and Recv

Junchao Zhang jczhang at mcs.anl.gov
Thu Nov 6 10:55:38 CST 2014


My understanding is that:  In eager mode, sender does not need a Ack
message from receiver (to know recv buf is ready), but in rendezvous mode,
it does.
In your case, if it is in rendezvous mode, Isend() issues the request. But
since it is a nonblocking call, before sender gets the Ack message, it goes
into sleep. The asynchronous progress engine does not progress until
MPI_Wait() is called. At that time, the Ack message is got, and message
passing happens. Therefore, MPI_Recv() finishes after MPI_Wait().

--Junchao Zhang

On Thu, Nov 6, 2014 at 10:33 AM, Zhen Wang <toddwz at gmail.com> wrote:

> Junchao,
>
> Thanks for your reply. It works! I digged deeper into the eager and
> rendezvous modes, and got confused..
>
> The docs I read were
> https://computing.llnl.gov/tutorials/mpi_performance/#Protocols and
> http://www-01.ibm.com/support/knowledgecenter/SSFK3V_1.3.0/com.ibm.cluster.pe.v1r3.pe400.doc/am106_eagermess.htm
> .
>
> My understanding is:
>
> In eager mode, the receiver allocates a buffer and an additional copy is
> required to copy the data from the buffer to user allocated space. This is
> good for small messages, not for large ones.
>
> In rendezvous mode, the user allocated space on the receiver must be ready
> to receive the data. But if the space is ready, the MPI_Recv() should be
> completed almost immediately (there's a handshake between the send and
> receiver, and a copy operation).
>
> If I was right, in my code example, MPI_Recv() should finish after
> MPI_Isend() no matter in eager or rendezvous modes. The memory has been
> allocated, and the handshake should take like no time. Am I missing
> something?
>
> Thanks a lot.
>
>
> Best regards,
> Zhen
>
> On Wed, Nov 5, 2014 at 2:28 PM, Junchao Zhang <jczhang at mcs.anl.gov> wrote:
>
>> I think it is because MPICH just crosses over the eager to rendezvous
>> mode threshold, when  n goes from 9999 to 99999.  OpenMPI certainly uses a
>> different threshold than MPICH.
>> When you install MPICH, a utility program mpivars is also installed. Type
>> 'mpivars | grep EAGER',  you will get default values for various eager
>> thresholds.
>>
>> In your case, export MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE=5000000 and you
>> will get the same result as OpenMPI.
>>
>> --Junchao Zhang
>>
>> On Wed, Nov 5, 2014 at 11:20 AM, Zhen Wang <toddwz at gmail.com> wrote:
>>
>>> Hi MPIers,
>>>
>>> I have some questions regarding MPI_Isend() and MPI_Recv(). MPICH-3.1.3
>>> is used to compile and run the attached code on Red Hat Enterprise Linux
>>> 6.3 (a shared memory machine). While n = 9999, the MPI_Recv() finishes
>>> immediately after MPI_Isend(): (This is what I understand and expect)
>>>
>>> MPI 1: Recv started at 09:53:53.
>>> MPI 0: Isend started at 09:53:53.
>>> MPI 1: Recv finished at 09:53:53.
>>> MPI 0: Isend finished at 09:53:58.
>>>
>>> When n = 99999, I get the following. The MPI_Recv() finishes after
>>> MPI_Wait():
>>>
>>> MPI 1: Recv started at 09:47:56.
>>> MPI 0: Isend started at 09:47:56.
>>> MPI 0: Isend finished at 09:48:01.
>>> MPI 1: Recv finished at 09:48:01.
>>>
>>> But with OpenMPI 1.8 and n = 99999, MPI_Recv() finishes immediately
>>> after MPI_Isend():
>>>
>>> MPI 0: Isend started at 09:55:28.
>>> MPI 1: Recv started at 09:55:28.
>>> MPI 1: Recv finished at 09:55:28.
>>> MPI 0: Isend finished at 09:55:33.
>>>
>>> Am I misunderstanding something here? In case the attached code is
>>> dropped, the code is included. Thanks in advance.
>>>
>>>
>>> #include "mpi.h"
>>> #include <unistd.h>
>>> #include <stdio.h>
>>> #include "vector"
>>> #include <time.h>
>>>
>>> int main(int argc, char* argv[])
>>> {
>>>   MPI_Init(&argc, &argv);
>>>
>>>   int rank;
>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>
>>>   int n = 9999;
>>>   std::vector<int> vec(n);
>>>   MPI_Request mpiRequest;
>>>   MPI_Status mpiStatus;
>>>   char tt[9] = {0};
>>>
>>>   MPI_Barrier(MPI_COMM_WORLD);
>>>
>>>   if (rank == 0)
>>>   {
>>>     MPI_Isend(&vec[0], n, MPI_INT, 1, 0, MPI_COMM_WORLD, &mpiRequest);
>>>     time_t t = time(0);
>>>     strftime(tt, 9, "%H:%M:%S", localtime(&t));
>>>     printf("MPI %d: Isend started at %s.\n", rank, tt);
>>>
>>>     //int done = 0;
>>>     //while (done == 0)
>>>     //{
>>>     //  MPI_Test(&mpiRequest, &done, &mpiStatus);
>>>     //}
>>>     sleep(5);
>>>     MPI_Wait(&mpiRequest, &mpiStatus);
>>>
>>>     t = time(0);
>>>     strftime(tt, 9, "%H:%M:%S", localtime(&t));
>>>     printf("MPI %d: Isend finished at %s.\n", rank, tt);
>>>   }
>>>   else
>>>   {
>>>     time_t t = time(0);
>>>     strftime(tt, 9, "%H:%M:%S", localtime(&t));
>>>     printf("MPI %d: Recv started at %s.\n", rank, tt);
>>>
>>>     MPI_Recv(&vec[0], n, MPI_INT, 0, 0, MPI_COMM_WORLD, &mpiStatus);
>>>
>>>     t = time(0);
>>>     strftime(tt, 9, "%H:%M:%S", localtime(&t));
>>>     printf("MPI %d: Recv finished at %s.\n", rank, tt);
>>>   }
>>>
>>>   MPI_Finalize();
>>>
>>>   return 0;
>>> }
>>>
>>>
>>>
>>> Best regards,
>>> Zhen
>>>
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>
>>
>>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>
>
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20141106/c019fd3f/attachment.html>


More information about the discuss mailing list