[mpich-discuss] Failed to allocate memory for an unexpected message

Luiz Carlos da Costa Junior lcjunior at ufrj.br
Wed Mar 12 12:40:30 CDT 2014


Dear Kenneth,

Thanks for your quick reply.
I tested your suggestion and, unfortunately, this approach didn't work.
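
For reference, this is roughly what I tried, following your suggestion
(a sketch only; M_RECCSV and MY_COMM are the "data" tag and the
communicator from my code quoted below):

      logical   m_flag
      integer*4 m_stat(MPI_STATUS_SIZE), m_ierr

c     After counting the last end_of_processing message, test whether
c     any "data" messages are still pending before leaving the loop.
      call MPI_IPROBE( MPI_ANY_SOURCE, M_RECCSV, MY_COMM,
     .                 m_flag, m_stat, m_ierr )
      if( m_flag ) then
c       a matching message is pending: receive it, then test again
      end if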

Question: when I call MPI_IPROBE, does it also report messages that have
already been received asynchronously (i.e., matched by one of my MPI_IRECVs)?

Is there any way to know, for my list of requests (from my MPI_IRECVs),
which ones are still open and which ones have already received a message?
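
Something like the following is what I am after (a rough sketch; m_req
and n_req are illustrative names for my array of MPI_IRECV requests and
its size):

      integer*4 n_done                       ! completed this call
      integer*4 idx_done(n_req)              ! 1-based indices (Fortran)
      integer*4 m_stat(MPI_STATUS_SIZE,n_req)
      integer*4 m_ierr, i

c     MPI_TESTSOME returns immediately; entries of m_req that have
c     completed are set to MPI_REQUEST_NULL, the others remain open.
      call MPI_TESTSOME( n_req, m_req, n_done, idx_done,
     .                   m_stat, m_ierr )

      do i = 1, n_done
        write(*,*) 'request ', idx_done(i), ' has a message'
      end do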

Regards,


On 11 March 2014 17:00, Kenneth Raffenetti <raffenet at mcs.anl.gov> wrote:

> You could use MPI_Probe/MPI_Iprobe and pass in your "data" tag to test for
> any more pending messages.
>
> Ken
>
>
> On 03/11/2014 02:50 PM, Luiz Carlos da Costa Junior wrote:
>
>> Dear all,
>>
>> I am all set with your suggestions and my program has been working
>> quite well since then, without any "unexpected message" errors.
>> Thanks again.
>>
>> However, I am now facing a small problem, which I describe next.
>>
>> My receiver process actually receives 2 types of messages (2 different
>> tags):
>>
>>   * the basic tag means the message is a "data message" ("data" tag)
>>     that should be processed;
>>   * the second tag means that the worker process is done and will
>>     send no more messages ("end_of_processing" tag).
>>
>> Once all worker processes have sent their end_of_processing message,
>> the receiver process finishes its execution.
>>
>> The problem I noticed is that some of the last messages sent by the
>> worker processes were not being processed. I think this is related to
>> the logic I am using with MPI_WAITANY in the receiver process. I simply
>> count the number of end_of_processing messages received and, when it
>> reaches the number of worker processes, I finish execution without
>> checking whether more messages are still waiting among the outstanding
>> requests.
>>
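>> In pseudo-Fortran, my receiver loop is roughly the following
>> (illustrative sketch, not my exact code; M_EOP stands for the
>> end_of_processing tag):
>>
>>       n_done = 0
>>       do while( n_done .lt. n_workers )
>>         call MPI_WAITANY( n_req, m_req, m_idx, m_stat, m_ierr )
>>         if( m_stat(MPI_TAG) .eq. M_EOP ) then
>>           n_done = n_done + 1
>>         else
>>           <PROCESS DATA MESSAGE>
>>         end if
>>       end do
>> c     the loop exits here even if other entries of m_req have
>> c     messages that are never examined
>>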
>> Since the order in which requests complete is irrelevant to
>> MPI_WAITANY, I think my logic leaves some of the last messages in the
>> queue unprocessed. Is this right?
>>
>> Is there any way to check whether any pending requests remain to be
>> processed?
>>
>> Best regards,
>> Luiz
>>
>>
>> On 16 January 2014 16:57, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
>>
>>
>>     Profiling both of your codes would help us understand where the
>>     time is spent and how they differ in performance.
>>
>>        Antonio
>>
>>
>>
>>     On 01/16/2014 12:47 PM, Luiz Carlos da Costa Junior wrote:
>>
>>>     Yes, I am comparing original vs. new implementation.
>>>
>>>     The original implementation is as follows.
>>>
>>>     c ------------------------------------------------------------------
>>>           subroutine my_receiver_original
>>>     c ------------------------------------------------------------------
>>>           (...)
>>>
>>>     c     Local
>>>     c     -----
>>>           integer*4 m_stat(MPI_STATUS_SIZE)
>>>           character card*(zbuf)      ! buffer for messages received
>>>
>>>           do while( keep_receiving )
>>>             call MPI_RECV( card, zbuf, MPI_CHARACTER,
>>>          .                 MPI_ANY_SOURCE, M_RECCSV, MY_COMM,
>>>          .                 m_stat, m_ierr )
>>>
>>>     c       Process message: disk IO
>>>     c       ------------------------
>>>             <DO SOMETHING>
>>>             if( SOMETHING_ELSE ) then
>>>               keep_receiving = .false.
>>>             end if
>>>           end do
>>>
>>>           (...)
>>>
>>>           return
>>>           end
>>>
>>>     Regards,
>>>     Luiz
>>>
>>>     On 16 January 2014 16:19, Balaji, Pavan <balaji at mcs.anl.gov> wrote:
>>>
>>>
>>>         On Jan 16, 2014, at 12:16 PM, Luiz Carlos da Costa Junior
>>>         <lcjunior at ufrj.br> wrote:
>>>         > No, these failures don't occur all the time. I have a
>>>         successful run (with my original implementation) which I am
>>>         using as the base case for comparison.
>>>
>>>         What are the two cases you are comparing?  Original
>>>         implementation vs. new implementation?  What's the original
>>>         implementation?
>>>
>>>           -- Pavan
>>>
>>
>>
>>     --
>>     Antonio J. Peña
>>     Postdoctoral Appointee
>>     Mathematics and Computer Science Division
>>     Argonne National Laboratory
>>     9700 South Cass Avenue, Bldg. 240, Of. 3148
>>     Argonne, IL 60439-4847
>>     apenya at mcs.anl.gov
>>     www.mcs.anl.gov/~apenya
>>
>>

