[mpich-discuss] Failed to allocate memory for an unexpected message

Kenneth Raffenetti raffenet at mcs.anl.gov
Tue Mar 11 15:00:50 CDT 2014


You could use MPI_Probe/MPI_Iprobe and pass in your "data" tag to test 
for any more pending messages.
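
For instance, here is a rough, untested sketch of such a drain loop in
the same fixed-form style as your receiver. It assumes zbuf, M_RECCSV
and MY_COMM are visible here the same way they are in your subroutine
(e.g. through the same includes or common blocks), and the subroutine
name is just for illustration:

c-----------------------------------------------------------------------
      subroutine drain_pending_data
c-----------------------------------------------------------------------
      include 'mpif.h'
      (...)

c     Local
c     -----
      integer*4 m_stat(MPI_STATUS_SIZE)
      integer*4 m_ierr
      logical   pending
      character card*(zbuf)           ! same buffer size as the receiver

c     Probe for "data" messages until none are left pending
c     ------------------------------------------------------
      pending = .true.
      do while( pending )
        call MPI_IPROBE( MPI_ANY_SOURCE, M_RECCSV, MY_COMM,
     .                   pending, m_stat, m_ierr )
        if( pending ) then
c         A data message is still waiting: receive it from the probed
c         source and process it exactly as in the normal receive loop
          call MPI_RECV( card, zbuf, MPI_CHARACTER,
     .                   m_stat(MPI_SOURCE), M_RECCSV, MY_COMM,
     .                   m_stat, m_ierr )
          <PROCESS MESSAGE>
        end if
      end do

      return
      end

You would run something like this right after the last end_of_processing
message has been counted, before the receiver shuts down.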

Ken

On 03/11/2014 02:50 PM, Luiz Carlos da Costa Junior wrote:
> Dear all,
>
> I am all set with your suggestions, and my program has been working quite
> well without any "unexpected message" errors since then. Thanks again.
>
> However, I am now facing a small problem, which I describe below.
>
> My receiver process actually receives two types of messages (two
> different tags):
>
>   * the basic tag means that the message is a "data message" ("data"
>     tag) that should be processed.
>   * the second one means that the worker process is done and it will
>     send no more messages ("end_of_processing" tag).
>
> Once all worker processes send their end_of_processing tag, the receiver
> process finishes its execution.
>
> The problem I noticed is that some of the last messages sent by the
> worker processes were not being processed. I think the problem is
> related to the logic I am using with MPI_WAITANY in the receiver
> process. I simply count the number of end_of_processing messages
> received and, once it reaches the number of worker processes, I finish
> the execution without checking whether there are more messages still to
> be received in the MPI_WAITANY queue.
>
> As the order in which messages arrive is not relevant for MPI_WAITANY,
> I think my logic leaves some of the messages at the end of the queue
> unprocessed. Is this right?
>
> Is there any way to check if there is any pending request to be processed?
>
> Best regards,
> Luiz
>
>
> On 16 January 2014 16:57, "Antonio J. Peña" <apenya at mcs.anl.gov
> <mailto:apenya at mcs.anl.gov>> wrote:
>
>
>     Profiling both of your codes would help us understand where the
>     time is spent and where the performance difference between them
>     comes from.
>
>        Antonio
>
>
>
>     On 01/16/2014 12:47 PM, Luiz Carlos da Costa Junior wrote:
>>     Yes, I am comparing the original vs. the new implementation.
>>
>>     The original implementation is as follows.
>>
>>     c-----------------------------------------------------------------------
>>           subroutine my_receiver_original
>>     c ------------------------------------------------------------------
>>           (...)
>>
>>     c     Local
>>     c     -----
>>           integer*4 m_stat(MPI_STATUS_SIZE)
>>           integer*4 m_ierr                 ! MPI error return code
>>           logical   keep_receiving         ! receive-loop control flag
>>           character card*(zbuf)      ! buffer for messages received
>>
>>           keep_receiving = .true.
>>           do while( keep_receiving )
>>             call MPI_RECV(card, zbuf, MPI_CHARACTER,
>>          .  MPI_ANY_SOURCE, M_RECCSV, MY_COMM,
>>          .  m_stat, m_ierr )
>>
>>     c       Process message: disk IO
>>     c ---------------
>>             <DO SOMETHING>
>>             if( SOMETHING_ELSE ) then
>>               keep_receiving = .false.
>>             end if
>>           end do
>>
>>           (...)
>>
>>           return
>>           end
>>
>>     Regards,
>>     Luiz
>>
>>     On 16 January 2014 16:19, Balaji, Pavan <balaji at mcs.anl.gov
>>     <mailto:balaji at mcs.anl.gov>> wrote:
>>
>>
>>         On Jan 16, 2014, at 12:16 PM, Luiz Carlos da Costa Junior
>>         <lcjunior at ufrj.br <mailto:lcjunior at ufrj.br>> wrote:
>>         > No, these failures don't occur all the time. I have a
>>         successful run (with my original implementation) which I am
>>         using as the base case for comparison.
>>
>>         What are the two cases you are comparing?  Original
>>         implementation vs. new implementation?  What’s the original
>>         implementation?
>>
>>           — Pavan
>>
>
>
>     --
>     Antonio J. Peña
>     Postdoctoral Appointee
>     Mathematics and Computer Science Division
>     Argonne National Laboratory
>     9700 South Cass Avenue, Bldg. 240, Of. 3148
>     Argonne, IL 60439-4847
>     apenya at mcs.anl.gov  <mailto:apenya at mcs.anl.gov>
>     www.mcs.anl.gov/~apenya  <http://www.mcs.anl.gov/~apenya>
>
>
>
>
>
>
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>


