[mpich-discuss] Failed to allocate memory for an unexpected message

Luiz Carlos da Costa Junior lcjunior at ufrj.br
Tue Mar 11 14:50:03 CDT 2014


Dear all,

I am all set with your suggestions, and my program has been working quite
well since then, without any "unexpected message" errors. Thanks again.

However, I am now facing a small problem, which I describe below.

My receiver process actually receives two types of messages (two different
tags):

   - the basic tag means that the message is a "data message" ("data" tag)
   that should be processed.
   - the second one means that the worker process is done and it will send
   no more messages ("end_of_processing" tag).

Once all worker processes send their end_of_processing tag, the receiver
process finishes its execution.

The problem I noticed is that some of the last messages sent by the worker
processes were not being processed. I think the problem is related to the
logic I am using with MPI_WAITANY in the receiver process. I simply count
the number of end_of_processing messages received, and when it reaches the
number of worker processes, I finish the execution without checking whether
there are more messages to be received in the MPI_WAITANY queue.

Since MPI_WAITANY does not return completed requests in the order in which
the messages arrived, I think that my logic drops some of the messages that
are still at the end of the queue. Is this right?

Is there any way to check if there is any pending request to be processed?
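
One way to make the termination independent of the order in which
MPI_WAITANY returns requests is sketched below. This is only a sketch under
assumptions that are not taken from the actual program: the receives are
pre-posted as a pool with MPI_ANY_SOURCE and MPI_ANY_TAG, every message is a
single integer, and the end_of_processing message carries the number of data
messages that the worker sent. The names drain_demo, NREQ, NMSG, TAG_DATA and
TAG_END are hypothetical. The receiver stops only when all end markers have
arrived and the number of data messages processed matches the announced total.

```fortran
      program drain_demo
      implicit none
      include 'mpif.h'
c     Hypothetical sizes and tags, chosen only for this sketch
      integer NREQ, TAG_DATA, TAG_END, NMSG
      parameter (NREQ = 4, TAG_DATA = 1, TAG_END = 2, NMSG = 10)
      integer ierr, rank, nproc, i, idx, nsent
      integer n_end, n_data, n_expect
      integer buf(NREQ), req(NREQ), stat(MPI_STATUS_SIZE)

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)

      if (rank .ne. 0) then
c       Worker: send NMSG data messages, then one end marker that
c       carries the number of data messages sent.
        do i = 1, NMSG
          call MPI_SEND(i, 1, MPI_INTEGER, 0, TAG_DATA,
     .                  MPI_COMM_WORLD, ierr)
        end do
        nsent = NMSG
        call MPI_SEND(nsent, 1, MPI_INTEGER, 0, TAG_END,
     .                MPI_COMM_WORLD, ierr)
      else
c       Receiver: pre-post a pool of receives for any source and tag.
        do i = 1, NREQ
          call MPI_IRECV(buf(i), 1, MPI_INTEGER, MPI_ANY_SOURCE,
     .                   MPI_ANY_TAG, MPI_COMM_WORLD, req(i), ierr)
        end do
        n_end = 0
        n_data = 0
        n_expect = 0
c       Stop only when every worker has announced its total AND that
c       many data messages have actually been processed.
        do while (n_end .lt. nproc-1 .or. n_data .lt. n_expect)
          call MPI_WAITANY(NREQ, req, idx, stat, ierr)
          if (stat(MPI_TAG) .eq. TAG_END) then
            n_end = n_end + 1
            n_expect = n_expect + buf(idx)
          else
c           A data message: the real processing would go here.
            n_data = n_data + 1
          end if
c         Re-post the slot that has just completed.
          call MPI_IRECV(buf(idx), 1, MPI_INTEGER, MPI_ANY_SOURCE,
     .                   MPI_ANY_TAG, MPI_COMM_WORLD, req(idx), ierr)
        end do
c       Every message sent has been counted, so the receives still
c       posted are idle: cancel them and complete the requests.
        do i = 1, NREQ
          call MPI_CANCEL(req(i), ierr)
          call MPI_WAIT(req(i), stat, ierr)
        end do
        print *, 'data processed:', n_data, ' expected:', n_expect
      end if

      call MPI_FINALIZE(ierr)
      end
```

Built with an MPI Fortran compiler wrapper and run on at least two
processes, rank 0 should report equal processed and expected counts.
MPI_TESTANY can also be used to poll for a completed request without
blocking, but a request that has not completed yet is not visible to it,
which is why the sketch counts messages instead.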

Best regards,
Luiz


On 16 January 2014 16:57, "Antonio J. Peña" <apenya at mcs.anl.gov> wrote:

>
> Profiling both of your codes would help to understand where the time
> is spent and how they differ in terms of performance.
>
>   Antonio
>
>
>
> On 01/16/2014 12:47 PM, Luiz Carlos da Costa Junior wrote:
>
> Yes, I am comparing the original vs. the new implementation.
>
> The original implementation is as follows.
>
> c-----------------------------------------------------------------------
>       subroutine my_receiver_original
> c     ------------------------------------------------------------------
>       (...)
>
> c     Local
> c     -----
>       integer*4 m_stat(MPI_STATUS_SIZE)
>       character card*(zbuf)      ! buffer for messages received
>
>       do while( keep_receiving )
>         call MPI_RECV(card, zbuf, MPI_CHARACTER,
>      .                MPI_ANY_SOURCE, M_RECCSV, MY_COMM,
>      .                m_stat, m_ierr )
>
> c       Process message: disk IO
> c       ---------------
>         <DO SOMETHING>
>         if( SOMETHING_ELSE ) then
>           keep_receiving = .false.
>         end if
>       end do
>
>       (...)
>
>       return
>       end
>
> Regards,
> Luiz
>
> On 16 January 2014 16:19, Balaji, Pavan <balaji at mcs.anl.gov> wrote:
>
>>
>> On Jan 16, 2014, at 12:16 PM, Luiz Carlos da Costa Junior <
>> lcjunior at ufrj.br> wrote:
>> > No, these failures don't occur all the time. I have a successful run
>> > (with my original implementation) which I am using as the base case for
>> > comparison.
>>
>> What are the two cases you are comparing?  Original implementation vs.
>> new implementation?  What's the original implementation?
>>
>>   -- Pavan
>>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>
> --
> Antonio J. Peña
> Postdoctoral Appointee
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 9700 South Cass Avenue, Bldg. 240, Of. 3148
> Argonne, IL 60439-4847
> apenya at mcs.anl.gov
> www.mcs.anl.gov/~apenya