[mpich-discuss] Failed to allocate memory for an unexpected message

Wed Oct 23 14:42:15 CDT 2013

Hi Luiz,

Your error trace indicates that the receiver ran out of memory because
too many eager unexpected messages (261,895) were queued, i.e., small
messages that arrived before a matching receive had been posted. Whenever
this happens, the receiver allocates a temporary buffer to hold the
received message. These buffers exhausted the memory available on the
machine where the receiver was running.

To avoid this, try to pre-post the receives before the messages arrive. A
message that matches an already posted receive needs no temporary buffer,
so this is also far more efficient. For example, you could post one
MPI_Irecv per worker in your writer process and handle each message as
MPI_Waitany returns it. You may also consider having multiple writer
processes if your use case permits and the volume of received messages is
too high to be processed by a single writer.

  Antonio

On Wednesday, October 23, 2013 05:27:27 PM Luiz Carlos da Costa Junior 
wrote:

Hi,

I am getting the following error when running my parallel application:

MPI_Recv(186)......................: MPI_Recv(buf=0x125bd840, count=2060, 
MPI_CHARACTER, src=24, tag=94, comm=0x84000002, status=0x125fcff0) 
failed 
MPIDI_CH3I_Progress(402)...........:  
MPID_nem_mpich2_blocking_recv(905).:  
MPID_nem_tcp_connpoll(1838)........:  
state_commrdy_handler(1676)........:  
MPID_nem_tcp_recv_handler(1564)....:  
MPID_nem_handle_pkt(636)...........:  
MPIDI_CH3_PktHandler_EagerSend(606): Failed to allocate memory for an 
unexpected message. 261895 unexpected messages queued. 
Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(173)..............: MPI_Send(buf=0x765d2e60, count=2060, 
MPI_CHARACTER, dest=0, tag=94, comm=0x84000004) failed 
MPID_nem_tcp_connpoll(1826): Communication error with rank 1: 
Connection reset by peer 

I went to MPICH's FAQ
(http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_am_I_getting_so_many_unexpected_messages.3F [1]).
It says that most likely the receiver process cannot keep up with the high
number of messages it is receiving.

In my application, the worker processes perform a very large number of
small computations and, after each computation is complete, they send
the data to a special "writer" process that is responsible for writing the
output to disk.
This scheme used to work reasonably well, until we faced some
new data with larger parameters that caused the problem above.

Even though we could redesign the application, for example by creating a
pool of writer processes, we still have only one hard disk, so the
bottleneck would not be removed. So this doesn't seem to be a good approach.

As far as I understand, MPICH saves the content of every MPI_SEND in an
internal buffer (I don't know where the buffer is located, sender or
receiver?) so that the sender can continue computing while the
messages are being received.
The problem is that this buffer has been exhausted due to some resource
limitation.

Having a buffer is very useful, but if the buffer in the writer process
is close to its limit, the worker processes should stop and wait until it
frees some space before sending new data to be written to disk.

Is it possible to check this buffer in MPICH? Or is it possible to check the 
number of messages to be received?
Can anyone suggest a better (easy to implement) solution?

Thanks in advance.

Regards,

Luiz

--------
[1] http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_am_I_getting_so_many_unexpected_messages.3F