<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body data-blackberry-caret-color="#00a8df" style="background-color: rgb(255, 255, 255); line-height: initial;">
<div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">
Ok Antonio, I know what you're talking about.</div>
<div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">
I will try this and I hope to solve it!</div>
<div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">
Thanks anyway to all of you :)</div>
<div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">
Sent from my BlackBerry 10 smartphone.</div>
<table width="100%" style="background-color:white;border-spacing:0px;">
<tbody>
<tr>
<td colspan="2" style="font-size: initial; text-align: initial; background-color: rgb(255, 255, 255);">
<div id="_persistentHeader" style="border-style: solid none none; border-top-color: rgb(181, 196, 223); border-top-width: 1pt; padding: 3pt 0in 0in; font-family: Tahoma, 'BB Alpha Sans', 'Slate Pro'; font-size: 10pt;">
<div><b>From: </b>discuss-request@mpich.org</div>
<div><b>Sent: </b>Wednesday, 23 October 2013 23:22</div>
<div><b>To: </b>discuss@mpich.org</div>
<div><b>Reply To: </b>discuss@mpich.org</div>
<div><b>Subject: </b>discuss Digest, Vol 12, Issue 15</div>
</div>
</td>
</tr>
</tbody>
</table>
<div style="border-style: solid none none; border-top-color: rgb(186, 188, 209); border-top-width: 1pt; font-size: initial; text-align: initial; background-color: rgb(255, 255, 255);">
</div>
<br>
<div class="BodyFragment">
<div class="PlainText">Send discuss mailing list submissions to<br>
discuss@mpich.org<br>
<br>
To subscribe or unsubscribe via the World Wide Web, visit<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
or, via email, send a message with subject or body 'help' to<br>
discuss-request@mpich.org<br>
<br>
You can reach the person managing the list at<br>
discuss-owner@mpich.org<br>
<br>
When replying, please edit your Subject line so it is more specific<br>
than "Re: Contents of discuss digest..."<br>
<br>
<br>
Today's Topics:<br>
<br>
   1. Re: running parallel job issue (alexandra) (Antonio J. Peña)<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Wed, 23 Oct 2013 15:21:57 -0500<br>
From: Antonio J. Peña <apenya@mcs.anl.gov><br>
To: discuss@mpich.org<br>
Subject: Re: [mpich-discuss] running parallel job issue (alexandra)<br>
Message-ID: <2206013.AaZ6q0B8ti@localhost.localdomain><br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
<br>
The address you got is just a loopback address: it is not the real IP address of<br>
your network interfaces and is only used for self-communication through the<br>
sockets interface. In Linux, you can determine the IP address of your network<br>
interface with something like:<br>
<br>
/sbin/ifconfig eth0<br>
<br>
(you may need to replace eth0 with the identifier of your network interface, <br>
but in most cases this should work).<br>
<br>
You need to assign a different IP address to each computer on the same network<br>
before they can communicate with one another. Most likely you will want to use<br>
private addresses, such as 192.168.0.1, 192.168.0.2, etc.; you can easily find<br>
out how to do this with a quick web search.<br>
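<br>
For example (a sketch only; the addresses below are illustrative, adjust them to<br>
your own network), once each machine has its own address on the LAN, /etc/hosts<br>
on every node can map each host name to that address instead of 127.0.1.1:<br>
<br>
# /etc/hosts (same entries on every node; illustrative addresses)<br>
192.168.0.1    host1<br>
192.168.0.2    host2<br>
<br>
# verify which address a node actually uses<br>
/sbin/ifconfig eth0 | grep inet<br>
<br>
With that in place, ping host2 from host1 should report 192.168.0.2 rather than<br>
the loopback address, and Hydra should be able to connect between the nodes.<br>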
<br>
Antonio<br>
<br>
<br>
On Wednesday, October 23, 2013 11:10:48 PM Alexandra Betouni wrote:<br>
<br>
<br>
Well yes, the job runs on host2 locally, but parallel execution does the<br>
same thing as on host1.<br>
Someone here said that if all the computers have the same IP address it<br>
won't work..<br>
Well, every node has 127.0.1.1 as its IP, and all of them had the same host name<br>
until I changed the two of them. Hydra is the default launcher.<br>
I also forgot to mention that ping host2 and, likewise, ping host1 work<br>
fine...<br>
<br>
<br>
Sent from my BlackBerry 10 smartphone.<br>
From: discuss-request@mpich.org<br>
Sent: Wednesday, 23 October 2013 22:42<br>
To: discuss@mpich.org<br>
Reply To: discuss@mpich.org<br>
Subject: discuss Digest, Vol 12, Issue 13<br>
<br>
<br>
<br>
Send discuss mailing list submissions to discuss@mpich.org<br>
<br>
To subscribe or unsubscribe via the World Wide Web, visit <br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss[1">https://lists.mpich.org/mailman/listinfo/discuss[1</a>]<br>
<a href="http://lists.mpich.org/pipermail/discuss/attachments/20131023/fdf9d820/at">http://lists.mpich.org/pipermail/discuss/attachments/20131023/fdf9d820/at</a><br>
tachment-0001.html[2]><br>
<br>
------------------------------<br>
<br>
Message: 3<br>
Date: Wed, 23 Oct 2013 17:27:27 -0200<br>
From: Luiz Carlos da Costa Junior <lcjunior@ufrj.br><br>
To: MPICH Discuss <mpich-discuss@mcs.anl.gov><br>
Subject: [mpich-discuss] Failed to allocate memory for an unexpected message<br>
Message-ID: <CAOv4ofRY4ajVZecZcDN3d3tdENV=XBMd=5i1TjX3310ZnEFUdg@mail.gmail.com><br>
Content-Type: text/plain; charset="iso-8859-1"<br>
<br>
Hi,<br>
<br>
I am getting the following error when running my parallel application:<br>
<br>
MPI_Recv(186)......................: MPI_Recv(buf=0x125bd840, count=2060,<br>
MPI_CHARACTER, src=24, tag=94, comm=0x84000002, status=0x125fcff0) failed<br>
MPIDI_CH3I_Progress(402)...........: <br>
MPID_nem_mpich2_blocking_recv(905).: <br>
MPID_nem_tcp_connpoll(1838)........: <br>
state_commrdy_handler(1676)........: <br>
MPID_nem_tcp_recv_handler(1564)....: <br>
MPID_nem_handle_pkt(636)...........: <br>
MPIDI_CH3_PktHandler_EagerSend(606): Failed to allocate memory for an<br>
unexpected message. 261895 unexpected messages queued.<br>
Fatal error in MPI_Send: Other MPI error, error stack:<br>
MPI_Send(173)..............: MPI_Send(buf=0x765d2e60, count=2060,<br>
MPI_CHARACTER, dest=0, tag=94, comm=0x84000004) failed<br>
MPID_nem_tcp_connpoll(1826): Communication error with rank 1: Connection reset by peer<br>
<br>
<br>
I went to MPICH's FAQ<br>
(<a href="http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_am_I_getting_so_many_unexpected_messages.3F">http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_am_I_getting_so_many_unexpected_messages.3F</a>).<br>
-------------- next part --------------<br>
An HTML attachment was scrubbed...<br>
URL: <<a href="http://lists.mpich.org/pipermail/discuss/attachments/20131023/3a02fa51/attachment-0001.html">http://lists.mpich.org/pipermail/discuss/attachments/20131023/3a02fa51/attachment-0001.html</a>><br>
<br>
------------------------------<br>
<br>
Message: 4<br>
Date: Wed, 23 Oct 2013 14:42:15 -0500<br>
From: Antonio J. Peña <apenya@mcs.anl.gov><br>
To: discuss@mpich.org<br>
Cc: MPICH Discuss <mpich-discuss@mcs.anl.gov><br>
Subject: Re: [mpich-discuss] Failed to allocate memory for an unexpected message<br>
Message-ID: <1965559.SsluspJNke@localhost.localdomain><br>
Content-Type: text/plain; charset="iso-8859-1"<br>
<br>
<br>
Hi Luiz,<br>
<br>
Your error trace indicates that the receiver ran out of memory because of the<br>
very large number (261,895) of eager unexpected messages it had received, i.e.,<br>
small messages that arrived before a matching receive operation was posted.<br>
Whenever this happens, the receiver allocates a temporary buffer to hold the<br>
message, and these buffers eventually exhausted the available memory on the<br>
computer where the receiver was executing.<br>
<br>
To avoid this, try to pre-post receives before the messages arrive; this is<br>
also far more efficient. For example, you could post one MPI_Irecv per worker<br>
in your writer process and service them after an MPI_Waitany, as sketched<br>
below. You may also consider having multiple writer processes if your use case<br>
permits it and the volume of received messages is too high for a single writer.<br>
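<br>
A minimal sketch of that pattern in C (the worker count below is an illustrative<br>
placeholder; the buffer size and tag are taken from your error trace, and the<br>
same idea applies through the Fortran interface):<br>
<br>
/* Writer process: pre-post one receive per worker, then service them with MPI_Waitany. */<br>
#define NWORKERS 32      /* illustrative: number of sending workers */<br>
#define BUFSZ    2060    /* message size seen in the error trace */<br>
char        buf[NWORKERS][BUFSZ];<br>
MPI_Request req[NWORKERS];<br>
MPI_Status  st;<br>
int         w;<br>
<br>
for (w = 0; w < NWORKERS; w++)                /* assuming workers are ranks 1..NWORKERS */<br>
    MPI_Irecv(buf[w], BUFSZ, MPI_CHAR, w + 1, 94, MPI_COMM_WORLD, &req[w]);<br>
<br>
while (1) {                                   /* loop until all workers are done */<br>
    MPI_Waitany(NWORKERS, req, &w, &st);      /* returns when any worker's message arrives */<br>
    /* ... write buf[w] to the output file ... */<br>
    MPI_Irecv(buf[w], BUFSZ, MPI_CHAR, w + 1, 94, MPI_COMM_WORLD, &req[w]);   /* re-post */<br>
}<br>
<br>
This way most messages match a pre-posted receive and land directly in buf,<br>
instead of piling up in the unexpected-message queue on the receiver.<br>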
<br>
Antonio<br>
<br>
<br>
On Wednesday, October 23, 2013 05:27:27 PM Luiz Carlos da Costa Junior <br>
wrote:<br>
<br>
<br>
Hi,<br>
<br>
<br>
I am getting the following error when running my parallel application:<br>
<br>
<br>
MPI_Recv(186)......................: MPI_Recv(buf=0x125bd840, count=2060, <br>
MPI_CHARACTER, src=24, tag=94, comm=0x84000002, status=0x125fcff0) <br>
failed MPIDI_CH3I_Progress(402)...........: <br>
MPID_nem_mpich2_blocking_recv(905).: <br>
MPID_nem_tcp_connpoll(1838)........: state_commrdy_handler(1676)........: <br>
MPID_nem_tcp_recv_handler(1564)....: MPID_nem_handle_pkt(636)...........: <br>
MPIDI_CH3_PktHandler_EagerSend(606): Failed to allocate memory for an <br>
unexpected message. 261895 unexpected messages queued.<br>
Fatal error in MPI_Send: Other MPI error, error stack:<br>
MPI_Send(173)..............: <br>
MPI_Send(buf=0x765d2e60, count=2060, MPI_CHARACTER, dest=0, <br>
tag=94, comm=0x84000004) failed MPID_nem_tcp_connpoll(1826): <br>
Communication error with rank 1: Connection reset by peer <br>
<br>
<br>
I went to MPICH's FAQ<br>
(<a href="http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_am_I_getting_so_many_unexpected_messages.3F">http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_am_I_getting_so_many_unexpected_messages.3F</a>).<br>
It says that most likely the receiver process can't keep up with the high number of<br>
messages it is receiving.<br>
-------------- next part --------------<br>
An HTML attachment was scrubbed...<br>
URL: <<a href="http://lists.mpich.org/pipermail/discuss/attachments/20131023/b13e2436/attachment.html">http://lists.mpich.org/pipermail/discuss/attachments/20131023/b13e2436/attachment.html</a>><br>
<br>
------------------------------<br>
<br>
_______________________________________________<br>
discuss mailing list<br>
discuss@mpich.org<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
<br>
End of discuss Digest, Vol 12, Issue 15<br>
***************************************<br>
</div>
</div>
</body>
</html>