<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div style=""><br><div>Begin forwarded message:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; color:rgba(0, 0, 0, 1.0);"><b>From: </b></span><span style="font-family:'Helvetica';">"Steven C. Pieper" &lt;<a href="mailto:spieper@anl.gov">spieper@anl.gov</a>&gt;<br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; color:rgba(0, 0, 0, 1.0);"><b>Subject: </b></span><span style="font-family:'Helvetica';"><b>suggestion for mpich diagnostic</b><br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; color:rgba(0, 0, 0, 1.0);"><b>Date: </b></span><span style="font-family:'Helvetica';">June 4, 2014 at 3:14:09 PM CDT<br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; color:rgba(0, 0, 0, 1.0);"><b>To: </b></span><span style="font-family:'Helvetica';">Kenneth Raffenetti &lt;<a href="mailto:raffenet@mcs.anl.gov">raffenet@mcs.anl.gov</a>&gt;, Rusty Lusk &lt;<a href="mailto:lusk@mcs.anl.gov">lusk@mcs.anl.gov</a>&gt;<br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; color:rgba(0, 0, 0, 1.0);"><b>Reply-To: </b></span><span style="font-family:'Helvetica';">&lt;<a href="mailto:spieper@anl.gov">spieper@anl.gov</a>&gt;<br></span></div><br><div>
<div bgcolor="#FFFFFF" text="#000000">
<font face="Lucida Sans Typewriter">I have just found a bug in my
MPI code. On blues the<br>
bug resulted in the following single stderr line:<br>
<br>
Fatal error in MPI_Recv: Message truncated<br>
<br>
with no clue as to where it happened. Fortunately, I had<br>
noticed that the latest production MPICH, which I installed<br>
(with Ken's help) on my cluster, gives more info. Indeed,<br>
there I got the following:<br>
<br>
Fatal error in MPI_Recv: Message truncated, error stack:<br>
MPI_Recv(184).......................: MPI_Recv(buf=0x4036510,
count=1, dtype=USER&lt;contig&gt;, src=MPI_ANY_SOURCE, tag=101,
comm=0x84000002, status=0xb3b920) failed<br>
MPIDI_CH3U_Request_unpack_uebuf(605): Message truncated; 29292
bytes received but buffer size is 488<br>
<br>
The tag=101 was enough in this case for me to locate the error.<br>
But a little more info would be useful:<br>
<br>
1) The rank of the process with the failing receive.<br>
2) The actual rank of the sending process.<br>
3) If at all possible, trigger a traceback and core dump of the<br>
rank with the failing receive.<br>
<br>
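Until something like that exists, the fields the error stack already<br>
carries can at least be pulled out with a short script. A minimal sketch<br>
in Python, assuming the message format shown above; the function name<br>
parse_truncation is hypothetical and the sample is abridged from that stack:<br>
<pre>
```python
import re


def parse_truncation(stack):
    """Extract tag, source and byte counts from an MPICH-style error stack.

    Returns a dict; a field is None when its pattern is not found.
    The patterns assume the wording seen in the stack quoted above.
    """
    tag = re.search(r"tag=(\w+)", stack)
    src = re.search(r"src=(\w+)", stack)
    # \s+ tolerates the line wrap between the two numbers and their labels.
    size = re.search(r"(\d+)\s+bytes received but buffer size is\s+(\d+)", stack)
    return {
        "tag": tag.group(1) if tag else None,
        "src": src.group(1) if src else None,
        "received": int(size.group(1)) if size else None,
        "buffer": int(size.group(2)) if size else None,
    }


# Abridged from the error stack quoted above (only the fields used here).
sample = (
    "MPI_Recv(184): MPI_Recv(count=1, src=MPI_ANY_SOURCE, tag=101) failed\n"
    "MPIDI_CH3U_Request_unpack_uebuf(605): Message truncated; 29292 "
    "bytes received but buffer size is 488\n"
)
info = parse_truncation(sample)
print(info)
```
</pre>
For the stack above this reports tag 101, 29292 bytes received and a<br>
488-byte buffer. The ranks asked for in 1) and 2) are not in the text<br>
(src is MPI_ANY_SOURCE), so they cannot be recovered this way.<br>
<br>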
Steve<br>
</font>
<pre class="moz-signature" cols="72">--
Steven C. Pieper: <a class="moz-txt-link-abbreviated" href="mailto:spieper@anl.gov">spieper@anl.gov</a>
Argonne National Laboratory, Physics Division, Bldg. 203, Argonne, IL 60439
Phone: 630-252-4232 Fax -6008
Secretary, Debra Morrison, -4100
</pre>
</div>
</div></blockquote></div><br></body></html>