[mpich-discuss] Fwd: suggestion for mpich diagnostic

Rusty Lusk lusk at mcs.anl.gov
Wed Jun 4 15:40:51 CDT 2014



Begin forwarded message:

> From: "Steven C. Pieper" <spieper at anl.gov>
> Subject: suggestion for mpich diagnostic
> Date: June 4, 2014 at 3:14:09 PM CDT
> To: Kenneth Raffenetti <raffenet at mcs.anl.gov>, Rusty Lusk <lusk at mcs.anl.gov>
> Reply-To: <spieper at anl.gov>
> 
> I have just found a bug in my MPI code.  On blues the
> bug resulted in the following single stderr line:
> 
> Fatal error in MPI_Recv: Message truncated
> 
> with no clue as to where it happened.  Fortunately, I had
> noticed that the latest production MPICH, which I installed
> (with Ken's help) on my cluster, gives more info.  Indeed,
> there I got the following:
> 
> Fatal error in MPI_Recv: Message truncated, error stack:
> MPI_Recv(184).......................: MPI_Recv(buf=0x4036510, count=1, dtype=USER<contig>, src=MPI_ANY_SOURCE, tag=101, comm=0x84000002, status=0xb3b920) failed
> MPIDI_CH3U_Request_unpack_uebuf(605): Message truncated; 29292 bytes received but buffer size is 488
> 
> The tag=101 was enough in this case for me to locate the error.
> But a little more info would be useful:
> 
> 1)  The rank of the process with the failing receive.
> 2)  The actual rank of the sending process.
> 3)  If at all possible, trigger a traceback and core dump of the
>       rank with the failing receive.
> 
> Steve
>  -- 
> Steven C. Pieper:  spieper at anl.gov 
> Argonne National Laboratory, Physics Division, Bldg. 203, Argonne, IL 60439
> Phone:  630-252-4232         Fax -6008
> Secretary, Debra Morrison, -4100
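
The error stack quoted above already carries the tag and the two byte
counts, which is what let the failing receive be located by hand.  As a
rough illustration only, the following Python sketch pulls those fields out
of captured stderr.  The function name summarize_truncation and both
regular expressions are hypothetical; they are modeled solely on the two
stack lines quoted in the message, and other MPICH versions may format
these lines differently.  The sketch cannot recover the receiving or
sending rank, since the quoted output does not contain them.

```python
import re

# Assumption: line formats are inferred from the single example in the
# message above; they are not a documented, stable MPICH output format.
CALL_RE = re.compile(
    r"^(?P<func>MPI_\w+)\(\d+\)\.*: (?P=func)\((?P<args>.*)\) failed$"
)
TRUNC_RE = re.compile(
    r"Message truncated; (?P<received>\d+) bytes received "
    r"but buffer size is (?P<capacity>\d+)"
)

# The error stack exactly as quoted in the message.
SAMPLE = """\
Fatal error in MPI_Recv: Message truncated, error stack:
MPI_Recv(184).......................: MPI_Recv(buf=0x4036510, count=1, dtype=USER<contig>, src=MPI_ANY_SOURCE, tag=101, comm=0x84000002, status=0xb3b920) failed
MPIDI_CH3U_Request_unpack_uebuf(605): Message truncated; 29292 bytes received but buffer size is 488
"""


def summarize_truncation(stderr_text):
    """Collect the call arguments and byte counts from a truncation stack.

    Returns an empty dict when neither kind of line is present, as with
    the one-line error reported on blues.
    """
    summary = {}
    for line in stderr_text.splitlines():
        line = line.strip()

        # Line naming the failing MPI call and its key=value arguments.
        call = CALL_RE.match(line)
        if call:
            args = dict(
                item.split("=", 1)
                for item in call.group("args").split(", ")
                if "=" in item
            )
            summary["function"] = call.group("func")
            summary["tag"] = args.get("tag")
            summary["src"] = args.get("src")
            summary["count"] = args.get("count")

        # Line giving the incoming message size and the receive buffer size.
        trunc = TRUNC_RE.search(line)
        if trunc:
            received = int(trunc.group("received"))
            capacity = int(trunc.group("capacity"))
            summary["received_bytes"] = received
            summary["buffer_bytes"] = capacity
            summary["overflow_bytes"] = received - capacity
    return summary


if __name__ == "__main__":
    for key, value in summarize_truncation(SAMPLE).items():
        print(f"{key}: {value}")
```

Run against a job's stderr, this reports the tag and how far the incoming
message exceeded the receive buffer, which is usually enough to find the
mismatched send/receive pair when the full error stack is available.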
