[mpich-discuss] Fwd: suggestion for mpich diagnostic
Rusty Lusk
lusk at mcs.anl.gov
Wed Jun 4 15:40:51 CDT 2014
Begin forwarded message:
> From: "Steven C. Pieper" <spieper at anl.gov>
> Subject: suggestion for mpich diagnostic
> Date: June 4, 2014 at 3:14:09 PM CDT
> To: Kenneth Raffenetti <raffenet at mcs.anl.gov>, Rusty Lusk <lusk at mcs.anl.gov>
> Reply-To: <spieper at anl.gov>
>
> I have just found a bug in my MPI code. On blues, the
> bug produced only the following single line on stderr:
>
> Fatal error in MPI_Recv: Message truncated
>
> with no clue as to where it happened. Fortunately, I had
> noticed that the latest production MPICH, which I installed
> (with Ken's help) on my cluster, gives more information.
> There I got the following:
>
> Fatal error in MPI_Recv: Message truncated, error stack:
> MPI_Recv(184).......................: MPI_Recv(buf=0x4036510, count=1, dtype=USER<contig>, src=MPI_ANY_SOURCE, tag=101, comm=0x84000002, status=0xb3b920) failed
> MPIDI_CH3U_Request_unpack_uebuf(605): Message truncated; 29292 bytes received but buffer size is 488
>
> The tag=101 was enough in this case for me to locate the error.
> But a little more info would be useful:
>
> 1) The rank of the process with the failing receive.
> 2) The actual rank of the sending process.
> 3) If at all possible, a traceback and core dump of the
> rank with the failing receive.
>
> Steve
> --
> Steven C. Pieper: spieper at anl.gov
> Argonne National Laboratory, Physics Division, Bldg. 203, Argonne, IL 60439
> Phone: 630-252-4232 Fax -6008
> Secretary, Debra Morrison, -4100
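
Until the library reports these details itself, the three items requested above can be approximated in user code. The following is only a sketch under stated assumptions: `checked_recv` is a hypothetical helper (not part of MPICH), it assumes a single-threaded receiver (another thread could otherwise take the message between the probe and the receive), and it assumes that `MPI_Get_count` with `MPI_BYTE` on a probed message gives the incoming size in bytes. It probes first, prints the receiving rank, the actual sending rank, and both sizes if the message would not fit, and then calls `abort()`. Whether a core file is actually written depends on system settings such as the core size limit.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: probe before receiving so a too-large message is
 * reported with both ranks instead of a bare "Message truncated". */
static int checked_recv(void *buf, int count, MPI_Datatype dtype, int tag,
                        MPI_Comm comm, MPI_Status *status)
{
    int my_rank, type_size, incoming_bytes;

    MPI_Comm_rank(comm, &my_rank);
    MPI_Type_size(dtype, &type_size);

    /* Block until a matching message is pending, without receiving it. */
    MPI_Probe(MPI_ANY_SOURCE, tag, comm, status);
    MPI_Get_count(status, MPI_BYTE, &incoming_bytes);

    if ((long long)incoming_bytes > (long long)count * type_size) {
        fprintf(stderr,
                "rank %d: message from rank %d with tag %d has %d bytes, "
                "but the receive buffer holds %lld\n",
                my_rank, status->MPI_SOURCE, tag, incoming_bytes,
                (long long)count * type_size);
        abort(); /* raises SIGABRT; a core dump follows only if enabled */
    }

    /* Receive from the specific sender that the probe matched. */
    return MPI_Recv(buf, count, dtype, status->MPI_SOURCE, tag, comm, status);
}

int main(int argc, char **argv)
{
    int rank, size;
    int data[4] = {1, 2, 3, 4};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size >= 2 && rank == 1) {
        /* Tag 101 is borrowed from the error stack quoted above. */
        MPI_Send(data, 4, MPI_INT, 0, 101, MPI_COMM_WORLD);
    } else if (size >= 2 && rank == 0) {
        int inbox[4];
        MPI_Status status;
        checked_recv(inbox, 4, MPI_INT, 101, MPI_COMM_WORLD, &status);
        printf("rank 0 received %d ints from rank %d\n", 4, status.MPI_SOURCE);
    }

    MPI_Finalize();
    return 0;
}
```

With an MPICH installation this would typically be built with `mpicc` and started on two processes with `mpiexec -n 2`. Shrinking `inbox` (and the count passed to `checked_recv`) below four elements exercises the diagnostic path.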