[mpich-discuss] Fatal error in PMPI_Isend: Internal MPI error!, error stack

Pavan Balaji balaji at mcs.anl.gov
Sun Dec 8 23:04:00 CST 2013


Hello,

Can you provide a simple program that reproduces this error?  Also, please make sure you are using the latest version of MPICH, either 3.0.4 or 3.1rc2.

Thanks,

  — Pavan

On Dec 8, 2013, at 10:50 PM, 박보국 <limited107 at gmail.com> wrote:

> Hi,
> I'm trying to run Titan2D on Linux. I hit a similar problem on different systems (an Ubuntu PC and a CentOS cluster).
> The error does not occur in short runs, but it does occur in long simulations.
> 
> The error message is shown below.
> Fatal error in PMPI_Isend: Internal MPI error!, error stack:
> PMPI_Isend(148): MPI_Isend(buf=0x244c53c0, count=45, dtype=USER<struct>, dest=13, tag=22676, MPI_COMM_WORLD, request=0x227ab074) failed
> (unknown)(): Internal MPI error!
> Fatal error in PMPI_Test: Other MPI error, error stack:
> PMPI_Test(168)............: MPI_Test(request=0x147a6068, flag=0x7fff9bd5098c, status=0x7fff9bd50960) failed
> MPIR_Test_impl(63)........: 
> dequeue_and_set_error(596): Communication error with rank 2
> Fatal error in PMPI_Test: Other MPI error, error stack:
> PMPI_Test(168)............: MPI_Test(request=0x2deaf058, flag=0x7fff88bd375c, status=0x7fff88bd3730) failed
> MPIR_Test_impl(63)........: 
> dequeue_and_set_error(596): Communication error with rank 2
> Fatal error in PMPI_Barrier: Other MPI error, error stack:
> PMPI_Barrier(425)...........: MPI_Barrier(MPI_COMM_WORLD) failed
> MPIR_Barrier_impl(292)......: 
> MPIR_Barrier_or_coll_fn(121): 
> MPIR_Barrier_intra(83)......: 
> dequeue_and_set_error(596)..: Communication error with rank 10
> Fatal error in PMPI_Barrier: Other MPI error, error stack:
> PMPI_Barrier(425)...........: MPI_Barrier(MPI_COMM_WORLD) failed
> MPIR_Barrier_impl(306)......: 
> MPIR_Bcast_impl(1321).......: 
> MPIR_Bcast_intra(1155)......: 
> MPIR_Bcast_binomial(213)....: Failure during collective
> MPIR_Barrier_impl(292)......: 
> MPIR_Barrier_or_coll_fn(121): 
> MPIR_Barrier_intra(83)......: 
> dequeue_and_set_error(596)..: Communication error with rank 0
> Fatal error in PMPI_Isend: Other MPI error, error stack:
> PMPI_Isend(148)..........: MPI_Isend(buf=0x282dbdc0, count=260, dtype=USER<struct>, dest=10, tag=22683, MPI_COMM_WORLD, request=0x26b3a9b8) failed
> MPID_nem_lmt_RndvSend(81): 
> MPIDI_CH3_RndvSend(63)...: failure occurred while attempting to send RTS packet
> MPIDI_CH3_iStartMsg(36)..: Communication error with rank 10
> Fatal error in PMPI_Test: Other MPI error, error stack:
> PMPI_Test(168)..................: MPI_Test(request=0xa66b46c, flag=0x7fff849b92dc, status=0x7fff849b92b0) failed
> MPIR_Test_impl(63)..............: 
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 3
> 
> Thanks,
> Park Bokuk

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji



