[mpich-discuss] error with MPI_Reduce running cpi

Zhou, Hui zhouh at anl.gov
Mon Jun 17 10:58:33 CDT 2019


Hi Jinang_Shah,

Could you list your configure line (try `head config.log`)?

If you try a simple example where process 0 sends a short message to process 1, do you get a similar error?
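For instance, a minimal send/recv test along these lines (just a sketch, not a test from the MPICH tree) would exercise the same point-to-point path between the two hosts:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, msg = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* rank 0 sends a single int to rank 1 */
            msg = 42;
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            printf("rank 0 sent %d\n", msg);
        } else if (rank == 1) {
            /* rank 1 receives it from rank 0 */
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", msg);
        }

        MPI_Finalize();
        return 0;
    }

Compile it with mpicc and launch it the same way as cpi, e.g. `mpiexec -n 2 -f hostfile ./sendrecv`, so the two ranks land on csews1 and csews2.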

Can csews1 and csews2 connect to each other freely, i.e. is there a firewall, router, etc. between the two hosts?

--
Hui Zhou

On Jun 17, 2019, at 10:11 AM, Jinang_Shah via discuss <discuss at mpich.org> wrote:


$ mpiexec -n 2 -f hostfile ./mpi/mpich-3.3.1/examples/cpi
Process 1 of 2 is on csews2
Process 0 of 2 is on csews1
Fatal error in PMPI_Reduce: Unknown error class, error stack:
PMPI_Reduce(523)................: MPI_Reduce(sbuf=0x7ffeab580c10, rbuf=0x7ffeab580c18, count=1, datatype=MPI_DOUBLE, op=MPI_SUM, root=0, comm=MPI_COMM_WORLD) failed
PMPI_Reduce(509)................:
MPIR_Reduce_impl(316)...........:
MPIR_Reduce_intra_auto(231).....:
MPIR_Reduce_intra_binomial(125).:
MPIDI_CH3U_Recvq_FDU_or_AEP(629): Communication error with rank 1


Can anyone explain this error and how to overcome it? I have installed just MPICH and not the HYDRA package.

By the way, the same command works fine with the hello program, the one where each process prints its identity (roughly like the sketch below).
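(For reference, a minimal sketch of such a hello program, assuming the usual MPI_Comm_rank/MPI_Get_processor_name pattern rather than my exact code:)

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);

        /* each rank only reports who and where it is; no inter-rank traffic */
        printf("Hello from process %d of %d on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }

(This only prints locally, so it can succeed even when the two hosts cannot open MPI connections to each other.)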

