[mpich-discuss] Error running examples after install
Balaji, Pavan
balaji at anl.gov
Sat Jan 24 13:14:43 CST 2015
This sounds like a network setup issue, such as a firewall or a problem with the entries in the /etc/hosts file. Did you look through the FAQ entry on this?
http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_My_MPI_program_aborts_with_an_error_saying_it_cannot_communicate_with_other_processes
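As a rough illustration of the /etc/hosts part: on Ubuntu the machine's own hostname is often mapped to a loopback address such as 127.0.1.1, so other nodes are told to connect to an address that only works locally. The sketch below scans hosts-file text for that pattern. The function name loopback_entries and the sample file contents are hypothetical (the thread does not show the actual /etc/hosts files), so treat it as a starting point for checking, not a diagnosis.

```python
import ipaddress


def loopback_entries(hosts_text, hostnames):
    """Return {hostname: address} for each requested hostname that the
    given hosts-file text maps to a loopback address (127.0.0.0/8 or ::1)."""
    wanted = set(hostnames)
    found = {}
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # skip lines whose first field is not a plain IP address
        if ip.is_loopback:
            for name in names:
                if name in wanted:
                    found[name] = address
    return found


# Hypothetical hosts-file contents, for illustration only.
SAMPLE_HOSTS = """\
127.0.0.1       localhost
127.0.1.1       ubuntu
192.168.201.138 ubuntu-clone   # assumed address of the second node
"""

# Report which of the two node names resolve to a loopback address.
print(loopback_entries(SAMPLE_HOSTS, ["ubuntu", "ubuntu-clone"]))
```

To check a real machine, pass the contents of /etc/hosts (for example, open("/etc/hosts").read()) together with the hostnames listed in the machinefile. Any name reported here should probably be mapped to the node's address on the private network instead.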
-- Pavan
> On Jan 24, 2015, at 12:58 PM, Tiago dos Santos <santos.tmd at gmail.com> wrote:
>
> Hello everyone,
>
> After installing mpich, I ran the examples and keep getting this error stack:
>
> tds at ubuntu:~/Downloads/mpich-3.1.3$ mpiexec -f machinefile -n 2 ./examples/cpi
> Warning: Permanently added the ECDSA host key for IP address '192.168.201.138' to the list of known hosts.
> Process 0 of 2 is on ubuntu
> Fatal error in PMPI_Reduce: Unknown error class, error stack:
> PMPI_Reduce(1263)...............: MPI_Reduce(sbuf=0x7fff3f86da00, rbuf=0x7fff3f86da08, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD) failed
> MPIR_Reduce_impl(1075)..........:
> MPIR_Reduce_intra(881)..........:
> MPIR_Reduce_binomial(188).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(636): Communication error with rank 1
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = PID 7563 RUNNING AT ubuntu
> = EXIT CODE: 1
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
> [proxy:0:1 at ubuntu-clone] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:885): assert (!closed) failed
> [proxy:0:1 at ubuntu-clone] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
> [proxy:0:1 at ubuntu-clone] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
> [mpiexec at ubuntu] HYDT_bscu_wait_for_completion (tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
> [mpiexec at ubuntu] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
> [mpiexec at ubuntu] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for completion
> [mpiexec at ubuntu] main (ui/mpich/mpiexec.c:344): process manager error waiting for completion
>
>
> Since I’m pretty new to the MPI world, I can’t quite tell what I did wrong. Did I do something wrong with ssh? Was it something else?
>
> System Specifications:
> - Ubuntu 14.04, 64-bit
> - gcc version 4.8.2
> - While installing, Fortran support was disabled
> - This system is running on a virtual machine
>
>
> Network Specification:
> - Two machines with the specifications above in a private virtual network
> - One machine is called ubuntu and the other one is ubuntu-clone
>
> Host Files:
> - ubuntu
> - ubuntu
> - ubuntu-clone
> - ubuntu-clone
> - ubuntu-clone
> - ubuntu
>
>
> As you can see, the stack trace is from a command running on host ubuntu. The same error is raised when I run the same command on host ubuntu-clone.
> Can anyone help me figure out where I messed up?
>
> Thanks in advance
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji