[mpich-discuss] MPID_nem_tcp_connpoll(1835): Communication error with rank 1: Connection timed out

amelie chi zhou amelie.czhou at gmail.com
Tue Mar 15 06:16:41 CDT 2016


Hi,

I configured two virtual machines on Amazon EC2 to run mpich-3.2. The
system is Ubuntu 12.04.2 LTS.

The two virtual machines can ssh to each other successfully (passwordless)
and I can run a simple hello world program using the two machines.

ubuntu@ip-10-169-125-85:~$ mpiexec -n 2 -f host_file ./hello_world
Hello world from processor ip-10-169-125-85, rank 1 out of 2 processors
Hello world from processor ip-10-235-37-156, rank 0 out of 2 processors

Then I ran a simple program that uses MPI_Send and MPI_Recv to communicate
between the two VMs. The core of the program is as follows.

  if (world_rank == 0) {
    // If we are rank 0, set the number to -1 and send it to process 1
    number = -1;
    MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
  } else if (world_rank == 1) {
    MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Process 1 received number %d from process 0\n", number);
  }


This is the error message I encountered:

ubuntu@ip-10-169-125-85:~$ mpiexec -n 2 -f host_file ./send_recv
Fatal error in MPI_Send: Unknown error class, error stack:
MPI_Send(174)..............: MPI_Send(buf=0x7fff49f2759c, count=1, MPI_INT, dest=1, tag=0, MPI_COMM_WORLD) failed
MPID_nem_tcp_connpoll(1835): Communication error with rank 1: Connection timed out


I searched for similar errors and have made sure that 1) there are no rules
in my firewall settings, and 2) a TCP port is listening on both sides while
the send_recv program runs. I cannot think of any other way to fix this
problem. BTW, the two virtual machines are in two different regions of
Amazon EC2 and are not in VPCs. Please help. Thanks!
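
A listening port on each side does not by itself show that the other machine
can reach it, so one way to double-check point 2 is a connect-with-timeout
probe run from the peer. The following is only a minimal sketch using plain
POSIX sockets: tcp_probe and TCP_PROBE_MAIN are hypothetical names, not part
of MPICH, and the address and port to test are whatever the listening
send_recv process shows on the other VM (for example in netstat output).

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not an MPICH function): try to open a TCP connection
 * to ip:port, waiting at most timeout_ms milliseconds.
 * Returns 0 if the connection succeeds, -1 for invalid arguments or a local
 * socket setup failure, otherwise a positive errno value such as
 * ECONNREFUSED or ETIMEDOUT. */
int tcp_probe(const char *ip, int port, int timeout_ms)
{
    struct sockaddr_in addr;

    if (port < 1 || port > 65535 || timeout_ms < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                      /* not a dotted-quad IPv4 address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Non-blocking connect, so the timeout is ours and not the kernel's. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    int result = 0;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        if (errno != EINPROGRESS) {
            result = errno;             /* immediate failure */
        } else {
            fd_set wfds;
            struct timeval tv;
            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;
            int ready = select(fd + 1, NULL, &wfds, NULL, &tv);
            if (ready == 0) {
                result = ETIMEDOUT;     /* no answer within timeout_ms */
            } else if (ready < 0) {
                result = errno;
            } else {
                /* Socket is writable: fetch the outcome of the connect. */
                int err = 0;
                socklen_t len = sizeof err;
                if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
                    result = errno;
                else
                    result = err;
            }
        }
    }
    close(fd);
    return result;
}

#ifdef TCP_PROBE_MAIN
/* Optional command-line wrapper: ./probe <ipv4-address> <port> */
int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <ipv4-address> <port>\n", argv[0]);
        return 2;
    }
    int rc = tcp_probe(argv[1], atoi(argv[2]), 5000);
    if (rc == 0)
        printf("connected\n");
    else if (rc < 0)
        printf("bad arguments or local socket error\n");
    else
        printf("connect failed: %s\n", strerror(rc));
    return rc == 0 ? 0 : 1;
}
#endif
```

Assuming the file is saved as probe.c (a made-up name), it can be built with
"cc -DTCP_PROBE_MAIN probe.c -o probe" and run on one VM against the address
and port of the other. A timeout here would be consistent with the
"Connection timed out" in the MPICH error stack, while "connected" would
suggest the port is reachable from that machine.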

Regards,
Amelie