<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr"><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">Hi,</div><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">I am having a problem running MPICH on multiple nodes. When I run multiple MPI processes on one node, everything works, but when I try to run across multiple nodes, it fails with the error below.</div><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">My machines run Debian and have both InfiniBand and TCP interconnects. I'm guessing it has something to do with the TCP network, but I can run Open MPI on these machines with no problem; for some reason only MPICH fails on multiple nodes. Please let me know if you need more information from my side. I'm guessing there is some configuration that I am missing. I used MPICH 3.1.3 for this test. 
I googled this problem but couldn't find a solution.</div><div><br></div><div><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">In my MPI program, I am doing a simple allreduce over MPI_COMM_WORLD.</div><br></div><div><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">My host file (hosts-hydra) looks like this:</div><div class="gmail_default" style="font-family:tahoma,sans-serif">oakmnt-0-a:1</div><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">oakmnt-0-b:1</div></div><div><br></div><div><div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">I get this error:</div><br></div><div><div class="gmail_default" style=""><div class="gmail_default"><font face="tahoma, sans-serif">$ mpirun -hostfile hosts-hydra -np 2 test_dup</font></div><div class="gmail_default"><font face="tahoma, sans-serif">Assertion failed in file ../src/mpi/coll/helper_fns.c at line 490: status->MPI_TAG == recvtag</font></div><div class="gmail_default"><font face="tahoma, sans-serif">Assertion failed in file ../src/mpi/coll/helper_fns.c at line 490: status->MPI_TAG == recvtag</font></div><div class="gmail_default"><font face="tahoma, sans-serif">internal ABORT - process 1</font></div><div class="gmail_default"><font face="tahoma, sans-serif">internal ABORT - process 0</font></div><div class="gmail_default"><font face="tahoma, sans-serif"><br></font></div><div class="gmail_default"><font face="tahoma, sans-serif">===================================================================================</font></div><div class="gmail_default"><font face="tahoma, sans-serif">= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES</font></div><div class="gmail_default"><font face="tahoma, sans-serif">= PID 30744 RUNNING AT oakmnt-0-b</font></div><div class="gmail_default"><font face="tahoma, sans-serif">= EXIT CODE: 1</font></div><div class="gmail_default"><font face="tahoma, 
sans-serif">= CLEANING UP REMAINING PROCESSES</font></div><div class="gmail_default"><font face="tahoma, sans-serif">= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES</font></div><div class="gmail_default"><font face="tahoma, sans-serif">===================================================================================</font></div><div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13] HYDU_sock_read (../../../../src/pm/hydra/utils/sock/sock.c:239): read error (Bad file descriptor)</font></div><div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13] control_cb (../../../../src/pm/hydra/pm/pmiserv/pmiserv_cb.c:199): unable to read command from proxy</font></div><div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13] HYDT_dmxu_poll_wait_for_event (../../../../src/pm/hydra/tools/demux/demux_poll.c:76): callback returned error status</font></div><div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13] HYD_pmci_wait_for_completion (../../../../src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:198): error waiting for event</font></div><div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13] main (../../../../src/pm/hydra/ui/mpich/mpiexec.c:344): process manager error waiting for completion</font></div><div class="gmail_default"><font face="tahoma, sans-serif"><br></font></div><div class="gmail_default"><font face="tahoma, sans-serif">Thanks.</font></div></div></div><div><div class="gmail_signature"><div dir="ltr">Amin Hassani,<br>CIS department at UAB,<br>
Birmingham, AL, USA.</div></div></div>
</div>
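<div>As a sanity check on the host file shown above, the following is a minimal sketch that parses its entries and sums the process counts, so the total can be compared with the <code>-np 2</code> on the command line. It assumes the <code>host:count</code> form, where the number after the colon is the number of processes for that host and a bare host name counts as one. The function name <code>parse_hydra_hostfile</code> is hypothetical and is not part of MPICH or Hydra.</div>

```python
def parse_hydra_hostfile(text):
    """Parse 'host' or 'host:count' lines into (host, count) pairs.

    Hypothetical helper for illustration only; not an MPICH/Hydra API.
    Assumes one host per line, with an optional ':count' suffix.
    """
    entries = []
    for raw in text.splitlines():
        # Drop anything after '#' and skip lines that end up empty.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # Only the first whitespace-separated token is examined here.
        token = line.split()[0]
        host, sep, count = token.partition(":")
        entries.append((host, int(count) if sep else 1))
    return entries


# The two entries from the host file in the message above.
hostfile_text = "oakmnt-0-a:1\noakmnt-0-b:1\n"
entries = parse_hydra_hostfile(hostfile_text)
total_slots = sum(count for _, count in entries)
print(entries, total_slots)
```

<div>For the host file in this message, the sum matches the two processes requested with <code>-np 2</code>, one per node.</div>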