<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
Hi Amin,<br>
<br>
Can you share with us a minimal piece of code with which you can
reproduce this issue?<br>
<br>
Thanks,<br>
Antonio<br>
<br>
<br>
On 11/25/2014 12:52 PM, Amin Hassani wrote:<br>
</div>
<blockquote cite="mid:CAF2GiU=WGgcEGYQK00u7J7Jw3NbjwTsZf-RBM1mZsa3N2vv+Mw@mail.gmail.com" type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">Hi,</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">I am
having a problem running MPICH on multiple nodes. When I run
multiple MPI processes on one node, it works fine, but when
I try to run on multiple nodes, it fails with the error below.</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">My
machines run Debian and have both InfiniBand and TCP
interconnects. I'm guessing it has something to do with the
TCP network, but I can run Open MPI on these machines with no
problem; for some reason I just cannot run MPICH on multiple
nodes. Please let me know if more info is needed from my side.
I'm guessing there is some configuration that I am missing. I
used MPICH 3.1.3 for this test. I googled this problem but
couldn't find any solution.</div>
<div><br>
</div>
<div>
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">In my
MPI program, I am doing a simple allreduce over
MPI_COMM_WORLD.</div>
<br>
</div>
<div>
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">My
host file (hosts-hydra) looks something like this:</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">oakmnt-0-a:1</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">oakmnt-0-b:1
</div>
</div>
<div><br>
</div>
<div>
<div class="gmail_default" style="font-family:tahoma,sans-serif;font-size:small">I get
this error:</div>
<br>
</div>
<div>
<div class="gmail_default" style="">
<div class="gmail_default"><font face="tahoma, sans-serif">$
mpirun -hostfile hosts-hydra -np 2 test_dup</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">Assertion
failed in file ../src/mpi/coll/helper_fns.c at line 490:
status->MPI_TAG == recvtag</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">Assertion
failed in file ../src/mpi/coll/helper_fns.c at line 490:
status->MPI_TAG == recvtag</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">internal
ABORT - process 1</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">internal
ABORT - process 0</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif"><br>
</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">===================================================================================</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">=
BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">=
PID 30744 RUNNING AT oakmnt-0-b</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">=
EXIT CODE: 1</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">=
CLEANING UP REMAINING PROCESSES</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">=
YOU CAN IGNORE THE BELOW CLEANUP MESSAGES</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">===================================================================================</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13]
HYDU_sock_read
(../../../../src/pm/hydra/utils/sock/sock.c:239): read
error (Bad file descriptor)</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13]
control_cb
(../../../../src/pm/hydra/pm/pmiserv/pmiserv_cb.c:199):
unable to read command from proxy</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13]
HYDT_dmxu_poll_wait_for_event
(../../../../src/pm/hydra/tools/demux/demux_poll.c:76):
callback returned error status</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13]
HYD_pmci_wait_for_completion
(../../../../src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:198):
error waiting for event</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">[mpiexec@vulcan13]
main (../../../../src/pm/hydra/ui/mpich/mpiexec.c:344):
process manager error waiting for completion</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif"><br>
</font></div>
<div class="gmail_default"><font face="tahoma, sans-serif">Thanks.</font></div>
</div>
</div>
<div>
<div class="gmail_signature">
<div dir="ltr">Amin Hassani,<br>
CIS department at UAB,<br>
Birmingham, AL, USA.</div>
</div>
</div>
</div>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Antonio J. Peña
Postdoctoral Appointee
Mathematics and Computer Science Division
Argonne National Laboratory
9700 South Cass Avenue, Bldg. 240, Of. 3148
Argonne, IL 60439-4847
<a class="moz-txt-link-abbreviated" href="mailto:apenya@mcs.anl.gov">apenya@mcs.anl.gov</a>
<a class="moz-txt-link-abbreviated" href="http://www.mcs.anl.gov/~apenya">www.mcs.anl.gov/~apenya</a></pre>
</body>
</html>