[mpich-discuss] Deadlock in MPI_Ibarrier over ch3:sock

Jed Brown jedbrown at mcs.anl.gov
Wed Jan 23 15:17:53 CST 2013


On Wed, Jan 23, 2013 at 2:44 PM, Dave Goodell <goodell at mcs.anl.gov> wrote:

> Hmm… it looks like we aren't poking NBC progress properly when a "test" or
> "iprobe" routine is called.  MPIDI_CH3i_Progress_test is missing a call to
> MPIDU_Sched_progress:
> http://git.mpich.org/mpich.git/blob/refs/heads/master:/src/mpid/ch3/channels/sock/src/ch3_progress.c#l51
>
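
To make the diagnosis above concrete, here is a toy model in plain C, not MPICH code. It assumes only what the quoted message says: the "test" progress path skips the step that advances nonblocking-collective schedules. Every name in it (toy_sched, toy_progress_test, toy_poll, and so on) is hypothetical and invented for illustration; toy_sched_progress merely plays the role that MPIDU_Sched_progress has in the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a nonblocking-collective schedule. Hypothetical:
 * it is "done" once it has been advanced steps_left times. */
typedef struct {
    int steps_left;
} toy_sched;

/* Plays the role described for MPIDU_Sched_progress: advance one step. */
static void toy_sched_progress(toy_sched *s)
{
    assert(s != NULL);
    if (s->steps_left > 0)
        s->steps_left--;
}

static bool toy_sched_done(const toy_sched *s)
{
    return s->steps_left == 0;
}

/* Blocking "wait" path: keeps advancing the schedule until it is done. */
static void toy_progress_wait(toy_sched *s)
{
    while (!toy_sched_done(s))
        toy_sched_progress(s);
}

/* Nonblocking "test" path. poke_sched == false models the missing call:
 * the schedule is never advanced, so it can never become done here. */
static bool toy_progress_test(toy_sched *s, bool poke_sched)
{
    /* (a real progress engine would also poll the network here) */
    if (poke_sched)
        toy_sched_progress(s);
    return toy_sched_done(s);
}

/* Poll up to max_polls times; return the number of polls that were
 * needed, or -1 if the schedule never completed. */
static int toy_poll(toy_sched *s, bool poke_sched, int max_polls)
{
    for (int i = 1; i <= max_polls; i++)
        if (toy_progress_test(s, poke_sched))
            return i;
    return -1;
}
```

In this model, a three-step schedule polled with poke_sched set to true finishes on the third poll, while the same schedule polled with poke_sched set to false never finishes, however many times it is polled. toy_progress_wait always finishes because it advances the schedule itself.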

The attached code demonstrates the problem. With ch3:sock, the
MPI_Ibarrier completes under MPI_Wait, but polling with MPI_Test never
sees it complete.

$ ~/usr/mpich-sock/bin/mpicc ibarrier.c -o ibarrier
$ mpirun.hydra -n 2 ./ibarrier

[0] MPI_Test: 0
[1] MPI_Test: 0
[0] MPI_Test: 1
[1] MPI_Test: 1
[0] MPI_Test: 2
[1] MPI_Test: 2
...
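
The archive scrubbed the attachment, so its exact contents are not shown here. The following is a minimal sketch of what such a reproducer could look like, not the original ibarrier.c. It assumes the test simply starts an MPI_Ibarrier on MPI_COMM_WORLD and polls the request with MPI_Test, printing a counter in the same format as the output above. The variable names and the final message are my own.

```c
/* Hypothetical reconstruction, NOT the scrubbed ibarrier.c attachment. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    int flag = 0;   /* set to nonzero by MPI_Test once the barrier is done */
    int count = 0;  /* number of MPI_Test calls made so far */
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Start the nonblocking barrier; it can only complete if the
     * library makes progress on it during later MPI calls. */
    MPI_Ibarrier(MPI_COMM_WORLD, &req);

    /* Poll for completion. If MPI_Test never advances the barrier,
     * this loop prints an ever-growing counter and never exits. */
    while (!flag) {
        printf("[%d] MPI_Test: %d\n", rank, count++);
        MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
    }

    printf("[%d] barrier complete after %d tests\n", rank, count);

    MPI_Finalize();
    return 0;
}
```

Built with mpicc and launched on two ranks as in the commands above, a program like this should exit after a small number of polls on a correct implementation. Replacing the polling loop with a single MPI_Wait(&req, MPI_STATUS_IGNORE) gives the variant that does complete. If the loop spins, interrupt it with Ctrl-C.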


> I'm surprised this isn't caught by the "coll/nonblock3" test, although we
> may just be making enough progress some other way to hide the problem, or
> that test may not exercise the "test" routines themselves thoroughly enough:
> http://git.mpich.org/mpich.git/blob/refs/heads/master:/test/mpi/coll/nonblocking3.c
>
> A modest test case would be very helpful.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ibarrier.c
Type: text/x-csrc
Size: 490 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20130123/ac9292d2/attachment.bin>

