[mpich-discuss] Hanging behavior with derived types in a 'user-defined gatherv'

Jeff Hammond jeff.science at gmail.com
Fri Apr 21 17:26:31 CDT 2017


On Fri, Apr 21, 2017 at 1:31 PM, Latham, Robert J. <robl at mcs.anl.gov> wrote:
>
> On Fri, 2017-04-21 at 16:56 +0000, Sewall, Jason wrote:
> > Folks,
> >
> > I have been working on some code that does something akin to a
> > 'manual' implementation of gatherv with derived types. I've run into
> > a case where some requests never complete. I think it's a bug in
> > MPICH.
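
For readers skimming the archive: the pattern in question is roughly the
following, a minimal sketch with illustrative names and offsets, not
Jason's actual code (see the gist link below for that):

    #include <mpi.h>
    #include <stdlib.h>

    /* Minimal sketch of a 'manual gatherv': every rank Isends one block
     * described by a derived datatype, rank 0 posts a matching Irecv per
     * rank, and everyone waits.  recvoffs holds byte offsets into the
     * receive buffer, one per rank. */
    static void manual_gatherv(const void *sendbuf, MPI_Datatype sendtype,
                               void *recvbuf, const MPI_Aint *recvoffs,
                               MPI_Datatype recvtype, MPI_Comm comm)
    {
        int rank, nranks;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nranks);

        MPI_Request sreq;
        MPI_Isend(sendbuf, 1, sendtype, 0, 0, comm, &sreq);

        if (rank == 0) {
            MPI_Request *rreqs = malloc(nranks * sizeof *rreqs);
            for (int src = 0; src < nranks; src++)
                MPI_Irecv((char *)recvbuf + recvoffs[src], 1, recvtype,
                          src, 0, comm, &rreqs[src]);
            MPI_Waitall(nranks, rreqs, MPI_STATUSES_IGNORE);
            free(rreqs);
        }
        MPI_Wait(&sreq, MPI_STATUS_IGNORE);
    }
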
>
> good news, maybe?
>
> I can't reproduce this with today's MPICH.  I get some debug-logging
> warnings that you didn't free two of your types, but it doesn't hang on
> my laptop.  Those datatype-related allocations are the only valgrind
> errors I see when I run "mpiexec -np 3 ./mpi-gather 256 256".
>

I was able to reproduce it with d8e9213647e782b77a1e72fde6a7638198ac3be5 on
Mac, but only when I ran with MPIR_CVAR_NOLOCAL=1.
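
For anyone reproducing, that is Rob's command line above with the CVAR
set in the environment, i.e. something like:

    MPIR_CVAR_NOLOCAL=1 mpiexec -np 3 ./mpi-gather 256 256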

Jeff

>
> Is it possible a 256 by 256 grid could overflow an integer anywhere?
> I think master has some integer overflow fixes in the gather path that
> might not have made it into an MPICH release.  Aside from that, I'd
> have to dig through the history to figure out what might be different.
>
> ==rob
>
> >
> > My minimal code example is still ~300 lines long, so I've stuck it,
> > as well as a run script and some example output here:
> >
> > https://gist.github.com/jasonsewall/0d1fb12a93e157795786e560733cbf0c
> >
> > I'm new to this list, so if there's a better place/practice for
> > sharing code, please let me know.
> >
> > The code mimics a decomposition of a multi-field grid in 2
> > dimensions. Each grid has a halo region allocated, and this gather is
> > only interested in transmitting the interior of each grid.
> >
> > The conditions, as far as I can tell, are related to:
> >
> > 1. nranks > 2
> > 2. 'larger' messages: I haven't fully bisected it, but 256x128 grids
> > work, and 256x256 don't.
> > 3. Having halos. With the global integer halow = 2, the strides for
> > the grids (and the resulting hvector derived types) exceed the block
> > lengths, and that seems to be important; if you set halow = 0 and
> > run, it completes, independent of the other factors. (See the sketch
> > below.)
> >
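A minimal sketch of the kind of type point 3 describes, with assumed
dimensions and names (the gist builds its types differently in detail):

    #include <mpi.h>

    /* Interior of a grid of doubles allocated with a halo of width
     * halow on every side: ny rows of nx elements, with consecutive
     * rows separated by the allocated row width in bytes.  With
     * halow = 2 the byte stride exceeds the block length, which is the
     * condition flagged above.  The caller passes a pointer to the
     * first interior element as the buffer. */
    MPI_Datatype make_interior_type(int nx, int ny, int halow)
    {
        MPI_Datatype interior;
        MPI_Aint row_bytes = (MPI_Aint)(nx + 2 * halow) * sizeof(double);
        MPI_Type_create_hvector(ny, nx, row_bytes, MPI_DOUBLE, &interior);
        MPI_Type_commit(&interior);
        return interior;  /* MPI_Type_free when done, cf. Rob's leak warning */
    }
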
> > The test program uses Waitsome to figure out which requests are
> > completing, and it looks like rank 0 fails to complete a recv from
> > rank 2, despite rank 2 completing its send.
> >
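The Waitsome bookkeeping is roughly this, a sketch with made-up names
rather than the test program's actual code:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Drain n requests, reporting each index as it completes; a request
     * that never completes leaves this loop blocked in MPI_Waitsome,
     * which is how the stuck recv from rank 2 shows up. */
    static void drain(int n, MPI_Request *reqs)
    {
        int *idx = malloc(n * sizeof *idx);
        int remaining = n;
        while (remaining > 0) {
            int outcount;
            MPI_Waitsome(n, reqs, &outcount, idx, MPI_STATUSES_IGNORE);
            if (outcount == MPI_UNDEFINED)
                break;                       /* no active requests left */
            for (int i = 0; i < outcount; i++)
                printf("request %d completed\n", idx[i]);
            remaining -= outcount;
        }
        free(idx);
    }
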
> > You can fiddle with the hacky 'mask' parameter to have a subset of
> > the ranks send (line 326). -1ULL means all ranks are allowed in, but
> > you get interesting results with 6ULL, which lets just ranks 1 and 2
> > participate in sending; rank 0 still hangs on the recv from rank
> > 2. (Obviously the correctness test will fail here, since it expects
> > results from all ranks.)
> >
> > I discovered this on Intel MPI, and thanks to my colleague Jeff
> > Hammond's analysis, was able to reproduce the behavior on MPICH 3.2.
> > OpenMPI 2.1.0 successfully completes every case I have tried.
> >
> > The obvious caveat here is that there could very well be a bug in how
> > I'm using the derived datatypes, but I have tried to carefully check
> > that I'm not trashing random memory. Things seem correct.
> >
> > Please let me know what I can do to help pinpoint the problem. I'm
> > not familiar with the implementation details of any MPI library.
> >
> > Cheers,
> > Jason
> >




--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/