[mpich-discuss] Hanging behavior with derived types in a 'user-defined gatherv'

Sewall, Jason jason.sewall at intel.com
Fri Apr 21 11:56:20 CDT 2017


Folks,

I have been working on some code that does something akin to a 'manual' implementation of gatherv with derived types. I've run into a case where some requests never complete. I think it's a bug in MPICH.
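
To give a feel for the pattern without making you open the gist, it is roughly the following (a sketch with placeholder names, not the actual gist code; the real test builds per-rank buffers and datatypes, and uses Waitsome instead of Waitall so it can log progress):

    /* Sketch of the 'manual gatherv' pattern (placeholder names): every
       rank Isends one derived type describing its interior to rank 0,
       which posts one matching Irecv per rank. */
    #include <mpi.h>
    #include <stdlib.h>

    void manual_gatherv(void *sendbuf, MPI_Datatype sendtype,
                        void **recvbufs, MPI_Datatype *recvtypes,
                        MPI_Comm comm)
    {
        int rank, nranks;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nranks);

        MPI_Request *reqs = malloc((nranks + 1) * sizeof *reqs);
        int nreq = 0;
        if (rank == 0)
            for (int r = 0; r < nranks; ++r)
                MPI_Irecv(recvbufs[r], 1, recvtypes[r], r, 0, comm,
                          &reqs[nreq++]);
        MPI_Isend(sendbuf, 1, sendtype, 0, 0, comm, &reqs[nreq++]);
        MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);
        free(reqs);
    }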

My minimal code example is still ~300 lines long, so I've stuck it, along with a run script and some example output, here:

https://gist.github.com/jasonsewall/0d1fb12a93e157795786e560733cbf0c

I'm new to this list, so if there's a better place/practice for sharing code, please let me know.

The code mimics a decomposition of a multi-field grid in 2 dimensions. Each grid has a halo region allocated, and this gather is only interested in transmitting the interior of each grid. 
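
The datatype for one field's interior is built roughly like this (a sketch assuming a single field of doubles with interior nx x ny and a halo of width halow on each side; the gist does this per field and per rank):

    MPI_Datatype interior;
    MPI_Aint stride = (MPI_Aint)(nx + 2 * halow) * sizeof(double); /* full padded row */
    MPI_Type_create_hvector(ny,      /* count: interior rows           */
                            nx,      /* blocklength: interior of a row */
                            stride,  /* stride: skips over the halo    */
                            MPI_DOUBLE, &interior);
    MPI_Type_commit(&interior);
    /* with halow = 2 the stride (nx + 4 doubles) exceeds the block
       length (nx doubles); with halow = 0 they coincide */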

As far as I can tell, the conditions for the hang are:

1. nranks > 2
2. 'larger' messages: I haven't fully bisected it, but 256x128 grids work, and 256x256 don't.
3. Having halos. With the global integer halow = 2, the strides for the grids (and the resulting hvector derived types) exceed the lengths, and that seems to be important; if you set halow = 0 and run, it completes, independent of the other factors. 

The test program uses MPI_Waitsome to track which requests complete, and it looks like rank 0 never completes its recv from rank 2, even though rank 2's matching send does complete.
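
The completion loop is roughly the following (continuing the placeholder names from the first sketch; not the gist's exact logging):

    /* Sketch: drain the requests with Waitsome and report completions.
       The hang shows up as rank 0 never seeing the recv it posted for
       rank 2 complete. */
    int remaining = nreq;
    int *done = malloc(nreq * sizeof *done);
    MPI_Status *stats = malloc(nreq * sizeof *stats);
    while (remaining > 0)
    {
        int ndone;
        MPI_Waitsome(nreq, reqs, &ndone, done, stats);
        for (int i = 0; i < ndone; ++i)
            printf("rank %d: request %d completed\n", rank, done[i]);
        remaining -= ndone;
    }
    free(done);
    free(stats);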

You can fiddle with the hacky 'mask' parameter to have only a subset of the ranks send (line 326). -1ULL lets all ranks in, while 6ULL gives interesting results: only ranks 1 and 2 participate in sending, but rank 0 still hangs on the recv from rank 2. (Obviously the correctness test fails in that case, since it expects results from all ranks.)

I discovered this on Intel MPI and, thanks to my colleague Jeff Hammond's analysis, was able to reproduce the behavior on MPICH 3.2. Open MPI 2.1.0 completes every case I have tried.

The obvious caveat here is that there could very well be a bug in how I'm using the derived datatypes, but I have tried to carefully check that I'm not trashing random memory. Things seem correct.

Please let me know what I can do to help pinpoint the problem. I'm not familiar with the implementation details of any MPI libraries.

Cheers,
Jason
 