[mpich-devel] is sublinear memory usage in MPIR_Group_create (and related ops) possible?
Dave Goodell
goodell at mcs.anl.gov
Sat Feb 23 10:31:57 CST 2013
On Feb 22, 2013, at 5:47 PM CST, Jeff Hammond wrote:
> I see this error on BGQ at scale (nproc=524288).
[…]
> MPIR_Group_create(83)...: Unable to allocate 8388608 bytes of memory
> for newgroup->lrank_to_lpid (probably out of memory)
[…]
> I haven't thought about it very deeply yet, but is there a way to
> implement group operations without using O(p) memory?
That depends on the creation/access patterns you want to support and the number of hoops that you want to jump through. Jesper published a very reasonable scheme a few years ago for implementing these efficiently in both time and space:
Google Scholar page: http://scholar.google.com/scholar?cluster=13915175648622039304&hl=en&as_sdt=1,14
Slides: https://fs.hlrs.de/projects/eurompi2010/TALKS/TUESDAY_MORNING_TRACK1/CompactMPI2010_jesper_larsson_traeff.pdf
I can't quickly find an electronic copy to which I actually have access, though I think I have one in dead-tree form in my office.
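To make the space trade-off concrete: the failed allocation above works out to 8388608 / 524288 = 16 bytes per process, which is the O(p) cost of an explicit rank-to-process table. A group whose members follow a regular pattern can be described in constant space instead. What follows is only a minimal sketch in C of that general idea, not the scheme from the paper and not MPICH code; the strided_group type and the sg_* helpers are hypothetical names invented for this illustration, and the sketch assumes the members are exactly first, first+stride, first+2*stride, and so on.

```c
#include <assert.h>

/* Hypothetical compact group: member i has process id first + i*stride.
 * Storage is three ints regardless of how many processes are in the group. */
typedef struct {
    int first;   /* process id of local rank 0 */
    int stride;  /* distance between consecutive members, must be nonzero */
    int count;   /* number of members */
} strided_group;

/* Build a group description; rejects a zero stride or negative count. */
static strided_group sg_make(int first, int stride, int count)
{
    assert(stride != 0);
    assert(count >= 0);
    strided_group g = { first, stride, count };
    return g;
}

/* Local rank -> process id, or -1 if the rank is out of range. */
static int sg_rank_to_pid(const strided_group *g, int rank)
{
    if (rank < 0 || rank >= g->count)
        return -1;
    return g->first + rank * g->stride;
}

/* Process id -> local rank, or -1 if the process is not a member. */
static int sg_pid_to_rank(const strided_group *g, int pid)
{
    int off = pid - g->first;
    if (off % g->stride != 0)
        return -1;                 /* not on the stride lattice */
    int rank = off / g->stride;
    return (rank >= 0 && rank < g->count) ? rank : -1;
}
```

Both lookups are O(1) in time and space, but only for groups that really are strided. An arbitrary permutation of ranks still needs an explicit table or a list of such ranges, which is where the creation and access patterns mentioned above come in.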
The MPICH group code is pretty horrible right now. But it's not a heavily used piece of the implementation, so it's usually been hard to justify improving/fixing it.
-Dave