[mpich-devel] tackling "large datatype" change

Rob Latham robl at mcs.anl.gov
Fri Jul 12 10:28:13 CDT 2013


On Fri, Jul 12, 2013 at 03:13:37PM +0000, David Goodell (dgoodell) wrote:
> Jeff,
> 
> How could this possibly work correctly with broken (i.e., uses "int"s internally for certain calculations) datatype and communication engines?
> 
> The hard part of large count types is not constructing the actual type, it's processing that type correctly throughout the entire stack...

Exactly.  And it's not "certain" calculations.  Ints are used
pervasively.  

So you have MPIX_Type_contiguous_x(2346319872, MPI_BYTE, &bigtype)

How do you feed that to underlying MPI?

You can make a contig of 2^20 (1,048,576) MPI_BYTEs: a "megabyte" type.
Then you make a struct type of 2237 of those "megabyte" types plus
655360 leftover MPI_BYTE types.
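
Concretely, something like this sketch (not the MPICH-internal fix; it
assumes MPI-3's MPI_Count, the MPIX_Type_contiguous_x signature quoted
below, and an oldtype whose extent is one byte):

#include <mpi.h>

#define CHUNK 1048576                      /* one "megabyte" of bytes      */

/* hypothetical above-MPI helper: count may exceed INT_MAX */
int MPIX_Type_contiguous_x(MPI_Count count, MPI_Datatype oldtype,
                           MPI_Datatype *newtype)
{
    MPI_Count nchunks  = count / CHUNK;    /* 2237 for 2346319872          */
    MPI_Count leftover = count % CHUNK;    /* 655360                       */

    MPI_Datatype chunk;                    /* contig of 2^20 oldtypes      */
    MPI_Type_contiguous(CHUNK, oldtype, &chunk);

    int          blocklens[2] = { (int)nchunks, (int)leftover };
    MPI_Aint     displs[2]    = { 0, (MPI_Aint)nchunks * CHUNK };
    MPI_Datatype types[2]     = { chunk, oldtype };

    int rc = MPI_Type_create_struct(2, blocklens, displs, types, newtype);
    MPI_Type_free(&chunk);                 /* newtype keeps its reference  */
    return rc;
}

(The displacement math assumes a one-byte extent; a general version would
scale by the extent of oldtype.)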

Now what?  Oh, MPICH blows up.  (patch 1: make mpich not blow up)

OK, now you get a type with a size of -1948647424 and you go crazy
finding all the places where "this much data" is stuffed into an int.
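
(For the record, -1948647424 is just the real size, 2346319872, wrapped
around a signed 32-bit int; a trivial standalone illustration, with MPI-3's
MPI_Type_size_x / MPI_Count query as the escape hatch:)

#include <stdio.h>

int main(void)
{
    long long true_size = 2346319872LL;    /* what the type really describes */
    int       as_int    = (int)true_size;  /* wraps on typical two's-complement
                                              ints: 2346319872 - 2^32
                                              = -1948647424                   */
    printf("%lld stuffed into an int: %d\n", true_size, as_int);

    /* MPI-3's MPI_Type_size_x(bigtype, &sz) reports the size through an
       MPI_Count, so it can return the full 2346319872. */
    return 0;
}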

I think that correctly summarizes my Thursday...

Anyway, this is scut work -- necessary and welcome, but not research
by any means.  I'd hate for anyone else to duplicate work in this
area, hence my declaration of intent.

I have no idea how to partition this if someone else wanted to chip
away at the problem.

==rob

> 
> -Dave
> 
> On Jul 12, 2013, at 9:57 AM, Jeff Hammond <jhammond at alcf.anl.gov> wrote:
> 
> > Hi Rob,
> > 
> > I started working on something to address this a few weeks ago.  I was going to write a portable (i.e. above MPI) implementation of 
> > 
> > MPIX_Type_contiguous_x(MPI_Count count, MPI_Datatype old_type, MPI_Datatype *newtype)
> > 
> > and then reimplement it efficiently in MPICH.  I will also try to get JeffS or DaveG to create it for OtherMPI so e.g. PETSc can use it more broadly.
> > 
> > Does this sound like an okay plan to you?  I've not pushed the latest code to Github but when I do, I'll send you the link.
> > 
> > Best,
> > 
> > Jeff
> > 
> > 
> > ----- Original Message -----
> >> From: "Rob Latham" <robl at mcs.anl.gov>
> >> To: devel at mpich.org
> >> Sent: Friday, July 12, 2013 6:25:14 AM
> >> Subject: [mpich-devel] tackling "large datatype" change
> >> 
> >> I've been trying to tackle tt #1742, #1890, and #1893 as part of some
> >> I/O work that uses large datatypes.
> >> 
> >> Am I stepping on anyone's toes here?  Pavan, I know I bugged you
> >> about these tickets.  Hope I didn't waste my Thursday on this...
> >> 
> >> Pavan told me there are two problems here, but I think there is
> >> really one solution to both:
> >> 
> >> - large datatype means a datatype that describes more than 2 gigs of
> >>  data.  A million contigs of a million contigs, say
> >> 
> >> - large count means datatypes that use MPI_Count to say how many
> >>  elements they have:  a contig of 3 billion MPI_BYTES, say
> >> 
> >> Internally, though, there's no way to decouple those two problems.
> >> 
> >> The changes are pervasive and make me really nervous.
> >> 
> >> I've been sometimes pushing changes to the ticket-1742-bigio branch.
> >> 
> >> Who can review datatype changes once I'm done?
> >> 
> >> ==rob
> >> 
> >> --
> >> Rob Latham
> >> Mathematics and Computer Science Division
> >> Argonne National Lab, IL USA
> >> 
> > 
> > -- 
> > Jeff Hammond
> > Argonne Leadership Computing Facility
> > University of Chicago Computation Institute
> > jhammond at alcf.anl.gov / (630) 252-5381
> > http://www.linkedin.com/in/jeffhammond
> > https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
> > ALCF docs: http://www.alcf.anl.gov/user-guides
> 

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
