[mpich-devel] tackling "large datatype" change
Rob Latham
robl at mcs.anl.gov
Fri Jul 12 10:31:15 CDT 2013
On Fri, Jul 12, 2013 at 10:24:48AM -0500, Pavan Balaji wrote:
> Rob,
>
> Sorry, I'll look into this next week.
No need to apologize. I've been there -- twice.
> Amount of time left is proportional to age of baby.
>
> The first solution works for some cases in MPICH, but there are
> still many bugs. I think your ticket is to fix this part?
>
> The second solution is known not to work.
OK, I don't think there's any way to decouple these two solutions,
though.
There are lots of places in the datatype/dataloop code where
optimizations are applied: "oh, these two contig types are next to
each other. Let's combine them into a single contig type... with a
count that overflows."
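
To make the overflow concrete, here is a minimal sketch in plain C of the
kind of check such a merge would need. It is not MPICH code:
merge_contig_counts is a hypothetical helper, and it assumes the block
counts are stored as int, which is the case that goes wrong.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not an MPICH function.
 * Tries to merge two adjacent contiguous blocks whose element counts
 * are stored as int. Returns true and writes the combined count to
 * *merged only when the sum is representable as an int. */
bool merge_contig_counts(int count_a, int count_b, int *merged)
{
    assert(count_a >= 0 && count_b >= 0);

    /* Test before adding: signed overflow is undefined behavior in C,
     * so the check cannot be done on the already-computed sum. */
    if (count_a > INT_MAX - count_b)
        return false;   /* would overflow: leave the blocks separate */

    *merged = count_a + count_b;
    return true;
}
```

A caller would skip the optimization (or fall back to a wider count type)
whenever the helper returns false, instead of silently wrapping.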
==rob
> -- Pavan
>
> On 07/12/2013 06:25 AM, Rob Latham wrote:
> >I've been trying to tackle tt #1742, #1890, and #1893 as part of some
> >I/O work that uses large datatypes.
> >
> >Am I stepping on anyone's toes here? Pavan, I know I bugged you about these
> >tickets. Hope I didn't waste my Thursday on this...
> >
> >Pavan told me there are two problems here, but I think there is
> >really one solution to both:
> >
> >- large datatype means a datatype that describes more than 2 gigs of
> > data. A million contigs of a million contigs, say
> >
> >- large count means datatypes that use MPI_Count to say how many
> > elements they have: a contig of 3 billion MPI_BYTE elements, say
> >
> >Internally, though, there's no way to decouple those two problems.
> >
> >The changes are pervasive and make me really nervous.
> >
> >I've been pushing changes to the ticket-1742-bigio branch from time to time.
> >
> >Who can review datatype changes once I'm done?
> >
> >==rob
> >
>
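
For reference, the two cases in the quoted message can be put in numbers
with a small plain C sketch. The helper names are hypothetical, nothing
here is an MPICH or MPI call, and a 32-bit int is assumed. Both examples
end up past INT_MAX, one through the total size and one through the count
itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: total bytes described by `outer` copies of a
 * contiguous block of `inner` bytes, computed in 64 bits. */
int64_t nested_contig_bytes(int64_t outer, int64_t inner)
{
    assert(outer >= 0 && inner >= 0);
    /* Guard the 64-bit multiply itself against overflow. */
    assert(inner == 0 || outer <= INT64_MAX / inner);
    return outer * inner;
}

/* Hypothetical helper: can this value be stored in a plain int? */
bool fits_in_int(int64_t value)
{
    return value >= INT_MIN && value <= INT_MAX;
}
```

With these, fits_in_int(nested_contig_bytes(1000000, 1000000)) is false
(the "large datatype" case: each count fits in an int, the total size does
not), and fits_in_int(3000000000) is also false (the "large count" case:
the count alone does not fit).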
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA