[mpich-devel] tackling "large datatype" change

David Goodell (dgoodell) dgoodell at cisco.com
Fri Jul 12 10:13:37 CDT 2013


Jeff,

How could this possibly work correctly with broken datatype and communication engines (i.e., ones that use "int"s internally for certain calculations)?

The hard part of large count types is not constructing the actual type, it's processing that type correctly throughout the entire stack...
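
To make that concern concrete, here is a minimal sketch in plain C (no MPI required). The helper name total_fits_in_int is hypothetical and is not an MPICH function; it only shows the kind of check that any internal "int" byte-count calculation would need before it could be trusted with the two examples from Rob's mail (a million contigs of a million contigs, and 3 billion MPI_BYTEs).

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not part of MPICH): returns 1 if the total size
 * count * extent_bytes can be represented in a plain int, 0 otherwise.
 * The test uses division so the product itself is never computed and
 * therefore cannot overflow. */
int total_fits_in_int(int64_t count, int64_t extent_bytes)
{
    if (count < 0 || extent_bytes < 0)
        return 0;               /* negative sizes are treated as invalid */
    if (count == 0 || extent_bytes == 0)
        return 1;               /* an empty type trivially fits */
    return count <= (int64_t)INT_MAX / extent_bytes;
}
```

Assuming the usual 32-bit int, total_fits_in_int(1000000, 1000000) and total_fits_in_int(3000000000LL, 1) both report that the size does not fit, which is exactly where code that stores the size in an "int" goes wrong no matter how the type was constructed.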

-Dave

On Jul 12, 2013, at 9:57 AM, Jeff Hammond <jhammond at alcf.anl.gov> wrote:

> Hi Rob,
> 
> I started working on something to address this a few weeks ago.  I was going to write a portable (i.e. above MPI) implementation of 
> 
> MPIX_Type_contiguous_x(MPI_Count count, MPI_Datatype old_type, MPI_Datatype *newtype)
> 
> and then reimplement it efficiently in MPICH.  I will also try to get JeffS or DaveG to create it for OtherMPI so e.g. PETSc can use it more broadly.
> 
> Does this sound like an okay plan to you?  I've not pushed the latest code to Github but when I do, I'll send you the link.
> 
> Best,
> 
> Jeff
> 
> 
> ----- Original Message -----
>> From: "Rob Latham" <robl at mcs.anl.gov>
>> To: devel at mpich.org
>> Sent: Friday, July 12, 2013 6:25:14 AM
>> Subject: [mpich-devel] tackling "large datatype" change
>> 
>> I've been trying to tackle tt #1742, #1890, and #1893 as part of some
>> I/O work that uses large datatypes.
>> 
>> Am I stepping on anyone's toes here?  Pavan, I know I bugged you
>> about these tickets.  Hope I didn't waste my Thursday on this...
>> 
>> Pavan told me there are two problems here, but I think there is
>> really one solution to both:
>> 
>> - large datatype means a datatype that describes more than 2 gigs of
>>  data.  A million contigs of a million contigs, say
>> 
>> - large count means datatypes that use MPI_Count to say how many
>>  elements they have:  a contig of 3 billion MPI_BYTEs, say
>> 
>> Internally, though, there's no way to decouple those two problems.
>> 
>> The changes are pervasive and make me really nervous.
>> 
>> I've been pushing changes to the ticket-1742-bigio branch from time to time.
>> 
>> Who can review datatype changes once I'm done?
>> 
>> ==rob
>> 
>> --
>> Rob Latham
>> Mathematics and Computer Science Division
>> Argonne National Lab, IL USA
>> 
> 
> -- 
> Jeff Hammond
> Argonne Leadership Computing Facility
> University of Chicago Computation Institute
> jhammond at alcf.anl.gov / (630) 252-5381
> http://www.linkedin.com/in/jeffhammond
> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
> ALCF docs: http://www.alcf.anl.gov/user-guides
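
For reference, one way a portable (above MPI) MPIX_Type_contiguous_x could handle a count larger than INT_MAX is to split it into full INT_MAX-sized chunks plus a remainder, and then describe each part with ordinary int-count type constructors. The thread does not say how Jeff's implementation works, so the following is only a sketch of that arithmetic under stated assumptions: int64_t stands in for MPI_Count, and count_split and split_count are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical result type: count == chunks * INT_MAX + remainder. */
typedef struct {
    int64_t chunks;     /* number of full INT_MAX-sized pieces */
    int     remainder;  /* leftover elements, always < INT_MAX */
} count_split;

/* Split a large element count (int64_t stands in for MPI_Count here)
 * into pieces that each fit the int arguments of the classic
 * type constructors. */
count_split split_count(int64_t count)
{
    count_split s;
    assert(count >= 0);
    s.chunks    = count / INT_MAX;
    s.remainder = (int)(count % INT_MAX);
    return s;
}
```

A real implementation would still have to check that the chunk count itself fits in an int before passing it to a constructor, and, as noted at the top of this message, the resulting type is only useful if the rest of the stack processes it without falling back to "int".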


