[mpich-discuss] From file reading to memory sharing

Dorier, Matthieu mdorier at anl.gov
Wed Aug 12 12:36:16 CDT 2015


Hi,

I'm guessing the cause of the performance issue is the large number of small requests, because when I read the full 7GB using a single process issuing a single MPI_File_read, performance is much better.

Matthieu
________________________________________
From: Wei-keng Liao [wkliao at eecs.northwestern.edu]
Sent: Wednesday, August 12, 2015 11:56 AM
To: discuss at mpich.org
Subject: Re: [mpich-discuss] From file reading to memory sharing

Hi, Matthieu

If you have "a series of subdomains (blocks) to load from the file",
one way to get better I/O performance is to concatenate the filetypes
into a single one and then use only one MPI_File_set_view and one
MPI_File_read_all to read the data.

FYI, this request aggregation strategy is also used in PnetCDF.
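
As a rough illustration of the aggregation idea, the sketch below flattens several 3D blocks into one list of byte runs sorted by file offset. It is plain C with no MPI calls; the names block3d, file_run and flatten_blocks are hypothetical and not part of MPI or PnetCDF. The assumption is that the resulting offsets and lengths could then be copied into the displacement and blocklength arrays of MPI_Type_create_hindexed (with MPI_BYTE as the old type) to build a single filetype, so that one MPI_File_set_view and one MPI_File_read_all cover all blocks.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of one block of a row-major 3D float array. */
typedef struct {
    int start[3];  /* first index of the block in each dimension */
    int size[3];   /* number of elements of the block in each dimension */
} block3d;

/* One contiguous run of bytes in the file. */
typedef struct {
    long long offset;  /* byte offset from the start of the array */
    int length;        /* run length in bytes */
} file_run;

static int cmp_run(const void *a, const void *b)
{
    long long x = ((const file_run *)a)->offset;
    long long y = ((const file_run *)b)->offset;
    return (x > y) - (x < y);
}

/* Flatten nblocks blocks of an array with extents dims[3] into contiguous
 * runs sorted by increasing offset. Each run is one row of a block along
 * the fastest-varying dimension. Returns the number of runs and stores a
 * malloc'ed array in *out (the caller frees it). Sorting matters because
 * a file view requires nondecreasing displacements in the filetype. */
static size_t flatten_blocks(const int dims[3], const block3d *blocks,
                             size_t nblocks, file_run **out)
{
    size_t n = 0, k = 0;
    for (size_t b = 0; b < nblocks; b++)
        n += (size_t)blocks[b].size[0] * (size_t)blocks[b].size[1];

    file_run *runs = malloc(n * sizeof *runs);
    if (runs == NULL) {
        *out = NULL;
        return 0;
    }
    for (size_t b = 0; b < nblocks; b++) {
        const block3d *blk = &blocks[b];
        for (int i = 0; i < blk->size[0]; i++) {
            for (int j = 0; j < blk->size[1]; j++) {
                long long elem =
                    ((long long)(blk->start[0] + i) * dims[1]
                     + (blk->start[1] + j)) * dims[2] + blk->start[2];
                runs[k].offset = elem * (long long)sizeof(float);
                runs[k].length = blk->size[2] * (int)sizeof(float);
                k++;
            }
        }
    }
    qsort(runs, n, sizeof *runs, cmp_run);
    *out = runs;
    return n;
}
```

Note that after sorting, the data read through such a view arrives in file-offset order rather than block by block, so the caller would still need to scatter it into per-block buffers (or describe that layout with a memory datatype).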

Wei-keng

On Aug 12, 2015, at 11:27 AM, Dorier, Matthieu wrote:

> Hi,
>
> I'm trying to refactor an MPI code to use MPI one-sided communication.
>
> The initial version of the code reads its data from a file containing a 3D array of floats. Each process has a series of subdomains (blocks) to load from the file, so they all open the file and then issue a series of MPI_File_set_view and MPI_File_read. The type passed to MPI_File_set_view is constructed using MPI_Type_create_subarray to match the block that needs to be loaded.
>
> This code performs very poorly even at small scale: the file is 7GB but the blocks are only a few hundred bytes each, and each process has many blocks to load.
>
> Instead, I would like to have process rank 0 load the entire file, then expose it over RMA. I'm not familiar at all with MPI one-sided operations, as I have never used them before, but I guess there should be a simple way to reuse the subarray datatype of my MPI_File_set_view in the context of an MPI_Get. I'm just not sure what the arguments of this MPI_Get would be. My guess is: origin_count would be the number of floats in a single block, origin_datatype would be MPI_FLOAT, target_rank = 0, target_disp = 0, target_count = 1, target_datatype = my subarray datatype. Would that be correct?
>
> Thanks,
>
> Matthieu
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
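
To make the MPI_Get arguments guessed in the quoted message concrete, here is a small plain-C sketch of what a subarray target datatype selects. It contains no MPI calls, and gather_subarray is a hypothetical helper name. The assumptions are that the window on rank 0 exposes the whole row-major array starting at its base, that the subarray type was created with MPI_ORDER_C over MPI_FLOAT and committed, and that the call is issued inside an access epoch. Under those assumptions, the guessed arguments describe the same copy as the loop below, with origin_count equal to the product of the block extents.

```c
#include <stddef.h>

/* Copy the block with extents sub[3] starting at start[3] out of a
 * row-major array with extents dims[3] into the contiguous buffer dst.
 * Returns the number of floats copied, i.e. sub[0] * sub[1] * sub[2].
 *
 * This mimics, locally, the data movement that a call such as
 *   MPI_Get(dst, n, MPI_FLOAT, 0, 0, 1, subarray_type, win);
 * would be expected to perform, where n is the returned count and
 * subarray_type (an assumed name) describes the same dims/sub/start. */
static size_t gather_subarray(const float *base, const int dims[3],
                              const int sub[3], const int start[3],
                              float *dst)
{
    size_t k = 0;
    for (int i = 0; i < sub[0]; i++) {
        for (int j = 0; j < sub[1]; j++) {
            for (int l = 0; l < sub[2]; l++) {
                size_t idx =
                    ((size_t)(start[0] + i) * (size_t)dims[1]
                     + (size_t)(start[1] + j)) * (size_t)dims[2]
                    + (size_t)(start[2] + l);
                dst[k++] = base[idx];
            }
        }
    }
    return k;
}
```

The target_disp of 0 fits here because the subarray type already encodes the block's position within the full array; the origin side only needs a contiguous buffer large enough for one block.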

_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss