[mpich-discuss] From file reading to memory sharing

Zhao, Xin xinzhao3 at illinois.edu
Wed Aug 12 14:34:56 CDT 2015


Hi Matthieu,

If you are referring to the requests used by RMA operations, there is a CVAR (MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) in MPICH that controls the number of active RMA requests. Basically, once the number of active RMA requests reaches that value, the next RMA operation blocks until the number of active requests drops back below it.

If you want to tune that CVAR, I suggest using the latest MPICH (http://www.mpich.org/static/downloads/nightly/master/mpich/), because we recently modified its default value.
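
(For context: MPICH CVARs can be set through environment variables of the same name, so a launch along the lines of MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD=65536 mpiexec -n 16 ./app adjusts the threshold. The value shown here is purely illustrative, not a recommended setting.)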

Thanks,
Xin

________________________________________
From: Dorier, Matthieu [mdorier at anl.gov]
Sent: Wednesday, August 12, 2015 12:36 PM
To: discuss at mpich.org
Subject: Re: [mpich-discuss] From file reading to memory sharing

Hi,

I'm guessing the cause of the performance issue is the large number of small requests, because when I read the full 7GB with a single process issuing a single MPI_File_read, performance is much better.

Matthieu
________________________________________
From: Wei-keng Liao [wkliao at eecs.northwestern.edu]
Sent: Wednesday, August 12, 2015 11:56 AM
To: discuss at mpich.org
Subject: Re: [mpich-discuss] From file reading to memory sharing

Hi, Matthieu

If you have "a series of subdomains (blocks) to load from the file",
one way to get better I/O performance is to concatenate the filetypes
into a single one and then use only one MPI_File_set_view and one
MPI_File_read_all to read the data.

FYI. This request aggregation strategy is also used in PnetCDF.

Wei-keng
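
For illustration, a minimal sketch of this aggregation strategy (not from the original messages; the function name, the block-description arrays, and the buffer are placeholders). It assumes the blocks are disjoint and already sorted by increasing file offset, since a filetype passed to MPI_File_set_view must have monotonically non-decreasing displacements:

#include <mpi.h>
#include <stdlib.h>

/* Read all of this rank's blocks from a shared 3D float-array file with
 * one collective call.  gsizes is the global array shape, subsizes[i] and
 * starts[i] describe block i, and buf must hold total_floats elements
 * (the sum of all block sizes). */
void read_blocks_aggregated(MPI_Comm comm, const char *fname, int nblocks,
                            int gsizes[3], int (*subsizes)[3],
                            int (*starts)[3], float *buf, int total_floats)
{
    MPI_Datatype *blocktypes = malloc(nblocks * sizeof(MPI_Datatype));
    int          *blocklens  = malloc(nblocks * sizeof(int));
    MPI_Aint     *disps      = malloc(nblocks * sizeof(MPI_Aint));

    for (int i = 0; i < nblocks; i++) {
        /* each subarray type already encodes the block's offset in the file */
        MPI_Type_create_subarray(3, gsizes, subsizes[i], starts[i],
                                 MPI_ORDER_C, MPI_FLOAT, &blocktypes[i]);
        blocklens[i] = 1;
        disps[i]     = 0;
    }

    /* concatenate all block filetypes into a single filetype */
    MPI_Datatype filetype;
    MPI_Type_create_struct(nblocks, blocklens, disps, blocktypes, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File fh;
    MPI_File_open(comm, fname, MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_FLOAT, filetype, "native", MPI_INFO_NULL);
    MPI_File_read_all(fh, buf, total_floats, MPI_FLOAT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    for (int i = 0; i < nblocks; i++)
        MPI_Type_free(&blocktypes[i]);
    MPI_Type_free(&filetype);
    free(blocktypes); free(blocklens); free(disps);
}

If the blocks are not already ordered by increasing file offset, they would need to be sorted first (or flattened into sorted offset/length lists, which is roughly what PnetCDF's aggregation does internally) before building the combined filetype.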

On Aug 12, 2015, at 11:27 AM, Dorier, Matthieu wrote:

> Hi,
>
> I'm trying to refactor an MPI code using MPI one-sided communications.
>
> The initial version of the code reads its data from a file containing a 3D array of floats. Each process has a series of subdomains (blocks) to load from the file, so they all open the file and then issue a series of MPI_File_set_view and MPI_File_read. The type passed to MPI_File_set_view is constructed using MPI_Type_create_subarray to match the block that needs to be loaded.
>
> This code performs very poorly even at small scale: the file is 7GB but the blocks are a few hundred bytes each, and each process has many blocks to load.
>
> Instead, I would like to have process rank 0 load the entire file, then expose it over RMA. I'm not familiar at all with MPI one-sided operations, since I never used them before, but I guess there should be a simple way to reuse the subarray datatype of my MPI_File_set_view and use it in the context of an MPI_Get. I'm just not sure what the arguments of this MPI_Get would be. My guess is: origin_count would be the number of floats in a single block, origin_datatype would be MPI_FLOAT, target_rank = 0, target_disp = 0, target_count = 1, target_datatype = my subarray datatype. Would that be correct?
>
> Thanks,
>
> Matthieu
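
For what it's worth, a minimal sketch of the RMA approach asked about above (not from the thread; the window setup, variable names, and synchronization choice are assumptions). Rank 0 exposes the array it read from the file in a window, and each rank pulls one block with MPI_Get using the subarray type on the target side, with the arguments laid out as guessed: origin_count = number of floats in the block, origin_datatype = MPI_FLOAT, target_rank = 0, target_disp = 0, target_count = 1, target_datatype = the subarray type.

#include <mpi.h>

/* Rank 0 holds the full 3D float array in full_array (total_floats
 * elements); every rank fetches one block described by subsizes/starts.
 * block_buf must hold subsizes[0]*subsizes[1]*subsizes[2] floats. */
void get_block(float *full_array, MPI_Aint total_floats,
               int gsizes[3], int subsizes[3], int starts[3],
               float *block_buf, MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);

    /* collective: rank 0 exposes its copy of the array, others expose nothing */
    MPI_Win win;
    MPI_Win_create(rank == 0 ? full_array : NULL,
                   rank == 0 ? total_floats * (MPI_Aint)sizeof(float) : 0,
                   sizeof(float), MPI_INFO_NULL, comm, &win);

    /* target-side datatype: where the block sits inside the full array */
    MPI_Datatype subarray;
    MPI_Type_create_subarray(3, gsizes, subsizes, starts,
                             MPI_ORDER_C, MPI_FLOAT, &subarray);
    MPI_Type_commit(&subarray);

    int nfloats = subsizes[0] * subsizes[1] * subsizes[2];

    MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
    MPI_Get(block_buf, nfloats, MPI_FLOAT,   /* origin: contiguous floats */
            0, 0, 1, subarray,               /* target: rank 0, one block */
            win);
    MPI_Win_unlock(0, win);                  /* completes the get locally */

    MPI_Type_free(&subarray);
    MPI_Win_free(&win);                      /* collective */
}

When many blocks are fetched, the window (and even the lock epoch) can of course be created once and reused for all the gets instead of being set up per block as in this sketch.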

_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss


More information about the discuss mailing list