[mpich-devel] [mpich2-dev] And yet another ROMIO performance question

Rob Latham robl at mcs.anl.gov
Thu Oct 17 10:32:40 CDT 2013

On Wed, Sep 16, 2009 at 02:57:45PM -0500, Bob Cernohous wrote:
> Same customer noticed that the file system was doing much more I/O
> than MPI-IO on collective reads.

Hi Bob.  Did you ever get an acceptable answer to this four-year-old
question?

Have you ever had anyone respond to a four-year-old message?  Let's see
how well Lotus Notes handles /that/.

Sorry it took me so long to give you an answer.  By the time I figured
out what was going on, I had forgotten you needed an answer.  In my
meager defense, my daughter was 6 weeks old back then....

What happens is that two-phase collective I/O implements data sieving
within each file domain.  Normally, this is good.  On Blue Gene, we
turn on two-phase all the time (the hint is set to "enable", not
"automatic"), even when accesses are not overlapping.

So, consider a file domain where only the first and last bytes are
accessed (e.g. a Parallel-NetCDF application reading record variables
from a file).  ROMIO will issue one read for the entire file domain.

Writes are even worse: ROMIO will read in the entire file domain,
modify the target regions, and write the whole thing back out.

I'm not sure of the best way to fix this.  Probably we should stick in
a few peephole optimizations: for example, if there's only one access
in a file domain, just go ahead and issue that one access directly.

A "real" fix would have to build up a basic performance model so we
could answer the question of where the tradeoff of "one big request
with extra wasted data" beats "many small requests, but no wasted
data".


Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
