[mpich-discuss] reading from a read only directory

Geoffrey Irving irving at naml.us
Tue Dec 3 14:58:53 CST 2013

On Tue, Dec 3, 2013 at 8:20 AM, Rob Latham <robl at mcs.anl.gov> wrote:
> On Sat, Nov 30, 2013 at 08:35:39AM -0600, Jed Brown wrote:
>> Geoffrey Irving <irving at naml.us> writes:
>> > It's not an issue for me personally anymore: chmod +w . is fine, and
>> > symlinks would also solve the problem.  The question was more about
>> > the motivation for the code and whether it's an issue anyone thinks
>> > is worth fixing (or fixable at all).  That and Jed asked me several
>> > times to start the thread. :)
>> It's not intuitive and does not appear to be documented.  Reading from a
>> read-only directory is a reasonable thing to try and people should not
>> have to wait through a queue to find out that it doesn't work.
> Mea culpa.  MPI_File_read_ordered is implemented in "stupid mode"
> right now, and no one ever complained -- we don't pay much attention
> to shared file pointers.  Tell me more about how you are relying on
> ordered mode, please.

I'm dividing a large file into nearly equal size chunks and slurping
the entire file into RAM.  Then I do a bunch of rearrangement in
memory.  I could just as easily use MPI_File_read_at_all, but had
assumed that ordered was better because my known intervals happen to
be ordered and contiguous.  Should I be using MPI_File_read_at_all
instead, and treating the ordered routine as vestigial?
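
For concreteness, here is a minimal sketch of the chunking I mean,
written as plain C so it compiles without MPI. The names chunk_t and
chunk_for_rank are made up for illustration and are not part of MPICH;
the MPI call appears only in a comment, and fh, buf, rank, nranks and
file_size are assumed to be set up elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from MPICH): split `total` bytes across
 * `nranks` ranks so that chunk sizes differ by at most one byte.
 * The first (total % nranks) ranks each get one extra byte. */
typedef struct {
    int64_t offset; /* byte offset of this rank's chunk in the file */
    int64_t count;  /* number of bytes in this rank's chunk */
} chunk_t;

static chunk_t chunk_for_rank(int64_t total, int rank, int nranks)
{
    assert(nranks > 0 && rank >= 0 && rank < nranks && total >= 0);
    int64_t base = total / nranks;  /* size every rank gets at least */
    int64_t extra = total % nranks; /* leftover bytes to spread out */
    chunk_t c;
    c.count = base + (rank < extra ? 1 : 0);
    /* Ranks before this one contribute `base` bytes each, plus one
     * extra byte for each of them that is among the first `extra`. */
    c.offset = rank * base + (rank < extra ? rank : extra);
    return c;
}

/* With MPI, each rank would then issue one collective read at its own
 * explicit offset, with no shared file pointer involved:
 *
 *   chunk_t c = chunk_for_rank(file_size, rank, nranks);
 *   MPI_File_read_at_all(fh, (MPI_Offset)c.offset, buf, (int)c.count,
 *                        MPI_BYTE, MPI_STATUS_IGNORE);
 *
 * The count argument is an int, so chunks larger than INT_MAX bytes
 * would need a bigger datatype or several reads.
 */
```

The chunks are contiguous and in rank order, which is the same layout
the ordered routine would produce, just with the offsets computed
locally.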

> The technical issues here: the shared file pointer and ordered mode
> collectives keep as global state the location of the shared file
> pointer.  Where does it keep this state?  Why, in the file system of
> course.  Because using a file system as a message-passing medium is
> exactly what file systems are good at.... uh, well, maybe not, but
> it's the most portable option.
> You can also do this with RMA windows (Rob Ross and I published some
> papers on this topic back in 05 and 06).  Has a few drawbacks but it's
> got to be better than the file system.
> Even without the RMA windows, we can still do a better job for ordered
> mode with MPI_SCAN -- it's just we need some globally-available place
> to get the old shared file pointer value and stash the new one.
> In user-space, you'll outperform MPI-IO implementations if you manage
> the offset yourself.

Okay, I was basically assuming that if I used a collective routine, it
would be smart about accessing the file system as few times as
possible.  If this isn't true I should unqueue my job now and switch
to MPI_File_read_at_all.  Is this right?  Should I also use
MPI_File_write_at_all instead of the shared file pointer collective
write, MPI_File_write_ordered?
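
If the per-rank sizes are not known in advance, my understanding of
"manage the offset yourself" is an exclusive prefix sum over the
per-rank byte counts. The sketch below does that sum serially in plain
C so it can be checked without MPI; exclusive_offsets is a hypothetical
name, and the MPI_Exscan version in the comment assumes fh, buf, rank,
comm, start and my_count already exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from MPICH): given the number of bytes each
 * rank wants to write, compute where each rank's data begins, starting
 * at file offset `start`. Returns the offset just past the last rank's
 * data, which is where the next ordered operation would start. */
static int64_t exclusive_offsets(const int64_t *counts, size_t nranks,
                                 int64_t start, int64_t *offsets)
{
    assert(counts != NULL && offsets != NULL);
    int64_t running = start;
    for (size_t r = 0; r < nranks; r++) {
        assert(counts[r] >= 0);
        offsets[r] = running; /* sum of counts of all lower ranks */
        running += counts[r];
    }
    return running;
}

/* In a real MPI program each rank would get its own entry of this sum
 * from a single collective instead of a loop:
 *
 *   long long mine = my_count, before = 0;
 *   MPI_Exscan(&mine, &before, 1, MPI_LONG_LONG, MPI_SUM, comm);
 *   if (rank == 0) before = 0; // recvbuf is undefined on rank 0
 *   MPI_File_write_at_all(fh, (MPI_Offset)(start + before), buf,
 *                         (int)my_count, MPI_BYTE, MPI_STATUS_IGNORE);
 */
```

Nothing here touches the file system to track the pointer; the only
shared state is the value of start, which every rank can advance by
the total after the call.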

> I wrote a little ordered mode shim you can link into your program.
> Looks like I last seriously did anything with it back in 2008?  I
> cleaned up some of the warnings and made it use MPI_Count (remove
> HAVE_MPI_TYPE_SIZE_X from the makefile if you're on an old system).
> Caveat: I think the last time I ran this code was on Argonne's Blue
> Gene/L.  Bear in mind the adage about tested code and broken code.

Thanks, though in my case switching away from the shared file pointer
routine is easy, so I don't need the shim.

