[mpich-discuss] Need perspective on use of atomic I/O

Latham, Robert J. robl at mcs.anl.gov
Tue May 2 16:36:20 CDT 2017

On Sun, 2017-04-30 at 17:16 -0700, Ted Dunning wrote:
> I am trying to assess how well a high-performance file system that I
> work on will support MPI programming for some customers we have. In
> particular, I am looking at MPI I/O.
> The technical aspects of the assessment are pretty simple. Collective
> and direct I/O should work great, but the locking apparently required
> for atomic mode I/O will be very, very expensive.
> That leads to the community questions of how important these trade-
> offs will be. And thus, I land here on this list.
> I have a few questions that folk here probably can answer off the top
> of their head:
> 1) how often is atomic mode I/O used? It seems that most I/O will
> involve non-overlapping writes and thus atomic mode will be
> irrelevant. But what is the real sense from people who actually write
> this kind of code?

Atomic mode is rarely used in practice, and sometimes the underlying
file system does not even support it. Where it is needed, applications
are in a better position to know when to synchronize and coordinate.

> 2) do commonly used MPI libraries implement atomic mode internally as
> in [1], or do they require file system support? What kind of support?

They rely on the file system, which needs to support fcntl() locks.
Lustre, for example, only supports fcntl() locks with a special mount
option.
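
For anyone unfamiliar with what "fcntl() locks" asks of a file system,
here is a minimal sketch of a write protected by a POSIX byte-range
lock. It is only an illustration of the primitive, not the code any MPI
library actually uses; the helper name locked_pwrite is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* expose pwrite() and struct flock under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any MPI library): write len bytes at
 * offset while holding an exclusive fcntl() lock on exactly the byte
 * range [offset, offset + len). Returns 0 on success, -1 on error.
 */
static int locked_pwrite(int fd, const void *buf, size_t len, off_t offset)
{
    /* l_len == 0 would mean "lock to end of file", so require len > 0. */
    assert(len > 0);

    struct flock lk;
    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;     /* exclusive (write) lock */
    lk.l_whence = SEEK_SET;  /* l_start is an absolute file offset */
    lk.l_start = offset;
    lk.l_len = (off_t)len;

    /* F_SETLKW blocks until the range can be locked. */
    if (fcntl(fd, F_SETLKW, &lk) == -1)
        return -1;

    ssize_t n = pwrite(fd, buf, len, offset);
    int rc = (n == (ssize_t)len) ? 0 : -1;

    /* Release the same range, whether or not the write succeeded. */
    lk.l_type = F_UNLCK;
    if (fcntl(fd, F_SETLK, &lk) == -1)
        rc = -1;

    return rc;
}
```

A caller would open the file with O_RDWR and call, for example,
locked_pwrite(fd, data, nbytes, off). On a file system that does not
implement these locks, the fcntl() call is where it fails, and on a
distributed file system each such call typically costs at least a round
trip to a lock manager.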

> 3) how do file-system people usually go wrong in approaching MPI
> applications?

A lot of parallel file systems assume dumb clients, but MPI clients
have a lot more information: they know, via the MPI communicator,
which processes are participating in an I/O operation. They also
follow looser consistency semantics: whereas POSIX demands that bytes
written be visible immediately, MPI-IO only demands that they be
visible at synchronization points.
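
To make the POSIX half of that contrast concrete, here is a small
sketch: data written through one descriptor is read back through a
second, independently opened descriptor with no sync call in between.
The helper name posix_write_is_visible is hypothetical, and the result
assumes a POSIX-conforming local file system. In MPI-IO's default
(non-atomic) mode, the comparable guarantee between two processes needs
a synchronization step first, such as the MPI_File_sync / MPI_Barrier /
MPI_File_sync sequence.

```c
#define _POSIX_C_SOURCE 200809L  /* expose pread() under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write msg through one descriptor, then read it
 * back through a second descriptor opened separately on the same path.
 * No fsync() or close() happens between the write and the read.
 * Returns 1 if the reader sees the bytes, 0 if it does not, -1 on error.
 */
static int posix_write_is_visible(const char *path, const char *msg)
{
    char buf[256];
    size_t len = strlen(msg);
    assert(len > 0 && len <= sizeof buf);

    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd == -1)
        return -1;

    int rfd = open(path, O_RDONLY);
    if (rfd == -1) {
        close(wfd);
        return -1;
    }

    int result = -1;
    if (write(wfd, msg, len) == (ssize_t)len) {
        /* Read immediately after write() returns. */
        ssize_t n = pread(rfd, buf, len, 0);
        if (n >= 0)
            result = (n == (ssize_t)len && memcmp(buf, msg, len) == 0);
    }

    close(rfd);
    close(wfd);
    return result;
}
```

A file system that wants to offer this behavior to every client has to
keep all of their caches coherent at all times, which is the cost the
looser MPI-IO semantics let an implementation avoid.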

POSIX is everywhere, of course, so one should provide a POSIX interface.
Providing that interface on top of a more parallel-I/O-friendly
interface is a lot easier than going the other way around.

> [1] http://www.mcs.anl.gov/~thakur/papers/atomic-mode.pdf

Hey, I worked on that! You have really done your homework. While we
still think it's a good approach, we've never deployed it, nor do I
think anyone else has. Furthermore, a lot of the cleverness is no
longer needed now that MPI-3 RMA supports fetch-and-increment.
