[mpich-discuss] very slow file writes independent of file size

Geoffrey Irving irving at naml.us
Mon Mar 3 00:21:46 CST 2014


I'm doing a postmortem on a 2048-node (16384-rank) job on Edison, trying
to understand why my I/O performance was so slow.

Here's the data:

Measured I/O bandwidth:
slice 35 write sparse bandwidth = 6082640 / (3.52519e+06 s / 16384) = 2.63287e-05 GB/s
slice 34 write sparse bandwidth = 13824080 / (3.66608e+06 s / 16384) = 5.75379e-05 GB/s
slice 33 write sparse bandwidth = 24754256 / (2.83647e+06 s / 16384) = 0.000133166 GB/s
slice 32 write sparse bandwidth = 39370832 / (3.47016e+06 s / 16384) = 0.000173119 GB/s
slice 31 write sparse bandwidth = 55812176 / (2.53623e+06 s / 16384) = 0.000335785 GB/s
slice 30 write sparse bandwidth = 74741840 / (2.5714e+06 s / 16384) = 0.00044352 GB/s
slice 29 write sparse bandwidth = 93560912 / (2.67336e+06 s / 16384) = 0.000534019 GB/s
slice 28 write sparse bandwidth = 112803920 / (2.74639e+06 s / 16384) = 0.000626733 GB/s
slice 27 write sparse bandwidth = 128194640 / (3.1603e+06 s / 16384) = 0.000618958 GB/s
slice 26 write sparse bandwidth = 141281360 / (3.12754e+06 s / 16384) = 0.00068929 GB/s
slice 25 write sparse bandwidth = 148193360 / (2.62376e+06 s / 16384) = 0.000861835 GB/s
slice 24 write sparse bandwidth = 151861328 / (3.2145e+06 s / 16384) = 0.000720865 GB/s
slice 23 write sparse bandwidth = 148193360 / (2.44736e+06 s / 16384) = 0.000923956 GB/s
slice 22 write sparse bandwidth = 142055504 / (3.15962e+06 s / 16384) = 0.000686031 GB/s
slice 21 write sparse bandwidth = 130388048 / (3.09774e+06 s / 16384) = 0.000642263 GB/s
slice 20 write sparse bandwidth = 117964880 / (3.02676e+06 s / 16384) = 0.000594696 GB/s
slice 19 write sparse bandwidth = 101560400 / (2.97198e+06 s / 16384) = 0.000521434 GB/s
slice 18 write sparse bandwidth = 86372432 / (2.96247e+06 s / 16384) = 0.000444878 GB/s
slice 18 write sections bandwidth = 1954957518434 / (1.83937e+07 s / 16384) = 1.62177 GB/s
slice 17 write sparse bandwidth = 70170704 / (2.88973e+06 s / 16384) = 0.000370526 GB/s
slice 17 write sections bandwidth = 1475380615039 / (1.36018e+07 s / 16384) = 1.65511 GB/s
slice 17 read bandwidth (192 nodes) = 1475380615039 / 3383.36 s = 0.406122 GB/s
  per node: measured = 2.16598 MB/s, theoretical peak = 33.1341 MB/s
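To read these lines: each bandwidth appears to be bytes divided by the
average per-rank time (the summed time divided by the 16384 ranks),
reported in GiB/s.  A small sanity check against the slice 35 line; the
helper name is mine, not from the actual code:

#include <stdio.h>

/* Bytes / (summed time over ranks / rank count), converted to GiB/s.
 * Hypothetical helper; the numbers below are the slice 35 values. */
static double bandwidth_gib_per_s(double bytes, double summed_time_s, int ranks)
{
    return bytes / (summed_time_s / ranks) / (1024.0 * 1024.0 * 1024.0);
}

int main(void)
{
    printf("%g GB/s\n", bandwidth_gib_per_s(6082640.0, 3.52519e6, 16384));
    /* prints roughly 2.63e-05, matching the slice 35 line above */
    return 0;
}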

Focusing on the "sparse" lines, the main point is that the time seems to
be roughly independent of file size (plot attached).  Each timing sample
consists of (1) setup, which I believe is negligible, (2) MPI_File_open,
(3) MPI_File_write_ordered, and (4) MPI_File_close.
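
For reference, a minimal sketch of one such timing sample in C, assuming
each rank writes a small contiguous buffer with the ordered collective
write; the file name, payload size, and the MPI_Wtime/MPI_Reduce
bookkeeping are my assumptions, not taken from the actual code:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

/* One timing sample: open, ordered collective write, close.
 * Setup is assumed negligible, as stated above. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char buf[4096];                      /* hypothetical per-rank payload */
    memset(buf, 0, sizeof buf);

    double t0 = MPI_Wtime();

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "slice.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_ordered(fh, buf, (int)sizeof buf, MPI_BYTE,
                           MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    double dt = MPI_Wtime() - t0;        /* per-rank open+write+close time */
    double sum;
    MPI_Reduce(&dt, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("summed time over ranks = %g s\n", sum);

    MPI_Finalize();
    return 0;
}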

What might have caused these file writes to take so long?

Geoffrey
[Attachment: 2014-03-02-221950_1220x500.png (image/png, 21072 bytes): <http://lists.mpich.org/pipermail/discuss/attachments/20140302/f2faefc7/attachment.png>]

