[mpich-discuss] About mpi i/o, mpi hints and collective i/o memory usage

pramod kumbhar pramod.s.kumbhar at gmail.com
Fri Apr 21 09:13:50 CDT 2017


Dear All,


I would like to understand some details about MPI I/O hints on bg-q and an
out-of-memory error during collective I/O.


Quick Summary :


1. On bg-q, cb_buffer_size is reported as 16 MB when we query the file handle
with MPI_File_get_info.

An application that we are looking at has a code section like:


….

MPI_File_set_view( fh, position_to_write, MPI_FLOAT, mappingType, "native",
MPI_INFO_NULL );

max_mb_on_any_rank_using_Kernel_GetMemorySize () => 275 MB

MPI_File_write_all( fh, mappingBuffer, .................... MPI_FLOAT,
&status);

max_mb_on_any_rank_using_Kernel_GetMemorySize () => 373 MB

……


Why do we see that spike in memory usage? (See the More Details section for
size information.)


I have seen “Kernel_GetMemorySize(KERNEL_MEMSIZE_HEAP….)” return an
inaccurate memory footprint, but I am not sure whether that is the case here.

The attached Darshan screenshot shows the access sizes while running on 4 racks.
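
To put the two readings quoted above in proportion, the following is a minimal sketch that expresses the growth in reported memory as a multiple of one collective buffer. The function names are hypothetical, and it assumes the "MB" figures are MiB (2^20 bytes), which the post does not state. It only restates the numbers; it does not explain where the memory goes.

```c
#include <assert.h>

/* Bytes in one MiB. Assumption: the "MB" readings in the post are MiB;
 * the post does not say whether they are 10^6 or 2^20 bytes. */
#define BYTES_PER_MIB (1024.0 * 1024.0)

/* Growth in the reported per-rank footprint across the collective write. */
double spike_mib(double before_mib, double after_mib)
{
    return after_mib - before_mib;
}

/* The same growth, expressed as a multiple of one collective buffer
 * (cb_buffer_size, given in bytes as the hint reports it). */
double spike_in_cb_buffers(double before_mib, double after_mib,
                           long cb_buffer_size_bytes)
{
    assert(cb_buffer_size_bytes > 0);
    double cb_mib = (double)cb_buffer_size_bytes / BYTES_PER_MIB;
    return spike_mib(before_mib, after_mib) / cb_mib;
}
```

With the values from the post, spike_mib(275.0, 373.0) is 98 and spike_in_cb_buffers(275.0, 373.0, 16777216L) is 6.125, so the reported growth is roughly six times a single cb_buffer_size.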


2. Is romio_cb_alltoall ignored on bg-q? Even if I disable it, I see
“automatic” in the output.


(I am looking at
srcV1R2M4/comm/lib/dev/mpich2/src/mpi/romio/adio/ad_bg/ad_bg_hints.c and
see that the relevant code section is commented out.)


More Details :


We are debugging an application on MIRA which runs on 1, 2, and 4 racks but
fails at 8 racks while dumping a custom checkpoint. These are strong scaling
runs, so the size of the checkpoint remains the same (~172 GB). There are 32
ranks per node. Max memory usage before the start of the checkpoint (i.e.
before the single write_all call) for 8 racks is ~300 MB. The checkpoint size
from each rank is between a few KB and a few MB (as shown by Darshan). Once
the application calls the checkpoint, we see the error below:


  Out of memory in file
/bgsys/source/srcV1R2M2.15270/comm/lib/dev/mpich2/src/mpi/romio/adio/ad_bg/ad_bg_wrcoll.c,
    line 500


Hence I am confused about the behaviour described in question 1.
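
As a rough cross-check of the sizes in this section, here is a small sketch of the average per-rank share of the checkpoint at each rack count. It assumes 1024 nodes per rack and 1 GB = 10^9 bytes, neither of which is stated in the post, and the function names are hypothetical. Darshan reports a spread from KB to MB, so individual ranks will differ from this average.

```c
#include <assert.h>

/* Total MPI ranks in a partition of the given shape. */
long total_ranks(int racks, int nodes_per_rack, int ranks_per_node)
{
    assert(racks > 0 && nodes_per_rack > 0 && ranks_per_node > 0);
    return (long)racks * nodes_per_rack * ranks_per_node;
}

/* Average checkpoint bytes per rank when the total checkpoint size is
 * fixed (strong scaling). Assumption: nodes_per_rack is supplied by the
 * caller; 1024 is used in the usage note and is not taken from the post. */
double avg_bytes_per_rank(double checkpoint_bytes, int racks,
                          int nodes_per_rack, int ranks_per_node)
{
    return checkpoint_bytes /
           (double)total_ranks(racks, nodes_per_rack, ranks_per_node);
}
```

With checkpoint_bytes = 172e9, 1024 nodes per rack and 32 ranks per node, the average is about 5.25 MB on 1 rack, 2.62 MB on 2, 1.31 MB on 4 and 0.66 MB on 8 racks (262144 ranks). The per-rank payload therefore shrinks as racks are added, which is consistent with the Darshan range and suggests the payload alone may not account for a failure that appears only at 8 racks.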

If someone has any insight, it would be a great help!


Regards,

Pramod


p.s.


Default values of all hints


cb_buffer_size, value = 16777216

romio_cb_read, value = enable

romio_cb_write, value = enable

cb_nodes, value = 8320             (changes with partition size)

romio_no_indep_rw, value = false

romio_cb_pfr, value = disable

romio_cb_fr_types, value = aar

romio_cb_fr_alignment, value = 1

romio_cb_ds_threshold, value = 0

romio_cb_alltoall, value = automatic

ind_rd_buffer_size, value = 4194304

romio_ds_read, value = automatic

romio_ds_write, value = disable
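
The MPI standard describes cb_nodes as the number of target nodes used for collective buffering and cb_buffer_size as the buffer space available on each of them. Under that reading, a minimal sketch of the aggregate staging space implied by the two values listed above is shown here. The function names are hypothetical, and the post does not say which partition size the cb_nodes value of 8320 belongs to.

```c
#include <assert.h>

/* Upper bound on collective-buffer space summed over all aggregators:
 * one cb_buffer_size buffer on each of cb_nodes target nodes. */
unsigned long long aggregate_cb_bytes(unsigned long long cb_nodes,
                                      unsigned long long cb_buffer_size)
{
    assert(cb_nodes > 0 && cb_buffer_size > 0);
    return cb_nodes * cb_buffer_size;
}

/* Convert a byte count to GiB (2^30 bytes). */
double bytes_to_gib(unsigned long long bytes)
{
    return (double)bytes / (1024.0 * 1024.0 * 1024.0);
}
```

For cb_nodes = 8320 and cb_buffer_size = 16777216, aggregate_cb_bytes returns 139586437120 bytes, which bytes_to_gib reports as 130 GiB across the whole job. This is total buffer capacity across aggregators, not a per-rank figure.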
-------------- next part --------------
A non-text attachment was scrubbed...
Name: darshan_first_4rack_default.png
Type: image/png
Size: 42130 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20170421/9e597af0/attachment.png>