[mpich-devel] ROMIO collective i/o memory use

Jeff Hammond jhammond at alcf.anl.gov
Mon Apr 29 15:25:51 CDT 2013


I believe that I am on record as complaining vigorously about O(nproc)
allocations in BG-MPICH.  However, my first rule of parallel IO is to
not do IO, so you will not hear me complain about this specific set of
issues unless they are due to O(nproc) allocations inside of MPI
collectives.

Jeff

On Mon, Apr 29, 2013 at 10:28 AM, Bob Cernohous <bobc at us.ibm.com> wrote:
> A customer (Argonne ;) is complaining about O(p) allocations in collective
> i/o.  A collective read is failing at larger scale.
>
> Any thoughts, comments, or advice?  There appear to be a lot of O(p)
> allocations in ROMIO collective I/O, plus a lot of (possibly large)
> aggregated data buffers.  A quick search shows:
>
> The common ROMIO read collective code:
>
> Find all "ADIOI_Malloc", Match case, Regular expression (UNIX)
> File Z:\bgq\comm\lib\dev\mpich2\src\mpi\romio\adio\common\ad_read_coll.c
>   124 38:        st_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
>   125 39:        end_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
>   317 44:        *offset_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
>   318 41:        *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
>   334 44:        *offset_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
>   335 41:        *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
>   436 18: ADIOI_Malloc((contig_access_count+1)*sizeof(ADIO_Offset));
>   437 41:        *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc((contig_access_count+1)*sizeof(ADIO_Offset));
>   573 37:    if (ntimes) read_buf = (char *) ADIOI_Malloc(coll_bufsize);
>   578 21:    count = (int *) ADIOI_Malloc(nprocs * sizeof(int));
>   587 25:    send_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
>   590 25:    recv_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
>   598 25:    start_pos = (int *) ADIOI_Malloc(nprocs*sizeof(int));
>   739 32:            tmp_buf = (char *) ADIOI_Malloc(for_next_iter);
>   744 33:            read_buf = (char *) ADIOI_Malloc(for_next_iter+coll_bufsize);
>   805 9: ADIOI_Malloc((nprocs_send+nprocs_recv+1)*sizeof(MPI_Request));
>   827 30:        recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char*));
>   830 44:                                  (char *) ADIOI_Malloc(recv_size[i]);
>   870 31:    statuses = (MPI_Status *) ADIOI_Malloc((nprocs_send+nprocs_recv+1) * \
>   988 35:    curr_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
>   989 35:    done_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
>   990 35:    recv_buf_idx   = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
> Total found: 22
>
> Our BG version of read collective:
>
> File Z:\bgq\comm\lib\dev\mpich2\src\mpi\romio\adio\ad_bg\ad_bg_rdcoll.c
>   179 40:        st_offsets   = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
>   180 40:        end_offsets  = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
>   183 43:            bg_offsets0 = (ADIO_Offset *) ADIOI_Malloc(2*nprocs*sizeof(ADIO_Offset));
>   184 43:            bg_offsets  = (ADIO_Offset *) ADIOI_Malloc(2*nprocs*sizeof(ADIO_Offset));
>   475 37:    if (ntimes) read_buf = (char *) ADIOI_Malloc(coll_bufsize);
>   480 21:    count = (int *) ADIOI_Malloc(nprocs * sizeof(int));
>   489 25:    send_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
>   492 25:    recv_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
>   500 25:    start_pos = (int *) ADIOI_Malloc(nprocs*sizeof(int));
>   676 32:            tmp_buf = (char *) ADIOI_Malloc(for_next_iter);
>   681 33:            read_buf = (char *) ADIOI_Malloc(for_next_iter+coll_bufsize);
>   761 9: ADIOI_Malloc((nprocs_send+nprocs_recv+1)*sizeof(MPI_Request));
>   783 30:        recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char*));
>   786 44:                                  (char *) ADIOI_Malloc(recv_size[i]);
>   826 31:    statuses = (MPI_Status *) ADIOI_Malloc((nprocs_send+nprocs_recv+1) * \
>   944 35:    curr_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
>   945 35:    done_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
>   946 35:    recv_buf_idx   = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
>   1058 23:    rdispls = (int *) ADIOI_Malloc( nprocs * sizeof(int) );
>   1063 29:    all_recv_buf = (char *) ADIOI_Malloc( rtail );
>   1064 26:    recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char *));
>   1068 23:    sdispls = (int *) ADIOI_Malloc( nprocs * sizeof(int) );
>   1073 29:    all_send_buf = (char *) ADIOI_Malloc( stail );
> Total found: 23
>
>
> Bob Cernohous:  (T/L 553) 507-253-6093
>
> BobC at us.ibm.com
> IBM Rochester, Building 030-2(C335), Department 61L
> 3605 Hwy 52 North, Rochester,  MN 55901-7829
>
>> Chaos reigns within.
>> Reflect, repent, and reboot.
>> Order shall return.



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
ALCF docs: http://www.alcf.anl.gov/user-guides

