[mpich-devel] ROMIO collective i/o memory use
Bob Cernohous
bobc at us.ibm.com
Mon May 6 13:41:07 CDT 2013
I agree and suggested:
---------------------
It appears they don't have enough memory for an alltoallv exchange. Try
setting BGMPIO_COMM to '1':
* - BGMPIO_COMM - Define how data is exchanged on collective
* reads and writes. Possible values:
* - 0 - Use MPI_Alltoallv.
* - 1 - Use MPI_Isend/MPI_Irecv.
* - Default is 0.
---------------------
but they didn't want a workaround; they wanted a 'fix for O(p)
allocations'. From a quick glance, there are O(p) allocations all over
collective i/o. I just wanted some input from the experts about scaling
ROMIO. I haven't heard whether the suggestion worked.
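
For the record, a minimal sketch of why the two settings differ in memory
footprint. This is illustrative code, not the actual ROMIO implementation:
the function names are made up, and it assumes the per-peer
send_size[]/recv_size[] arrays (and, for the point-to-point path, per-peer
buffers) are already set up.

#include <mpi.h>
#include <stdlib.h>

/* Sketch only -- not the actual ROMIO code.  BGMPIO_COMM=0 behaves
 * roughly like exchange_alltoallv(); BGMPIO_COMM=1 roughly like
 * exchange_isend_irecv().  send_size[]/recv_size[] hold the per-peer
 * byte counts for one two-phase cycle. */

static void exchange_alltoallv(int nprocs, int *send_size, int *recv_size,
                               MPI_Comm comm)
{
    int *sdispls = malloc(nprocs * sizeof(int));    /* O(p) bookkeeping */
    int *rdispls = malloc(nprocs * sizeof(int));    /* O(p) bookkeeping */
    int stail = 0, rtail = 0, i;
    for (i = 0; i < nprocs; i++) {
        sdispls[i] = stail;  stail += send_size[i];
        rdispls[i] = rtail;  rtail += recv_size[i];
    }
    /* Two contiguous staging buffers sized by the SUM over all peers:
     * this is the allocation that fails at scale. */
    char *all_send_buf = malloc(stail);
    char *all_recv_buf = malloc(rtail);
    /* ... pack all_send_buf ... */
    MPI_Alltoallv(all_send_buf, send_size, sdispls, MPI_BYTE,
                  all_recv_buf, recv_size, rdispls, MPI_BYTE, comm);
    /* ... unpack all_recv_buf ... */
    free(all_send_buf);  free(all_recv_buf);
    free(sdispls);  free(rdispls);
}

static void exchange_isend_irecv(int nprocs, int *send_size, int *recv_size,
                                 char **send_buf, char **recv_buf,
                                 MPI_Comm comm)
{
    MPI_Request *req = malloc(2 * nprocs * sizeof(MPI_Request)); /* O(p) */
    int i, nreq = 0;
    /* Only peers with nonzero traffic need a buffer and a request, so
     * the data footprint follows what is actually exchanged. */
    for (i = 0; i < nprocs; i++)
        if (recv_size[i])
            MPI_Irecv(recv_buf[i], recv_size[i], MPI_BYTE, i, 0, comm,
                      &req[nreq++]);
    for (i = 0; i < nprocs; i++)
        if (send_size[i])
            MPI_Isend(send_buf[i], send_size[i], MPI_BYTE, i, 0, comm,
                      &req[nreq++]);
    MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
    free(req);
}

The point: the alltoallv path's staging buffers are sized by the sum over
all peers, while the point-to-point path only materializes buffers for
peers that actually exchange data in this cycle.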
Bob Cernohous: (T/L 553) 507-253-6093
BobC at us.ibm.com
IBM Rochester, Building 030-2(C335), Department 61L
3605 Hwy 52 North, Rochester, MN 55901-7829
> Chaos reigns within.
> Reflect, repent, and reboot.
> Order shall return.
devel-bounces at mpich.org wrote on 05/04/2013 09:43:10 PM:
> From: "Rob Latham" <robl at mcs.anl.gov>
> To: devel at mpich.org,
> Cc: mpich2-dev at mcs.anl.gov
> Date: 05/04/2013 09:48 PM
> Subject: Re: [mpich-devel] ROMIO collective i/o memory use
> Sent by: devel-bounces at mpich.org
>
> On Mon, Apr 29, 2013 at 10:28:01AM -0500, Bob Cernohous wrote:
> > A customer (Argonne ;) is complaining about O(p) allocations in
> > collective i/o. A collective read is failing at larger scale.
> >
> > Any thoughts or comments or advice? There appears to be lots of O(p)
> > in ROMIO collective I/O. Plus a lot of (possibly large) aggregated data
> > buffers. A quick search shows
>
> The O(p) allocations are a concern, sure. For two-phase, though, the
> real problem lies in ADIOI_R_Exchange_data_alltoallv and
> ADIOI_W_Exchange_data_alltoallv. The O(p) allocations are the least
> of our worries!
>
> around line 1063 of ad_bg_rdcoll.c
>
> all_recv_buf = (char *) ADIOI_Malloc( rtail );
>
> all_send_buf = (char *) ADIOI_Malloc( stail );
>
> (rtail and stail are the sum of the receive and send arrays)
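
To put illustrative numbers on that (my own, not from the customer's
report): if each of p peers contributes an average of s bytes to an
aggregator in one two-phase cycle, then rtail = p * s. At p = 16384 peers
and s = 64 KiB per peer, all_recv_buf alone is a single 1 GiB allocation,
with all_send_buf and the per-peer recv_buf[] copies on top of that. The
O(p) bookkeeping arrays are noise by comparison.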
>
> ==rob
>
> > The common ROMIO read collective code:
> >
> > Find all "ADIOI_Malloc", Match case, Regular expression (UNIX)
> >
> > File Z:\bgq\comm\lib\dev\mpich2\src\mpi\romio\adio\common\ad_read_coll.c
> >
> > 124 38: st_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
> > 125 39: end_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
> > 317 44: *offset_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
> > 318 41: *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
> > 334 44: *offset_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
> > 335 41: *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
> > 436 18: ADIOI_Malloc((contig_access_count+1)*sizeof(ADIO_Offset));
> > 437 41: *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc((contig_access_count+1)*sizeof(ADIO_Offset));
> > 573 37: if (ntimes) read_buf = (char *) ADIOI_Malloc(coll_bufsize);
> > 578 21: count = (int *) ADIOI_Malloc(nprocs * sizeof(int));
> > 587 25: send_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
> > 590 25: recv_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
> > 598 25: start_pos = (int *) ADIOI_Malloc(nprocs*sizeof(int));
> > 739 32: tmp_buf = (char *) ADIOI_Malloc(for_next_iter);
> > 744 33: read_buf = (char *) ADIOI_Malloc(for_next_iter+coll_bufsize);
> > 805 9: ADIOI_Malloc((nprocs_send+nprocs_recv+1)*sizeof(MPI_Request));
> > 827 30: recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char*));
> > 830 44: (char *) ADIOI_Malloc(recv_size[i]);
> > 870 31: statuses = (MPI_Status *) ADIOI_Malloc((nprocs_send+nprocs_recv+1) * \
> > 988 35: curr_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
> > 989 35: done_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
> > 990 35: recv_buf_idx = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
> >
> > Total found: 22
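
For scale (assumed numbers: 64-bit pointers, 8-byte ADIO_Offset): at
nprocs = 1M ranks, each nprocs*sizeof(int) array above is 4 MiB and each
nprocs*sizeof(ADIO_Offset) array is 8 MiB, so the seven int/unsigned
arrays, two offset arrays, and the recv_buf pointer array come to roughly
50 MiB of bookkeeping per rank. Real money at Blue Gene memory budgets,
but still small next to the aggregated data buffers.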
> >
> >
> > Our BG version of read collective:
> >
> > File Z:\bgq\comm\lib\dev\mpich2\src\mpi\romio\adio\ad_bg\ad_bg_rdcoll.c
> >
> > 179 40: st_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
> > 180 40: end_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
> > 183 43: bg_offsets0 = (ADIO_Offset *) ADIOI_Malloc(2*nprocs*sizeof(ADIO_Offset));
> > 184 43: bg_offsets = (ADIO_Offset *) ADIOI_Malloc(2*nprocs*sizeof(ADIO_Offset));
> > 475 37: if (ntimes) read_buf = (char *) ADIOI_Malloc(coll_bufsize);
> > 480 21: count = (int *) ADIOI_Malloc(nprocs * sizeof(int));
> > 489 25: send_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
> > 492 25: recv_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
> > 500 25: start_pos = (int *) ADIOI_Malloc(nprocs*sizeof(int));
> > 676 32: tmp_buf = (char *) ADIOI_Malloc(for_next_iter);
> > 681 33: read_buf = (char *) ADIOI_Malloc(for_next_iter+coll_bufsize);
> > 761 9: ADIOI_Malloc((nprocs_send+nprocs_recv+1)*sizeof(MPI_Request));
> > 783 30: recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char*));
> > 786 44: (char *) ADIOI_Malloc(recv_size[i]);
> > 826 31: statuses = (MPI_Status *) ADIOI_Malloc((nprocs_send+nprocs_recv+1) * \
> > 944 35: curr_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
> > 945 35: done_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
> > 946 35: recv_buf_idx = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
> > 1058 23: rdispls = (int *) ADIOI_Malloc( nprocs * sizeof(int) );
> > 1063 29: all_recv_buf = (char *) ADIOI_Malloc( rtail );
> > 1064 26: recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char *));
> > 1068 23: sdispls = (int *) ADIOI_Malloc( nprocs * sizeof(int) );
> > 1073 29: all_send_buf = (char *) ADIOI_Malloc( stail );
> >
> > Total found: 23
> >
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>