[mpich-devel] ROMIO collective i/o memory use
Bob Cernohous
bobc at us.ibm.com
Mon Apr 29 10:28:01 CDT 2013
A customer (Argonne ;) is complaining about O(p) allocations in collective
I/O. A collective read is failing at larger scale.

Any thoughts, comments, or advice? There appear to be lots of O(p)
allocations in ROMIO collective I/O, plus a lot of (possibly large)
aggregated data buffers. A quick search shows:
The common ROMIO read collective code:
Find all "ADIOI_Malloc", Match case, Regular expression (UNIX)
File Z:\bgq\comm\lib\dev\mpich2\src\mpi\romio\adio\common\ad_read_coll.c
 124 38: st_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
 125 39: end_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
 317 44: *offset_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
 318 41: *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
 334 44: *offset_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
 335 41: *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc(2*sizeof(ADIO_Offset));
 436 18: ADIOI_Malloc((contig_access_count+1)*sizeof(ADIO_Offset));
 437 41: *len_list_ptr = (ADIO_Offset *) ADIOI_Malloc((contig_access_count+1)*sizeof(ADIO_Offset));
 573 37: if (ntimes) read_buf = (char *) ADIOI_Malloc(coll_bufsize);
 578 21: count = (int *) ADIOI_Malloc(nprocs * sizeof(int));
 587 25: send_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
 590 25: recv_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
 598 25: start_pos = (int *) ADIOI_Malloc(nprocs*sizeof(int));
 739 32: tmp_buf = (char *) ADIOI_Malloc(for_next_iter);
 744 33: read_buf = (char *) ADIOI_Malloc(for_next_iter+coll_bufsize);
 805  9: ADIOI_Malloc((nprocs_send+nprocs_recv+1)*sizeof(MPI_Request));
 827 30: recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char*));
 830 44: (char *) ADIOI_Malloc(recv_size[i]);
 870 31: statuses = (MPI_Status *) ADIOI_Malloc((nprocs_send+nprocs_recv+1) * \
 988 35: curr_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
 989 35: done_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
 990 35: recv_buf_idx = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
Total found: 22
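
For a rough sense of scale, here's a back-of-envelope sketch (mine, not
ROMIO code) of what just the nprocs-proportional arrays in that listing
cost per rank in the read-and-exchange phase. It assumes 8-byte ADIO_Offset
and pointers, and it ignores the request/status arrays, the per-source
recv_buf[i] data buffers, and whatever ADIOI_Calc_my_req/ADIOI_Calc_others_req
allocate, so the real footprint is larger:

/* Back-of-envelope sketch, not ROMIO code: a rough lower bound on the
 * per-rank bytes consumed by the nprocs-proportional arrays visible in
 * the listing above (st_offsets, end_offsets, count, send_size,
 * recv_size, start_pos, curr_from_proc, done_from_proc, recv_buf_idx,
 * and the recv_buf pointer array), plus the collective buffer. */
#include <stdio.h>
#include <stddef.h>

typedef long long ADIO_Offset_est;   /* stand-in for an 8-byte ADIO_Offset */

static size_t o_p_bytes_per_rank(size_t nprocs, size_t coll_bufsize)
{
    size_t offsets = 2 * nprocs * sizeof(ADIO_Offset_est); /* st_offsets + end_offsets          */
    size_t ints    = 4 * nprocs * sizeof(int);             /* count, send_size, recv_size, start_pos */
    size_t unsgn   = 3 * nprocs * sizeof(unsigned);        /* curr/done_from_proc, recv_buf_idx */
    size_t ptrs    = nprocs * sizeof(char *);              /* recv_buf pointer array            */
    return offsets + ints + unsgn + ptrs + coll_bufsize;   /* plus the collective buffer        */
}

int main(void)
{
    /* e.g. 1M MPI ranks and a 4 MiB cb_buffer_size */
    printf("%zu bytes/rank\n", o_p_bytes_per_rank(1 << 20, 4 << 20));
    return 0;
}

Under those assumptions, at ~1M ranks this already works out to tens of MB
per rank before any file data is buffered, which would be consistent with
the read only failing at larger scale.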
Our BG version of the read collective:
File Z:\bgq\comm\lib\dev\mpich2\src\mpi\romio\adio\ad_bg\ad_bg_rdcoll.c
 179 40: st_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
 180 40: end_offsets = (ADIO_Offset *) ADIOI_Malloc(nprocs*sizeof(ADIO_Offset));
 183 43: bg_offsets0 = (ADIO_Offset *) ADIOI_Malloc(2*nprocs*sizeof(ADIO_Offset));
 184 43: bg_offsets = (ADIO_Offset *) ADIOI_Malloc(2*nprocs*sizeof(ADIO_Offset));
 475 37: if (ntimes) read_buf = (char *) ADIOI_Malloc(coll_bufsize);
 480 21: count = (int *) ADIOI_Malloc(nprocs * sizeof(int));
 489 25: send_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
 492 25: recv_size = (int *) ADIOI_Malloc(nprocs * sizeof(int));
 500 25: start_pos = (int *) ADIOI_Malloc(nprocs*sizeof(int));
 676 32: tmp_buf = (char *) ADIOI_Malloc(for_next_iter);
 681 33: read_buf = (char *) ADIOI_Malloc(for_next_iter+coll_bufsize);
 761  9: ADIOI_Malloc((nprocs_send+nprocs_recv+1)*sizeof(MPI_Request));
 783 30: recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char*));
 786 44: (char *) ADIOI_Malloc(recv_size[i]);
 826 31: statuses = (MPI_Status *) ADIOI_Malloc((nprocs_send+nprocs_recv+1) * \
 944 35: curr_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
 945 35: done_from_proc = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
 946 35: recv_buf_idx = (unsigned *) ADIOI_Malloc(nprocs * sizeof(unsigned));
1058 23: rdispls = (int *) ADIOI_Malloc( nprocs * sizeof(int) );
1063 29: all_recv_buf = (char *) ADIOI_Malloc( rtail );
1064 26: recv_buf = (char **) ADIOI_Malloc(nprocs * sizeof(char *));
1068 23: sdispls = (int *) ADIOI_Malloc( nprocs * sizeof(int) );
1073 29: all_send_buf = (char *) ADIOI_Malloc( stail );
Total found: 23
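
The extra hits at the bottom (rdispls/sdispls, all_send_buf/all_recv_buf)
are the alltoall-style exchange in the BG path: besides the O(p)
size/displacement arrays, the packed send and receive buffers are sized by
stail/rtail, i.e. by the total data moved in an iteration. A minimal,
self-contained sketch of that general pattern (variable names and the
1-byte-per-peer payload are mine, not the ad_bg_rdcoll.c code):

/* Sketch of the alltoallv-style exchange suggested by the allocations
 * above: every rank owns O(p) size/displacement arrays, and the packed,
 * aggregated buffers grow with the total data exchanged per iteration. */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    int nprocs, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Four O(p) integer arrays, analogous to send_size/recv_size and
     * sdispls/rdispls in the listing. */
    int *send_size = malloc(nprocs * sizeof(int));
    int *recv_size = malloc(nprocs * sizeof(int));
    int *sdispls   = malloc(nprocs * sizeof(int));
    int *rdispls   = malloc(nprocs * sizeof(int));

    for (int i = 0; i < nprocs; i++)
        send_size[i] = 1;                    /* placeholder: 1 byte per peer */
    MPI_Alltoall(send_size, 1, MPI_INT, recv_size, 1, MPI_INT, MPI_COMM_WORLD);

    /* stail/rtail totals size the packed, aggregated data buffers. */
    int stail = 0, rtail = 0;
    for (int i = 0; i < nprocs; i++) {
        sdispls[i] = stail;  stail += send_size[i];
        rdispls[i] = rtail;  rtail += recv_size[i];
    }
    char *all_send_buf = malloc(stail ? stail : 1);
    char *all_recv_buf = malloc(rtail ? rtail : 1);
    memset(all_send_buf, 0, stail);          /* placeholder payload */

    MPI_Alltoallv(all_send_buf, send_size, sdispls, MPI_BYTE,
                  all_recv_buf, recv_size, rdispls, MPI_BYTE, MPI_COMM_WORLD);

    free(all_send_buf); free(all_recv_buf);
    free(send_size); free(recv_size); free(sdispls); free(rdispls);
    MPI_Finalize();
    return 0;
}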
Bob Cernohous: (T/L 553) 507-253-6093
BobC at us.ibm.com
IBM Rochester, Building 030-2(C335), Department 61L
3605 Hwy 52 North, Rochester, MN 55901-7829
> Chaos reigns within.
> Reflect, repent, and reboot.
> Order shall return.