<div dir="ltr"><div class="gmail_quote"><div dir="ltr"><div><font face="arial, helvetica, sans-serif">Hello everyone,</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">I am currently working on a partitioned collective I/O implementation for an EU project. The idea is to partition the accessed file range into disjoint access regions (i.e., regions that do not overlap with each other) and assign the processes from these regions to independent communicators created by splitting the original communicator with MPI_Comm_split. This is very similar to what is done in the ParColl paper and in the Memory Conscious Collective I/O paper. </font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">The code is pretty simple. It uses the access pattern information to create the new communicators, assigns aggregators to each of them, and then computes file domains and data dependencies for every process using the default functions from ROMIO (i.e., ADIOI_Calc_file_domains, ADIOI_Calc_my_req, ADIOI_Calc_others_req). </font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">I am currently testing my implementation using coll_perf with 512 processes. Every process writes 64 MB of data for a total of 32 GB (no reads are performed). The ranks belonging to each non-overlapping region (file area) are printed out by the process with rank 0 in the global communicator:</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_count = 8</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[0] = 0 1 2 3 ... 
63</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[1] = 64 65 66 67 ... 127</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[2] = 128 129 130 131 ... 191</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[3] = 192 193 194 195 ... 255<br></font></div></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[4] = 256 257 258 259 ... 319 <br></font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[5] = 320 321 322 323 ... 383<br></font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[6] = 384 385 386 387 ... 447<br></font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] file_area_ranklist[7] = 448 449 450 451 ... 
511<br></font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Afterwards, the aggregator in each communication group prints its aggregator index and the corresponding ranklist entry:</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] my_cb_nodes_index = 0</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] fd->hints->ranklist[0] = 0</font></div><div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] my_cb_nodes_index = 0</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] fd->hints->ranklist[0] = 0</font></div></div><div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] my_cb_nodes_index = 0</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] fd->hints->ranklist[0] = 0</font></div></div><div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] my_cb_nodes_index = 0</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_aggregate.c:0] fd->hints->ranklist[0] = 0</font></div></div><div><font face="arial, helvetica, sans-serif"><br></font></div></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">In this particular case I have set fd->hints->cb_nodes to 4, so there are 4 aggregators but 8 non-overlapping regions (file areas). I therefore create only 4 communicators and assign 2 file areas to each of them. 
In the messages printed above, the single aggregator of each communicator prints its information.</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Then every process prints its group name, the number of processes in the group, and its rank in the old and new communicators:</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">group-0:128:0:0</font></div><div><div><font face="arial, helvetica, sans-serif">group-0:128:1:1</font></div><div><font face="arial, helvetica, sans-serif">group-0:128:2:2</font></div><div><font face="arial, helvetica, sans-serif">group-0:128:3:3</font></div></div><div><font face="arial, helvetica, sans-serif">---</font></div><div><font face="arial, helvetica, sans-serif">group-0:128:127:127</font></div><div><font face="arial, helvetica, sans-serif">group-1:128:128:0</font></div><div><font face="arial, helvetica, sans-serif">group-1:128:129:1</font></div><div><font face="arial, helvetica, sans-serif">group-1:128:130:2</font></div><div><font face="arial, helvetica, sans-serif">group-1:128:131:3</font></div><div><div><font face="arial, helvetica, sans-serif">---</font></div><div><font face="arial, helvetica, sans-serif">group-1:128:255:127</font></div></div><div><div><font face="arial, helvetica, sans-serif">group-2:128:256:0</font></div><div><font face="arial, helvetica, sans-serif">group-2:128:257:1</font></div><div><font face="arial, helvetica, sans-serif">group-2:128:258:2</font></div><div><font face="arial, helvetica, sans-serif">group-2:128:259:3</font></div><div><div><font face="arial, helvetica, sans-serif">---</font></div><div><font face="arial, helvetica, sans-serif">group-2:128:383:127</font></div></div></div><div><div><font face="arial, helvetica, sans-serif">group-3:128:384:0</font></div><div><font face="arial, helvetica, sans-serif">group-3:128:385:1</font></div><div><font face="arial, 
helvetica, sans-serif">group-3:128:386:2</font></div><div><font face="arial, helvetica, sans-serif">group-3:128:387:3</font></div><div><div><font face="arial, helvetica, sans-serif">---</font></div><div><font face="arial, helvetica, sans-serif">group-3:128:511:127</font></div></div></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">So far it looks like the communicators are created properly. After calling the ADIOI_Calc_* functions, every aggregator prints the MIN and MAX offsets (st_loc and end_loc) of its independent access range:</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_write_coll.c:0] st_loc = 17179869184, end_loc = 25769803775</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_write_coll.c:0] st_loc = 8589934592, end_loc = 17179869183</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_write_coll.c:0] st_loc = 25769803776, end_loc = 34359738367</font></div><div><font face="arial, helvetica, sans-serif">[romio/adio/common/ad_write_coll.c:0] st_loc = 0, end_loc = 8589934591 <br></font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Even though the new communicators seem to be created correctly, something goes wrong when two-phase I/O starts and the first MPI_Alltoall() is called to exchange access information among processes. MPI_Alltoall is called on each new communicator separately, and every buffer passed to it is sized to the new nprocs of the corresponding communicator. 
Nevertheless, I get the following error message:</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><div><font face="arial, helvetica, sans-serif">Fatal error in PMPI_Alltoall: Other MPI error, error stack:</font></div><div><font face="arial, helvetica, sans-serif">PMPI_Alltoall(888)......: MPI_Alltoall(sbuf=0x2551e98, scount=1, MPI_INT, rbuf=0x2551be8, rcount=1, MPI_INT, comm=0x84000001) failed</font></div><div><font face="arial, helvetica, sans-serif">MPIR_Alltoall_impl(760).:</font></div><div><font face="arial, helvetica, sans-serif">MPIR_Alltoall(725)......:</font></div><div><font face="arial, helvetica, sans-serif">MPIR_Alltoall_intra(283):</font></div></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">This error is repeated many more times. </font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">This is strange, because when I look at the file after coll_perf crashes there is some data in it (~200 MB), as if the problem were caused by only one of the four communicators.</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Does anybody have any idea of what is happening and why?</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Thanks,</font></div></div></div><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">Giuseppe Congiu <strong>·</strong> Research Engineer II<br>
Seagate Technology, LLC<br>
office: +44 (0)23 9249 6082 <strong>·</strong> mobile: <br>
<a href="http://www.seagate.com" target="_blank">www.seagate.com</a><br></div></div>
</div>