<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr"><br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><font face="arial, helvetica, sans-serif">Thanks for clarification. I see type and filetype specification in the standard mention "monotonically nondecreasing" constraint.</font></div><div><font face="arial, helvetica, sans-serif"></font></div></div></blockquote><div><br></div><div>I mean etype and filetype.</div><div><br></div><div>-Pramod</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><div class="gmail_quote">On Wed, Aug 23, 2017 at 3:07 AM, Latham, Robert J. <span dir="ltr"><<a href="mailto:robl@mcs.anl.gov" target="_blank">robl@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On Tue, 2017-08-22 at 22:27 +0000, Thakur, Rajeev wrote:<br>
> Yes, displacements for the filetype must be in “monotonically<br>
> nondecreasing order”.<br>
<br>
</span>... which sounds pretty restrictive, but there is no constraint on<br>
memory types. Folks work around this by shuffling the memory addresses<br>
to match the ascending file offsets.<br>
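<br>
A minimal sketch of that workaround, for illustration only (it is not taken from the attached test.cpp). The Block struct and both helper names are hypothetical, and plain long stands in for MPI_Aint so the snippet builds without MPI. Each block keeps its memory displacement while the list is reordered by file displacement:<br>

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper type: one contiguous piece of the transfer.
// In real MPI code the displacements would be MPI_Aint values.
struct Block {
    long mem_disp;   // byte displacement inside the memory buffer
    long file_disp;  // byte displacement inside the file view
    int  length;     // block length in bytes
};

// True when the file displacements are monotonically nondecreasing,
// which is what the filetype passed to MPI_File_set_view requires.
bool file_order_ok(const std::vector<Block>& blocks) {
    return std::is_sorted(blocks.begin(), blocks.end(),
                          [](const Block& a, const Block& b) {
                              return a.file_disp < b.file_disp;
                          });
}

// Reorder the blocks so file displacements ascend. The memory
// displacement travels with its block, so the memory type ends up
// "shuffled" instead of the filetype.
std::vector<Block> sort_by_file_offset(std::vector<Block> blocks) {
    std::stable_sort(blocks.begin(), blocks.end(),
                     [](const Block& a, const Block& b) {
                         return a.file_disp < b.file_disp;
                     });
    assert(file_order_ok(blocks));
    return blocks;
}
```

After sorting, the file_disp/length columns would go into the filetype and the mem_disp/length columns into the memory datatype (for example via MPI_Type_create_hindexed), so only the unconstrained memory side carries the out-of-order layout.<br>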
<br>
==rob<br>
<div class="m_-1297466834448786297HOEnZb"><div class="m_-1297466834448786297h5"><br>
><br>
> Rajeev<br>
><br>
> > On Aug 22, 2017, at 3:05 PM, pramod kumbhar <pramod.s.kumbhar@gmail.com> wrote:<br>
> ><br>
> > Hi Rob,<br>
> ><br>
> > Thanks! The following is not exactly the same issue/error, but it is related:<br>
> ><br>
> > While constructing a derived datatype (the filetype used for set_view),<br>
> > do the displacements / offsets need to be in ascending order?<br>
> > I mean, suppose I am creating a derived datatype using<br>
> > MPI_Type_create_hindexed (or MPI_Type_create_struct) with block lengths<br>
> > and displacements as:<br>
> ><br>
> > blocklengths[0] = 8;<br>
> > blocklengths[1] = 231670;<br>
> > blocklengths[2] = 116606;<br>
> ><br>
> > displacements[0] = 0;<br>
> > displacements[1] = 8;<br>
> > displacements[2] = 231678;<br>
> ><br>
> > The displacements above are in ascending order. Suppose I shuffle the<br>
> > order a bit:<br>
> ><br>
> > blocklengths[0] = 8;<br>
> > blocklengths[1] = 116606;<br>
> > blocklengths[2] = 231670;<br>
> ><br>
> > displacements[0] = 0;<br>
> > displacements[1] = 231678;<br>
> > displacements[2] = 8;<br>
> ><br>
> > The blocks are still the same; I only changed the order in which the<br>
> > block lengths/offsets are specified. (The resulting file will have the<br>
> > data in a different order, but that is ignored here.)<br>
> > Isn't this a valid specification? This second example results in a<br>
> > segfault (in ADIO_GEN_WriteStrided / Coll).<br>
> ><br>
> > I quickly wrote the attached program; let me know if I have missed<br>
> > anything obvious here.<br>
> ><br>
> > Regards,<br>
> > Pramod<br>
> ><br>
> > p.s. you can compile & run as:<br>
> ><br>
> > Not working => mpicxx test.cpp && mpirun -n 2 ./a.out<br>
> > Working => mpicxx test.cpp -DUSE_ORDER && mpirun -n 2 ./a.out<br>
> ><br>
> ><br>
> ><br>
> > On Tue, Aug 22, 2017 at 5:25 PM, Latham, Robert J. <robl@mcs.anl.gov> wrote:<br>
> > On Mon, 2017-08-21 at 17:45 +0200, pramod kumbhar wrote:<br>
> > > Dear All,<br>
> > ><br>
> > > In one of our applications I am seeing the following error while<br>
> > > using the collective call MPI_File_write_all:<br>
> > ><br>
> > > Error in ADIOI_Calc_aggregator(): rank_index(1) >= fd->hints->cb_nodes (1) fd_size=102486061 off=102486469<br>
> > ><br>
> > > The non-collective version works fine.<br>
> > ><br>
> > > While looking at the call stack I came across the comment below in<br>
> > > mpich-3.2/src/mpi/romio/adio/common/ad_aggregate.c:<br>
> > ><br>
> > > /* we index into fd_end with rank_index, and fd_end was allocated to be no<br>
> > > * bigger than fd->hins->cb_nodes. If we ever violate that, we're<br>
> > > * overrunning arrays. Obviously, we should never ever hit this abort */<br>
> > > if (rank_index >= fd->hints->cb_nodes || rank_index < 0) {<br>
> > > FPRINTF(stderr, "Error in ADIOI_Calc_aggregator(): rank_index(%d) >= fd->hints->cb_nodes (%d) fd_size=%lld off=%lld\n",<br>
> > > rank_index, fd->hints->cb_nodes, fd_size, off);<br>
> > > MPI_Abort(MPI_COMM_WORLD, 1);<br>
> > > }<br>
> > ><br>
> > > I am going to look into the application and see if there is an issue<br>
> > > with offset overflow. But given the comment above ("Obviously, we<br>
> > > should never ever hit this abort"), I thought I should ask whether<br>
> > > there is anything obvious that I am missing.<br>
> ><br>
> > That's my comment. The 'rank_index' array is allocated based on the<br>
> > 'cb_nodes' hint. I definitely would like to know more about how the<br>
> > code is manipulating rank_index, cb_nodes, and fd_end.<br>
> ><br>
> > If there is a reduced test case you can send me, that will be a huge<br>
> > help.<br>
> ><br>
> > ==rob<br>
> ><br>
> > ><br>
> > > Regards,<br>
> > > Pramod<br>
> > ><br>
> > > p.s. I will provide a reproducer after looking into this more<br>
> > > carefully.<br>
> > > ______________________________<wbr>_________________<br>
> > > discuss mailing list <a href="mailto:discuss@mpich.org" target="_blank">discuss@mpich.org</a><br>
> > > To manage subscription options or unsubscribe:<br>
> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" rel="noreferrer" target="_blank">https://lists.mpich.org/mailma<wbr>n/listinfo/discuss</a><br>
> ><br>
> ><br>
> > <test.cpp><br>
><br>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div></div>