[mpich-discuss] About an error while using mpi i/o collectives : "Error in ADIOI_Calc_aggregator(): rank_index(1)..."

pramod kumbhar pramod.s.kumbhar at gmail.com
Mon Aug 21 10:45:52 CDT 2017

Dear All,

In one of our applications I am seeing the following error while using the
collective call MPI_File_write_all:

Error in ADIOI_Calc_aggregator(): rank_index(1) >= fd->hints->cb_nodes (1)
fd_size=102486061 off=102486469

The non-collective version works fine.

While looking at the call stack I came across the comment below
in mpich-3.2/src/mpi/romio/adio/common/ad_aggregate.c:

    /* we index into fd_end with rank_index, and fd_end was allocated to be no
     * bigger than fd->hints->cb_nodes.   If we ever violate that, we're
     * overrunning arrays.  Obviously, we should never ever hit this abort */
    if (rank_index >= fd->hints->cb_nodes || rank_index < 0) {
        FPRINTF(stderr, "Error in ADIOI_Calc_aggregator(): rank_index(%d) "
                ">= fd->hints->cb_nodes (%d) fd_size=%lld off=%lld\n",
                rank_index, fd->hints->cb_nodes, fd_size, off);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

I am going to look into the application and check whether there is an issue
with offset overflow. But given the above comment ("Obviously, we should never
ever hit this abort"), I thought I should ask whether there is anything
obvious I am missing.


p.s. I will provide reproducer after looking into this more carefully.