[mpich-discuss] About an error while using mpi i/o collectives : "Error in ADIOI_Calc_aggregator(): rank_index(1)..."

Latham, Robert J. robl at mcs.anl.gov
Tue Aug 22 10:25:37 CDT 2017


On Mon, 2017-08-21 at 17:45 +0200, pramod kumbhar wrote:
> Dear All,
> 
> In one of our applications I am seeing the following error while
> using the collective call MPI_File_write_all:
> 
> Error in ADIOI_Calc_aggregator(): rank_index(1) >= fd->hints->cb_nodes (1) fd_size=102486061 off=102486469
> 
> The non-collective version works fine.
> 
> While looking at the call stack I came across the comment below in
> mpich-3.2/src/mpi/romio/adio/common/ad_aggregate.c:
> 
>     /* we index into fd_end with rank_index, and fd_end was allocated
>      * to be no bigger than fd->hints->cb_nodes.  If we ever violate
>      * that, we're overrunning arrays.  Obviously, we should never
>      * ever hit this abort */
>     if (rank_index >= fd->hints->cb_nodes || rank_index < 0) {
>         FPRINTF(stderr, "Error in ADIOI_Calc_aggregator(): "
>                 "rank_index(%d) >= fd->hints->cb_nodes (%d) "
>                 "fd_size=%lld off=%lld\n",
>                 rank_index, fd->hints->cb_nodes, fd_size, off);
>         MPI_Abort(MPI_COMM_WORLD, 1);
>     }
> 
> I am going to look into the application and see whether there is an
> issue with offset overflow. But given the comment above ("Obviously,
> we should never ever hit this abort"), I thought I should ask whether
> there is anything obvious I am missing.

That's my comment.  The 'fd_end' array is allocated based on the
'cb_nodes' hint, and 'rank_index' indexes into it.  I would definitely
like to know more about how the code ends up with these values of
rank_index, cb_nodes, and fd_end.

If there is a reduced test case you can send me, that would be a huge
help.
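A minimal collective-write skeleton of the kind that usually makes a
good reduced test case (the file name, block size, and hint values
below are placeholders, not taken from the original application; it
uses MPI_File_write_at_all with an explicit offset rather than a file
view plus MPI_File_write_all):

```c
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK (1 << 20)   /* 1 MiB per rank; placeholder size */

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *buf = malloc(BLOCK);
    memset(buf, rank & 0xff, BLOCK);

    /* Mirror the failing configuration: one I/O aggregator, with
     * collective buffering forced on so ADIOI_Calc_aggregator runs. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "cb_nodes", "1");
    MPI_Info_set(info, "romio_cb_write", "enable");

    MPI_File_open(MPI_COMM_WORLD, "testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* Widen to MPI_Offset before multiplying: computing an offset in
     * 32-bit int arithmetic is one way large runs end up with bogus
     * offsets like the one in the error message. */
    MPI_Offset off = (MPI_Offset)rank * BLOCK;
    MPI_File_write_at_all(fh, off, buf, BLOCK, MPI_BYTE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Run with e.g. `mpiexec -n 4 ./a.out`; if the abort fires with a
skeleton like this, binary-searching the block size and offsets down
from the application's real values narrows things quickly.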

==rob

> 
> Regards,
> Pramod
> 
> p.s. I will provide a reproducer after looking into this more
> carefully.
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss