[mpich-discuss] MPI-IO bug
Rob Latham
robl at mcs.anl.gov
Wed Apr 30 16:06:29 CDT 2014
Wei-keng, I've already let this slip longer than I intended. I've
opened a ticket [1], which I intend to close pretty quickly.

[1] http://trac.mpich.org/projects/mpich/ticket/2073
==rob
On 04/09/2014 02:52 PM, Wei-keng Liao wrote:
> (This bug is probably caused by my patch long ago.)
> Attached is a program, extracted from an application, that reproduces a
> problem observed in a large run. When a filetype is defined using
> MPI_Type_indexed and the first few elements of the blocklens[] argument
> are zero, a collective write misses some of the data.
>
> The test program first fills a file with 9 integers, all set to -999.
> It then defines a filetype and writes to the file in parallel from user
> buffers filled with 1s. Lastly, the file is read back and its contents
> are checked.
>
> Command used to compile and run the test program:
> mpicc -g -o bug_indexed_io bug_indexed_io.c
> mpiexec -n 2 bug_indexed_io
>
> Stdout:
> 0: Error: unexpected varlue at buf[7] == -999
>
>
> The patch below can fix this problem. Hope it does not break other tests.
>
> Index: adio/common/ad_read_coll.c
> @@ -368,13 +368,16 @@
> #endif
> if (file_ptr_type == ADIO_INDIVIDUAL) {
> /* Wei-keng reworked type processing to be a bit more efficient */
> + for (i=0; i<flat_file->count; i++) /* skip blocklens[] == 0 */
> + if (flat_file->blocklens[i] > 0) break;
> +
> offset = fd->fp_ind - disp;
> - n_filetypes = (offset - flat_file->indices[0]) / filetype_extent;
> + n_filetypes = (offset - flat_file->indices[i]) / filetype_extent;
> offset -= (ADIO_Offset)n_filetypes * filetype_extent;
> /* now offset is local to this extent */
>
> /* find the block where offset is located, skip blocklens[i]==0 */
> - for (i=0; i<flat_file->count; i++) {
> + for (; i<flat_file->count; i++) {
> ADIO_Offset dist;
> if (flat_file->blocklens[i] == 0) continue;
> dist = flat_file->indices[i] + flat_file->blocklens[i] - offset;
>
> Wei-keng
>
>
>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA