[mpich-discuss] MPI-IO bug

Wei-keng Liao wkliao at eecs.northwestern.edu
Wed Apr 9 14:52:28 CDT 2014


(This bug was probably introduced by a patch of mine from long ago.)
Attached is a program, extracted from an application, that reproduces a problem
observed in a large run. When a filetype is defined with MPI_Type_indexed
and the first few elements of the blocklens[] argument are zero, a collective write
fails to write some of the data.

The test program first fills a file with 9 integers, all set to -999.
It then defines a filetype and writes to the file in parallel from user buffers
filled with 1s. Finally, the file is read back and its contents are checked.

Command used to compile and run the test program:
    mpicc -g -o bug_indexed_io bug_indexed_io.c
    mpiexec -n 2 bug_indexed_io

Stdout:
   0: Error: unexpected varlue at buf[7] == -999


The patch below fixes this problem. I hope it does not break other tests.

Index: adio/common/ad_read_coll.c
@@ -368,13 +368,16 @@
 #endif
 	if (file_ptr_type == ADIO_INDIVIDUAL) {
            /* Wei-keng reworked type processing to be a bit more efficient */
+            for (i=0; i<flat_file->count; i++) /* skip blocklens[] == 0 */
+                if (flat_file->blocklens[i] > 0) break;
+
             offset       = fd->fp_ind - disp;
-            n_filetypes  = (offset - flat_file->indices[0]) / filetype_extent;
+            n_filetypes  = (offset - flat_file->indices[i]) / filetype_extent;
             offset      -= (ADIO_Offset)n_filetypes * filetype_extent;
 	    /* now offset is local to this extent */
  
             /* find the block where offset is located, skip blocklens[i]==0 */
-            for (i=0; i<flat_file->count; i++) {
+            for (; i<flat_file->count; i++) {
                 ADIO_Offset dist;
                 if (flat_file->blocklens[i] == 0) continue;
                 dist = flat_file->indices[i] + flat_file->blocklens[i] - offset;

Wei-keng

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug_indexed_io.c
Type: application/octet-stream
Size: 2189 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20140409/2772aa2b/attachment.obj>
-------------- next part --------------



