[mpich-discuss] ROMIO filesystem check during MPI_File_open

Rob Latham robl at mcs.anl.gov
Mon Mar 24 16:17:58 CDT 2014



On 03/24/2014 02:49 PM, Jeff Squyres (jsquyres) wrote:
> On Mar 14, 2014, at 11:23 AM, Rob Latham <robl at mcs.anl.gov> wrote:
>
>> I thought we handled this?  we certainly seem to have made an effort:
>>
>> https://trac.mpich.org/projects/mpich/browser/src/mpi/romio/adio/common/ad_fstype.c#L644
>
> Sorry for the delay in getting back to this...
>
> It looks like ADIO_FileSysType_fncall() is *not* collective -- it just does some magic to figure out what the local filesystem type is.

that's correct.  we expect _fncall to be called from one process in the 
"no nfs" case, and from every process in the "nfs enabled" case.

>  Then back up in ADIO_ResolveFileType(),

that's called from two locations: MPI_File_open and MPI_File_delete, 
the only two functions that take a path and not an MPI_File handle.

Open is collective, delete is not, but delete calls ResolveFileType with 
COMM_SELF (and you're not worried about the delete path anyway)

> since have_nfs_enabled==1 and ADIO_FileSysType_fncall() returned MPI_SUCCESS,

Every process will call ADIO_FileSysType_fncall() in the case where 
have_nfs_enabled==1.  Your interpretation of the code is so different 
from mine that I'm just going to have to paste what I'm looking at and 
you can tell me where I'm wrong.  These lines come from MPICH's 
ad_fstype.c.  The content is the same in openmpi-1.6.4, just shifted by 
25 or so lines:

637     ADIO_FileSysType_fncall(filename, &file_system, &myerrcode);
638     if (myerrcode != MPI_SUCCESS) {
639         *error_code = myerrcode;
640
[ giant comment omitted...]
651
652         MPI_Allreduce(error_code, &max_code, 1, MPI_INT, MPI_MAX, comm);
653         if (max_code != MPI_SUCCESS)  {
654             *error_code = max_code;
655             return;
656         }
657         /* ensure everyone came up with the same file system type */
658         MPI_Allreduce(&file_system, &min_code, 1, MPI_INT,
659                 MPI_MIN, comm);
660         if (min_code == ADIO_NFS) file_system = ADIO_NFS;
661     }


> MPI_Allreduce() is *not* called, and each process just proceeds with their local value for file_system.

as you can see, there is an ALLREDUCE on the error code.  Did anyone run 
into problems when we stat-ed the file system?  If so, bail out early. 
If not, then let's figure out the common file system with a second 
ALLREDUCE.
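
To make the intent of those two reductions concrete, here is a minimal sketch in plain C that simulates them over an array of per-rank values instead of calling MPI. It is not the ROMIO code: the names resolve_fs_type, sim_allreduce_max and sim_allreduce_min are hypothetical, the FS_NFS/FS_UFS and ERR_* values are placeholders, and the only properties it relies on are that success is the smallest error code and that the NFS constant is smaller than the UFS one (which is what the "min_code == ADIO_NFS" test on line 660 implies).

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for illustration only; the real constants are
 * defined by MPI and ROMIO.  Assumptions: success is the smallest
 * error code, and FS_NFS < FS_UFS. */
enum { ERR_SUCCESS = 0, ERR_IO = 1 };
enum { FS_NFS = 150, FS_UFS = 152 };

/* Stand-in for an MPI_MAX allreduce over one int per rank. */
static int sim_allreduce_max(const int *vals, size_t n)
{
    int m = vals[0];
    for (size_t i = 1; i < n; i++)
        if (vals[i] > m) m = vals[i];
    return m;
}

/* Stand-in for an MPI_MIN allreduce over one int per rank. */
static int sim_allreduce_min(const int *vals, size_t n)
{
    int m = vals[0];
    for (size_t i = 1; i < n; i++)
        if (vals[i] < m) m = vals[i];
    return m;
}

/* What rank `rank` ends up with, given every rank's local error code
 * (err[]) and locally detected file system type (fs[]).  Returns the
 * agreed error code; *fs_out is written only on success. */
static int resolve_fs_type(const int *err, const int *fs, size_t nprocs,
                           size_t rank, int *fs_out)
{
    assert(nprocs > 0 && rank < nprocs);

    /* First reduction: did anyone fail to stat the file system? */
    int max_code = sim_allreduce_max(err, nprocs);
    if (max_code != ERR_SUCCESS)
        return max_code;            /* everyone bails out together */

    /* Second reduction: if any rank saw NFS, all ranks use NFS;
     * otherwise each rank keeps its local answer. */
    int min_code = sim_allreduce_min(fs, nprocs);
    *fs_out = (min_code == FS_NFS) ? FS_NFS : fs[rank];
    return ERR_SUCCESS;
}
```

With three ranks that detect {UFS, NFS, NFS} and no errors, every rank comes away with NFS; if any one rank reports an error, every rank returns that error and no file system type is chosen.  Both outcomes depend on every rank reaching both reductions, which is exactly the point in question.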

>  That's where things go downhill: one process will get UFS, the rest will get NFS.

Yeah, that would be a problem, and I'll gladly fix this code, but I 
can't see where we are failing.  Am I looking in the right place?

==rob




-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


