[mpich-discuss] [EXTERNAL] Re: Issue with MPICH 4.0b1 and MPI I/O and ROMIO...maybe?

Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] matthew.thompson at nasa.gov
Wed Dec 22 11:35:10 CST 2021


Rob,

Thanks. So for a cluster that is mainly "regular linux" (not sure what they run for the non-GPFS storage), GPFS, probably some tmpfs, and a whole bunch of NFS, what would you recommend as a good set of file systems? Would something like:

  --with-file-system=gpfs

be enough on its own, or do I need to list the other file systems explicitly as well?

Also, I don't see --with-file-system in the configure help. Does MPICH's configure "pass down" the option to (I assume) ROMIO?

Matt
-- 
Matt Thompson, SSAI, Ld Scientific Programmer/Analyst
NASA GSFC,    Global Modeling and Assimilation Office
Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771
Phone: 301-614-6712                 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson

On 12/20/21, 10:21 AM, "Latham, Robert J." <robl at mcs.anl.gov> wrote:

    On Fri, 2021-12-17 at 19:10 +0000, Thompson, Matt (GSFC-610.1)[SCIENCE
    SYSTEMS AND APPLICATIONS INC] via discuss wrote:
    > MPICH Discuss,
    > [...]
    > NetCDF: Error initializing for parallel access
    >  
    > I did a bit of debugging and found that the crash was due to an
    > environment variable that was set because my application mistakenly
    > thought I was running Intel MPI (mainly because we didn't have
    > detection for MPICH, so it defaulted to our "default" on this cluster
    > of Intel MPI). When it sees Intel MPI, it sets:
    >  
    >   ROMIO_FSTYPE_FORCE="gpfs:"
    >  
    > which we've found is useful when running with Intel MPI on our GPFS
    > system.

    Hah, I had no idea anyone besides me was using this little feature. Here's a bit more information for context:

    https://press3.mcs.anl.gov/romio/2019/02/20/useful-environment-variables/
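    As a quick illustration (just a sketch; the application name and path
    are placeholders), the variable bypasses ROMIO's normal file system
    detection and forces the named driver for every file:

        # force ROMIO's GPFS driver regardless of what file system
        # detection would report; "gpfs:" names the driver
        export ROMIO_FSTYPE_FORCE="gpfs:"
        mpiexec -n 4 ./my_mpiio_app /gpfs/scratch/outfile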


    > So, of course the "right" thing to do is not to set that. (Doctor, it
    > hurts when I do this. So stop doing that.)
    >  
    > But it got me wondering, is there perhaps a "better" way I should be
    > building MPICH? Should this flag cause this sort of crash? Or does it
    > mean I build MPICH/ROMIO incorrectly or incompletely (no GPFS
    > support, say)?

    Intel MPI has its own way of supporting extra file systems, via the "I_MPI_EXTRA_FILESYSTEM" environment variable.  

    https://press3.mcs.anl.gov/romio/2014/06/12/romio-and-intel-mpi/
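    If memory serves (double-check the docs for your Intel MPI version;
    newer releases may use I_MPI_EXTRA_FILESYSTEM_FORCE instead), enabling
    it looks something like:

        # turn on Intel MPI's native file system drivers, then pick
        # which one(s) to load
        export I_MPI_EXTRA_FILESYSTEM=on
        export I_MPI_EXTRA_FILESYSTEM_LIST=gpfs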

    I'm open to suggestions on how ROMIO should best handle this situation. You requested a file system (gpfs) that ROMIO did not support. ROMIO could fall back to its "generic Unix file system" driver, but you asked explicitly for GPFS, presumably for a good reason (or, in your case, by accident; ROMIO is not a mind reader...).
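    In the short term, the fix on your end is the one you already found:
    don't force a driver this build doesn't have.

        # clear the override so ROMIO detects the file system itself
        # (and can fall back to its generic Unix driver)
        unset ROMIO_FSTYPE_FORCE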

    If you are building your own MPICH, add the `--with-file-system=...` flag. That takes a '+'-delimited list of file systems. For example:

    `--with-file-system=ufs+testfs+gpfs+lustre+panfs+pvfs2+unify`

    I try to build everything I can, so my list is quite long. Yours will be shorter -- if you pick a file system for which you do not have development headers and libraries, your build will fail (hopefully at configure time).
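    To make that concrete, here is a build sketch (the install prefix and
    the exact file system list are placeholders; pick the file systems
    your cluster actually has headers for). MPICH's configure hands the
    flag down to ROMIO for you, which also answers your other question:

        # hypothetical MPICH build with a trimmed file system list
        ./configure --prefix=$HOME/sw/mpich-4.0b1 \
                    --with-file-system=ufs+nfs+gpfs
        make -j8 && make install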

    ==rob


