[mpich-discuss] ROMIO+Lustre problems in OpenMPI 1.8.3
Rob Latham
robl at mcs.anl.gov
Tue Oct 28 09:32:29 CDT 2014
On 10/28/2014 06:00 AM, Paul Kapinos wrote:
> Dear Open MPI and ROMIO developers,
>
> We use Open MPI v.1.6.x and 1.8.x in our cluster.
> We have a Lustre file system and wish to use MPI_IO.
> So our Open MPI builds are configured with this flag:
> > --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre'
>
> In our newest installation openmpi/1.8.3 we found that MPI_IO is *broken*.
>
> A short search for the root of the evil brought the following to light:
>
> - the ROMIO component 'MCA io: romio' is missing entirely in the affected
> version, because
>
> - configure of ROMIO *failed* (cf. logs a, b, c),
> - because lustre_user.h was found but could not be compiled.
lustre_user.h cannot be compiled because the quota definitions it pulls
in (sys/quota.h, see the caddr_t error in log c) won't compile.
Ugh, what a mess.
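
To illustrate the failure mode, here is a minimal sketch. It assumes
glibc, where sys/types.h declares caddr_t only when the BSD/misc
extensions are enabled. Strict modes such as -std=c99, or defining only
an XOPEN_SOURCE-style macro, switch those extensions off, which matches
the 'identifier "caddr_t" is undefined' error in log (c). The function
name caddr_size is made up for this example, and using _GNU_SOURCE here
is an illustration only, not the ROMIO fix.

```c
/* Assumption: glibc. The feature test macro must be defined before the
 * first system header is included, otherwise it has no effect. */
#define _GNU_SOURCE
#include <sys/types.h>

/* Hypothetical helper: compiles only if caddr_t is visible. Without the
 * macro above, "icc -std=c99" or "gcc -std=c99" would typically reject
 * this, just as sys/quota.h is rejected in the configure test. */
size_t caddr_size(void)
{
    return sizeof(caddr_t);
}
```

Removing the define and compiling with -std=c99 should reproduce the
same class of error on a glibc system.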
A while back I noticed this and fixed it by removing an XOPEN_SOURCE
feature test macro:
http://trac.mpich.org/projects/mpich/ticket/1973
Then, on Solaris with --enable-strict we needed to put *back* the
XOPEN_SOURCE macro or else pread and pwrite would be undefined.
So what I really need to do is delete XOPEN_SOURCE, since it causes such
headaches. On the rare platforms that define pread/pwrite only if you
take extraordinary measures, if at all, I'll have a ROMIO pread and
pwrite that simply do seek + write (or read).
For now, please delete the XOPEN_SOURCE line at the very beginning of
src/mpi/romio/adio/ad_lustre/ad_lustre_rwcontig.c
==rob
>
>
> In our system, there are two lustre_user.h available:
> $ locate lustre_user.h
> /usr/include/linux/lustre_user.h
> /usr/include/lustre/lustre_user.h
> As I'm not very familiar with Lustre, I just attach both of them.
>
> pk224850 at cluster:~[509]$ uname -a
> Linux cluster.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue
> Sep 9 13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux
>
> pk224850 at cluster:~[510]$ cat /etc/issue
> Scientific Linux release 6.5 (Carbon)
>
> Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our
> environment.
>
> Best
>
> Paul Kapinos
>
> P.S. Is there a configure flag that will enforce ROMIO? That is, when
> ROMIO is not available, configure would fail. This would make such hidden
> errors visible at installation time.
>
>
>
>
>
>
> a) Log in Open MPI's config.log:
> ------------------------------------------------------------------------------
>
> configure:226781: OMPI configuring in ompi/mca/io/romio/romio
> configure:226866: running /bin/sh './configure'
> --with-file-system=testfs+ufs+nfs+lustre FROM_OMPI=yes CC="icc
> -std=c99" CFLAGS="-DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2
> -m64 -finline-functions -fno-strict-aliasing -restrict -fexceptions
> -Qoption,cpp,--extended_float_types -pthread" CPPFLAGS="
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include"
> FFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64 "
> LDFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64
> -fexceptions " --enable-shared --disable-static
> --with-file-system=testfs+ufs+nfs+lustre
> --prefix=/opt/MPI/openmpi-1.8.3/linux/intel --disable-aio
> --cache-file=/dev/null --srcdir=. --disable-option-checking
> configure:226876: /bin/sh './configure' *failed* for
> ompi/mca/io/romio/romio
> configure:226911: WARNING: ROMIO distribution did not configure
> successfully
> configure:227425: checking if MCA component io:romio can compile
> configure:227427: result: no
> ------------------------------------------------------------------------------
>
>
>
>
> b) dump of Open MPI's 'configure' output to the console:
> ------------------------------------------------------------------------------
>
> checking lustre/lustre_user.h usability... no
> checking lustre/lustre_user.h presence... yes
> configure: WARNING: lustre/lustre_user.h: present but cannot be compiled
> configure: WARNING: lustre/lustre_user.h: check for missing
> prerequisite headers?
> configure: WARNING: lustre/lustre_user.h: see the Autoconf documentation
> configure: WARNING: lustre/lustre_user.h: section "Present But
> Cannot Be Compiled"
> configure: WARNING: lustre/lustre_user.h: proceeding with the compiler's
> result
> configure: WARNING: ## -------------------------------- ##
> configure: WARNING: ## Report this to discuss at mpich.org ##
> configure: WARNING: ## -------------------------------- ##
> checking for lustre/lustre_user.h... no
> configure: error: LUSTRE support requested but cannot find
> lustre/lustre_user.h header file
> configure: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
> configure: WARNING: ROMIO distribution did not configure successfully
> checking if MCA component io:romio can compile... no
> ------------------------------------------------------------------------------
>
>
> c) ompi/mca/io/romio/romio's config.log:
> ------------------------------------------------------------------------------
>
> configure:20962: checking lustre/lustre_user.h usability
> configure:20962: icc -std=c99 -c -DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1
> -fp-model fast=2 -m64 -finline-functions -fno-strict-aliasing -restrict
> -fexceptions -Qoption,cpp,--extended_float_types -pthread
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include
> conftest.c >&5
> /usr/include/sys/quota.h(221): error: identifier "caddr_t" is undefined
> caddr_t __addr) __THROW;
> ^
>
> compilation aborted for conftest.c (code 2)
> configure:20962: $? = 2
> configure: failed program was:
> | /* confdefs.h */
> | #define PACKAGE_NAME "ROMIO"
> | #define PACKAGE_TARNAME "romio"
> | #define PACKAGE_VERSION "Open MPI"
> | #define PACKAGE_STRING "ROMIO Open MPI"
> | #define PACKAGE_BUGREPORT "discuss at mpich.org"
> | #define PACKAGE_URL "http://www.mpich.org/"
> | #define PACKAGE "romio"
> | #define VERSION "Open MPI"
> | #define STDC_HEADERS 1
> | #define HAVE_SYS_TYPES_H 1
> | #define HAVE_SYS_STAT_H 1
> | #define HAVE_STDLIB_H 1
> | #define HAVE_STRING_H 1
> | #define HAVE_MEMORY_H 1
> | #define HAVE_STRINGS_H 1
> | #define HAVE_INTTYPES_H 1
> | #define HAVE_STDINT_H 1
> | #define HAVE_UNISTD_H 1
> | #define HAVE_DLFCN_H 1
> | #define LT_OBJDIR ".libs/"
> | #define HAVE_MPI_OFFSET 1
> | #define HAVE_MEMALIGN 1
> | #define HAVE_UNISTD_H 1
> | #define HAVE_FCNTL_H 1
> | #define HAVE_MALLOC_H 1
> | #define HAVE_STDDEF_H 1
> | #define HAVE_SYS_TYPES_H 1
> | #define u_char unsigned char
> | #define u_short unsigned short
> | #define u_int unsigned int
> | #define u_long unsigned long
> | #define SIZEOF_INT 4
> | #define SIZEOF_VOID_P 8
> | #define INT_LT_POINTER 1
> | #define HAVE_INT_LT_POINTER 1
> | #define SIZEOF_LONG_LONG 8
> | #define HAVE_LONG_LONG_64 1
> | #define HAVE_MPI_LONG_LONG_INT 1
> | #define HAVE_MPI_INFO 1
> | #define ROMIO_NFS 1
> | #define ROMIO_UFS 1
> | #define ROMIO_TESTFS 1
> | /* end confdefs.h. */
> | #include <stdio.h>
> | #ifdef HAVE_SYS_TYPES_H
> | # include <sys/types.h>
> | #endif
> | #ifdef HAVE_SYS_STAT_H
> | # include <sys/stat.h>
> | #endif
> | #ifdef STDC_HEADERS
> | # include <stdlib.h>
> | # include <stddef.h>
> | #else
> | # ifdef HAVE_STDLIB_H
> | # include <stdlib.h>
> | # endif
> | #endif
> | #ifdef HAVE_STRING_H
> | # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
> | # include <memory.h>
> | # endif
> | # include <string.h>
> | #endif
> | #ifdef HAVE_STRINGS_H
> | # include <strings.h>
> | #endif
> | #ifdef HAVE_INTTYPES_H
> | # include <inttypes.h>
> | #endif
> | #ifdef HAVE_STDINT_H
> | # include <stdint.h>
> | #endif
> | #ifdef HAVE_UNISTD_H
> | # include <unistd.h>
> | #endif
> | #include <lustre/lustre_user.h>
> configure:20962: result: no
> configure:20962: checking lustre/lustre_user.h presence
> configure:20962: icc -std=c99 -E
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include
> conftest.c
> configure:20962: $? = 0
> configure:20962: result: yes
> configure:20962: WARNING: lustre/lustre_user.h: present but cannot be
> compiled
> configure:20962: WARNING: lustre/lustre_user.h: check for missing
> prerequisite headers?
> configure:20962: WARNING: lustre/lustre_user.h: see the Autoconf
> documentation
> configure:20962: WARNING: lustre/lustre_user.h: section "Present But
> Cannot Be Compiled"
> configure:20962: WARNING: lustre/lustre_user.h: proceeding with the
> compiler's result
> configure:20962: checking for lustre/lustre_user.h
> configure:20962: result: no
> configure:20971: error: LUSTRE support requested but cannot find
> lustre/lustre_user.h header file
> ------------------------------------------------------------------------------
>
>
>
>
>
>
>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA