[mpich-discuss] [OMPI devel] ROMIO+Lustre problems in OpenMPI 1.8.3

Paul Kapinos kapinos at itc.rwth-aachen.de
Thu Oct 30 11:51:33 CDT 2014


Hello Howard,

Version 1.8.1, installed on Jun 27 this year, runs fine; ROMIO is OK.

Trying to re-run using the same install script, I found that the 1.8.1
version of Open MPI now also *cannot* build ROMIO support. Wow.

That means the regression is not (or not only) in Open MPI's ROMIO, but
depends on our Linux/kernel/Lustre setup. At first look: we have a new kernel
and a new /usr/include/sys/quota.h, and we probably updated from SL6.4 to SL6.5.

- Which information about our Linux system do you need?
- Would a guest login help you get an in-depth look?

Best

Paul Kapinos

Attached: some logs from the installation on 27.05 and from today's attempt, and
quota.h (changed on 29.09). Note that the kernel also changed (and maybe the
Scientific Linux version, from 6.4 to 6.5?)

pk224850 at cluster:~[502]$ ls -la /usr/include/sys/quota.h
-rw-r--r-- 1 root root 7903 Aug 29 21:11 /usr/include/sys/quota.h
pk224850 at cluster:~[503]$ uname -a
Linux cluster.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9 
13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux
pk224850 at cluster:~[504]$ cat /etc/issue
Scientific Linux release 6.5 (Carbon)
Kernel \r on an \m



On 10/29/14 19:06, Howard Pritchard wrote:
> Hi Paul,
>
> Thanks for the forward.  I've opened an issue #255
> <https://github.com/open-mpi/ompi/issues/255> to track the ROMIO config regression.
>
> Just to make sure, older releases of the 1.8 branch still configure and build
> properly with your
> current lustre setup?
>
> Thanks,
>
> Howard
>
>
> 2014-10-28 5:00 GMT-06:00 Paul Kapinos <kapinos at itc.rwth-aachen.de>:
>
>     Dear Open MPI and ROMIO developers,
>
>     We use Open MPI v1.6.x and 1.8.x on our cluster.
>     We have a Lustre file system and wish to use MPI_IO,
>     so our Open MPI builds are configured with this flag:
>      > --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre'
>
>     In our newest installation, openmpi/1.8.3, we found that MPI_IO is *broken*.
>
>     A short search for the root cause brought the following to light:
>
>     - the ROMIO component 'MCA io: romio' is missing entirely from the affected
>     version, because
>
>     - configure of ROMIO has *failed* (cf. logs a, b, c),
>     - because lustre_user.h was found but could not be compiled.
>
>
>     On our system, two copies of lustre_user.h are available:
>     $ locate lustre_user.h
>     /usr/include/linux/lustre_user.h
>     /usr/include/lustre/lustre_user.h
>     As I'm not very familiar with Lustre, I just attach both of them.
>
>     pk224850 at cluster:~[509]$ uname -a
>     Linux cluster.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9
>     13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux
>
>     pk224850 at cluster:~[510]$ cat /etc/issue
>     Scientific Linux release 6.5 (Carbon)
>
>     Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our environment.
>
>     Best
>
>     Paul Kapinos
>
>     P.S. Is there a configure flag that enforces ROMIO? That is, when ROMIO is
>     not available, configure would fail. This would make such hidden errors
>     visible at installation time.
>
>
>
>
>
>
>     a) Log in Open MPI's config.log:
>     ------------------------------------------------------------------------------
>     configure:226781: OMPI configuring in ompi/mca/io/romio/romio
>     configure:226866: running /bin/sh './configure'
>     --with-file-system=testfs+ufs+nfs+lustre  FROM_OMPI=yes CC="icc -std=c99"
>     CFLAGS="-DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64
>     -finline-functions -fno-strict-aliasing -restrict -fexceptions
>     -Qoption,cpp,--extended_float_types -pthread" CPPFLAGS="
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include"
>     FFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64  " LDFLAGS="-O3
>     -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64   -fexceptions "
>     --enable-shared --disable-static --with-file-system=testfs+ufs+nfs+lustre
>     --prefix=/opt/MPI/openmpi-1.8.3/linux/intel --disable-aio
>     --cache-file=/dev/null --srcdir=. --disable-option-checking
>     configure:226876: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
>     configure:226911: WARNING: ROMIO distribution did not configure successfully
>     configure:227425: checking if MCA component io:romio can compile
>     configure:227427: result: no
>     ------------------------------------------------------------------------------
>
>
>
>     b) dump of Open MPI's 'configure' output to the console:
>     ------------------------------------------------------------------------------
>     checking lustre/lustre_user.h usability... no
>     checking lustre/lustre_user.h presence... yes
>     configure: WARNING: lustre/lustre_user.h: present but cannot be compiled
>     configure: WARNING: lustre/lustre_user.h:     check for missing prerequisite
>     headers?
>     configure: WARNING: lustre/lustre_user.h: see the Autoconf documentation
>     configure: WARNING: lustre/lustre_user.h:     section "Present But Cannot Be
>     Compiled"
>     configure: WARNING: lustre/lustre_user.h: proceeding with the compiler's result
>     configure: WARNING:     ## -------------------------------- ##
>     configure: WARNING:     ## Report this to discuss at mpich.org ##
>     configure: WARNING:     ## -------------------------------- ##
>     checking for lustre/lustre_user.h... no
>     configure: error: LUSTRE support requested but cannot find
>     lustre/lustre_user.h header file
>     configure: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
>     configure: WARNING: ROMIO distribution did not configure successfully
>     checking if MCA component io:romio can compile... no
>     ------------------------------------------------------------------------------
>
>     c) ompi/mca/io/romio/romio's config.log:
>     ------------------------------------------------------------------------------
>     configure:20962: checking lustre/lustre_user.h usability
>     configure:20962: icc -std=c99 -c -DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1
>     -fp-model fast=2 -m64 -finline-functions -fno-strict-aliasing -restrict
>     -fexceptions -Qoption,cpp,--extended_float_types -pthread
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include
>     conftest.c >&5
>     /usr/include/sys/quota.h(221): error: identifier "caddr_t" is undefined
>                           caddr_t __addr) __THROW;
>                           ^
>
>     compilation aborted for conftest.c (code 2)
>     configure:20962: $? = 2
>     configure: failed program was:
>     | /* confdefs.h */
>     | #define PACKAGE_NAME "ROMIO"
>     | #define PACKAGE_TARNAME "romio"
>     | #define PACKAGE_VERSION "Open MPI"
>     | #define PACKAGE_STRING "ROMIO Open MPI"
>     | #define PACKAGE_BUGREPORT "discuss at mpich.org"
>     | #define PACKAGE_URL "http://www.mpich.org/"
>     | #define PACKAGE "romio"
>     | #define VERSION "Open MPI"
>     | #define STDC_HEADERS 1
>     | #define HAVE_SYS_TYPES_H 1
>     | #define HAVE_SYS_STAT_H 1
>     | #define HAVE_STDLIB_H 1
>     | #define HAVE_STRING_H 1
>     | #define HAVE_MEMORY_H 1
>     | #define HAVE_STRINGS_H 1
>     | #define HAVE_INTTYPES_H 1
>     | #define HAVE_STDINT_H 1
>     | #define HAVE_UNISTD_H 1
>     | #define HAVE_DLFCN_H 1
>     | #define LT_OBJDIR ".libs/"
>     | #define HAVE_MPI_OFFSET 1
>     | #define HAVE_MEMALIGN 1
>     | #define HAVE_UNISTD_H 1
>     | #define HAVE_FCNTL_H 1
>     | #define HAVE_MALLOC_H 1
>     | #define HAVE_STDDEF_H 1
>     | #define HAVE_SYS_TYPES_H 1
>     | #define u_char unsigned char
>     | #define u_short unsigned short
>     | #define u_int unsigned int
>     | #define u_long unsigned long
>     | #define SIZEOF_INT 4
>     | #define SIZEOF_VOID_P 8
>     | #define INT_LT_POINTER 1
>     | #define HAVE_INT_LT_POINTER 1
>     | #define SIZEOF_LONG_LONG 8
>     | #define HAVE_LONG_LONG_64 1
>     | #define HAVE_MPI_LONG_LONG_INT 1
>     | #define HAVE_MPI_INFO 1
>     | #define ROMIO_NFS 1
>     | #define ROMIO_UFS 1
>     | #define ROMIO_TESTFS 1
>     | /* end confdefs.h.  */
>     | #include <stdio.h>
>     | #ifdef HAVE_SYS_TYPES_H
>     | # include <sys/types.h>
>     | #endif
>     | #ifdef HAVE_SYS_STAT_H
>     | # include <sys/stat.h>
>     | #endif
>     | #ifdef STDC_HEADERS
>     | # include <stdlib.h>
>     | # include <stddef.h>
>     | #else
>     | # ifdef HAVE_STDLIB_H
>     | #  include <stdlib.h>
>     | # endif
>     | #endif
>     | #ifdef HAVE_STRING_H
>     | # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
>     | #  include <memory.h>
>     | # endif
>     | # include <string.h>
>     | #endif
>     | #ifdef HAVE_STRINGS_H
>     | # include <strings.h>
>     | #endif
>     | #ifdef HAVE_INTTYPES_H
>     | # include <inttypes.h>
>     | #endif
>     | #ifdef HAVE_STDINT_H
>     | # include <stdint.h>
>     | #endif
>     | #ifdef HAVE_UNISTD_H
>     | # include <unistd.h>
>     | #endif
>     | #include <lustre/lustre_user.h>
>     configure:20962: result: no
>     configure:20962: checking lustre/lustre_user.h presence
>     configure:20962: icc -std=c99 -E
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
>     -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include
>     conftest.c
>     configure:20962: $? = 0
>     configure:20962: result: yes
>     configure:20962: WARNING: lustre/lustre_user.h: present but cannot be compiled
>     configure:20962: WARNING: lustre/lustre_user.h:     check for missing
>     prerequisite headers?
>     configure:20962: WARNING: lustre/lustre_user.h: see the Autoconf documentation
>     configure:20962: WARNING: lustre/lustre_user.h:     section "Present But
>     Cannot Be Compiled"
>     configure:20962: WARNING: lustre/lustre_user.h: proceeding with the
>     compiler's result
>     configure:20962: checking for lustre/lustre_user.h
>     configure:20962: result: no
>     configure:20971: error: LUSTRE support requested but cannot find
>     lustre/lustre_user.h header file
>     ------------------------------------------------------------------------------
>
>
>
>
>     --
>     Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
>     RWTH Aachen University, IT Center
>     Seffenter Weg 23,  D 52074  Aachen (Germany)
>     Tel: +49 241/80-24915
>
>     _______________________________________________
>     devel mailing list
>     devel at open-mpi.org
>     Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>     Link to this post:
>     http://www.open-mpi.org/community/lists/devel/2014/10/16106.php
>
>
>
>
> _______________________________________________
> devel mailing list
> devel at open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/10/16127.php
>


-- 
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 1_Warning.txt
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20141030/98bad910/attachment.txt>
-------------- next part --------------
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

