[mpich-discuss] [OMPI devel] ROMIO+Lustre problems in OpenMPI 1.8.3

Rob Latham robl at mcs.anl.gov
Thu Feb 26 11:34:20 CST 2015



On 11/07/2014 06:26 AM, Ralph Castain wrote:
> Hi Rob
>
> Following up on this: I cannot find any reference to XOPEN_SOURCE in our included ROMIO source for Lustre. I only found one reference anywhere in ROMIO:
>
> romio/adio/ad_xfs/ad_xfs.h:11:#define _XOPEN_SOURCE 500
>
> Any other suggestions on what could be causing the problem?

I've fixed this in ROMIO by not mucking around with XOPEN_SOURCE at all, 
in either lustre or xfs or anywhere.

http://git.mpich.org/mpich.git/commit/4e80e1d2b
and
http://git.mpich.org/mpich.git/commit/5a10283bf7
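
For readers following along: the fallback described in the quoted October message below (a pread and pwrite that simply do seek + read or write) could look roughly like the sketch here. This is an illustration only, not the code from the commits above. The names sketch_pread, sketch_pwrite and sketch_selfcheck are made up for this example and are not ROMIO functions. It uses only lseek(), read() and write(), and assumes those are visible without any feature test macro.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical name, not a ROMIO function: emulate pread() with
 * lseek() + read(). Unlike the real pread() this is not atomic, so it
 * is unsafe when several threads share the same descriptor. */
ssize_t sketch_pread(int fd, void *buf, size_t count, off_t offset)
{
    assert(buf != NULL);
    off_t saved = lseek(fd, 0, SEEK_CUR);   /* remember current position */
    if (saved == (off_t) -1)
        return -1;
    if (lseek(fd, offset, SEEK_SET) == (off_t) -1)
        return -1;
    ssize_t n = read(fd, buf, count);
    int io_errno = errno;                   /* keep read()'s errno */
    if (lseek(fd, saved, SEEK_SET) == (off_t) -1)
        return -1;
    errno = io_errno;
    return n;
}

/* Hypothetical name: emulate pwrite() with lseek() + write(). */
ssize_t sketch_pwrite(int fd, const void *buf, size_t count, off_t offset)
{
    assert(buf != NULL);
    off_t saved = lseek(fd, 0, SEEK_CUR);
    if (saved == (off_t) -1)
        return -1;
    if (lseek(fd, offset, SEEK_SET) == (off_t) -1)
        return -1;
    ssize_t n = write(fd, buf, count);
    int io_errno = errno;                   /* keep write()'s errno */
    if (lseek(fd, saved, SEEK_SET) == (off_t) -1)
        return -1;
    errno = io_errno;
    return n;
}

/* Self-check: write "hello world" at offset 0, read "world" back from
 * offset 6, and confirm the file position was left where it started.
 * Returns 0 on success, -1 on failure. */
int sketch_selfcheck(const char *path)
{
    char out[6] = {0};
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int ok = sketch_pwrite(fd, "hello world", 11, 0) == 11
          && sketch_pread(fd, out, 5, 6) == 5
          && strcmp(out, "world") == 0
          && lseek(fd, 0, SEEK_CUR) == 0;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Calling sketch_selfcheck() with a scratch file name from a small main() is enough to try it. The sketch restores the file position so callers see pread-like behaviour, but the seek and the transfer are two separate system calls, so it is only a stand-in for platforms where the real functions are not declared.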
==rob

>
> Thanks
> Ralph
>
>
>> On Oct 28, 2014, at 7:32 AM, Rob Latham <robl at mcs.anl.gov> wrote:
>>
>>
>>
>> On 10/28/2014 06:00 AM, Paul Kapinos wrote:
>>> Dear Open MPI and ROMIO developers,
>>>
>>> We use Open MPI v.1.6.x and 1.8.x in our cluster.
>>> We have a Lustre file system; we wish to use MPI_IO.
>>> So our Open MPI builds are configured with this flag:
>>>> --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre'
>>>
>>> In our newest installation openmpi/1.8.3 we found that MPI_IO is *broken*.
>>>
>>> A short search for the root of the problem brought the following to light:
>>>
>>> - the ROMIO component 'MCA io: romio' is missing entirely from the affected
>>> version, because
>>>
>>> - configure of ROMIO has *failed* (cf. logs a, b, c below),
>>> - because lustre_user.h was found but could not be compiled.
>>
>> lustre_user.h cannot be compiled because quota defines won't compile. Ugh, what a mess.
>>
>> A while back I noticed this and fixed it by removing an XOPEN_SOURCE feature test macro:
>>
>> http://trac.mpich.org/projects/mpich/ticket/1973
>>
>> Then, on Solaris with --enable-strict, we needed to put *back* the XOPEN_SOURCE macro or else pread and pwrite would be undefined.
>>
>> So what I really need to do is delete XOPEN_SOURCE, since it causes such headaches. On the rare platforms that only define pread/pwrite if you take extraordinary measures, if at all, I'll have a ROMIO pread and pwrite that simply do seek + write (or read).
>>
>> For now, please delete the XOPEN_SOURCE line at the very beginning of src/mpi/romio/adio/ad_lustre/ad_lustre_rwcontig.c
>>
>> ==rob
>>
>>
>>>
>>>
>>> In our system, there are two lustre_user.h available:
>>> $ locate lustre_user.h
>>> /usr/include/linux/lustre_user.h
>>> /usr/include/lustre/lustre_user.h
>>> As I'm not very familiar with Lustre, I just attach both of them.
>>>
>>> pk224850 at cluster:~[509]$ uname -a
>>> Linux cluster.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue
>>> Sep 9 13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> pk224850 at cluster:~[510]$ cat /etc/issue
>>> Scientific Linux release 6.5 (Carbon)
>>>
>>> Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our
>>> environment.
>>>
>>> Best
>>>
>>> Paul Kapinos
>>>
>>> P.S. Is there a configure flag that enforces ROMIO? That is, when
>>> ROMIO is not available, configure would fail. This would make such hidden
>>> errors visible at installation time.
>>>
>>>
>>>
>>>
>>>
>>>
>>> a) Log in Open MPI's config.log:
>>> ------------------------------------------------------------------------------
>>>
>>> configure:226781: OMPI configuring in ompi/mca/io/romio/romio
>>> configure:226866: running /bin/sh './configure'
>>> --with-file-system=testfs+ufs+nfs+lustre  FROM_OMPI=yes CC="icc
>>> -std=c99" CFLAGS="-DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2
>>> -m64 -finline-functions -fno-strict-aliasing -restrict -fexceptions
>>> -Qoption,cpp,--extended_float_types -pthread" CPPFLAGS="
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include"
>>> FFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64  "
>>> LDFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64
>>> -fexceptions " --enable-shared --disable-static
>>> --with-file-system=testfs+ufs+nfs+lustre
>>> --prefix=/opt/MPI/openmpi-1.8.3/linux/intel --disable-aio
>>> --cache-file=/dev/null --srcdir=. --disable-option-checking
>>> configure:226876: /bin/sh './configure' *failed* for
>>> ompi/mca/io/romio/romio
>>> configure:226911: WARNING: ROMIO distribution did not configure
>>> successfully
>>> configure:227425: checking if MCA component io:romio can compile
>>> configure:227427: result: no
>>> ------------------------------------------------------------------------------
>>>
>>>
>>>
>>>
>>> b) dump of Open MPI's 'configure' output to the console:
>>> ------------------------------------------------------------------------------
>>>
>>> checking lustre/lustre_user.h usability... no
>>> checking lustre/lustre_user.h presence... yes
>>> configure: WARNING: lustre/lustre_user.h: present but cannot be compiled
>>> configure: WARNING: lustre/lustre_user.h:     check for missing
>>> prerequisite headers?
>>> configure: WARNING: lustre/lustre_user.h: see the Autoconf documentation
>>> configure: WARNING: lustre/lustre_user.h:     section "Present But
>>> Cannot Be Compiled"
>>> configure: WARNING: lustre/lustre_user.h: proceeding with the compiler's
>>> result
>>> configure: WARNING:     ## -------------------------------- ##
>>> configure: WARNING:     ## Report this to discuss at mpich.org ##
>>> configure: WARNING:     ## -------------------------------- ##
>>> checking for lustre/lustre_user.h... no
>>> configure: error: LUSTRE support requested but cannot find
>>> lustre/lustre_user.h header file
>>> configure: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
>>> configure: WARNING: ROMIO distribution did not configure successfully
>>> checking if MCA component io:romio can compile... no
>>> ------------------------------------------------------------------------------
>>>
>>>
>>> c) ompi/mca/io/romio/romio's config.log:
>>> ------------------------------------------------------------------------------
>>>
>>> configure:20962: checking lustre/lustre_user.h usability
>>> configure:20962: icc -std=c99 -c -DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1
>>> -fp-model fast=2 -m64 -finline-functions -fno-strict-aliasing -restrict
>>> -fexceptions -Qoption,cpp,--extended_float_types -pthread
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include
>>> conftest.c >&5
>>> /usr/include/sys/quota.h(221): error: identifier "caddr_t" is undefined
>>>                 caddr_t __addr) __THROW;
>>>                 ^
>>>
>>> compilation aborted for conftest.c (code 2)
>>> configure:20962: $? = 2
>>> configure: failed program was:
>>> | /* confdefs.h */
>>> | #define PACKAGE_NAME "ROMIO"
>>> | #define PACKAGE_TARNAME "romio"
>>> | #define PACKAGE_VERSION "Open MPI"
>>> | #define PACKAGE_STRING "ROMIO Open MPI"
>>> | #define PACKAGE_BUGREPORT "discuss at mpich.org"
>>> | #define PACKAGE_URL "http://www.mpich.org/"
>>> | #define PACKAGE "romio"
>>> | #define VERSION "Open MPI"
>>> | #define STDC_HEADERS 1
>>> | #define HAVE_SYS_TYPES_H 1
>>> | #define HAVE_SYS_STAT_H 1
>>> | #define HAVE_STDLIB_H 1
>>> | #define HAVE_STRING_H 1
>>> | #define HAVE_MEMORY_H 1
>>> | #define HAVE_STRINGS_H 1
>>> | #define HAVE_INTTYPES_H 1
>>> | #define HAVE_STDINT_H 1
>>> | #define HAVE_UNISTD_H 1
>>> | #define HAVE_DLFCN_H 1
>>> | #define LT_OBJDIR ".libs/"
>>> | #define HAVE_MPI_OFFSET 1
>>> | #define HAVE_MEMALIGN 1
>>> | #define HAVE_UNISTD_H 1
>>> | #define HAVE_FCNTL_H 1
>>> | #define HAVE_MALLOC_H 1
>>> | #define HAVE_STDDEF_H 1
>>> | #define HAVE_SYS_TYPES_H 1
>>> | #define u_char unsigned char
>>> | #define u_short unsigned short
>>> | #define u_int unsigned int
>>> | #define u_long unsigned long
>>> | #define SIZEOF_INT 4
>>> | #define SIZEOF_VOID_P 8
>>> | #define INT_LT_POINTER 1
>>> | #define HAVE_INT_LT_POINTER 1
>>> | #define SIZEOF_LONG_LONG 8
>>> | #define HAVE_LONG_LONG_64 1
>>> | #define HAVE_MPI_LONG_LONG_INT 1
>>> | #define HAVE_MPI_INFO 1
>>> | #define ROMIO_NFS 1
>>> | #define ROMIO_UFS 1
>>> | #define ROMIO_TESTFS 1
>>> | /* end confdefs.h.  */
>>> | #include <stdio.h>
>>> | #ifdef HAVE_SYS_TYPES_H
>>> | # include <sys/types.h>
>>> | #endif
>>> | #ifdef HAVE_SYS_STAT_H
>>> | # include <sys/stat.h>
>>> | #endif
>>> | #ifdef STDC_HEADERS
>>> | # include <stdlib.h>
>>> | # include <stddef.h>
>>> | #else
>>> | # ifdef HAVE_STDLIB_H
>>> | #  include <stdlib.h>
>>> | # endif
>>> | #endif
>>> | #ifdef HAVE_STRING_H
>>> | # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
>>> | #  include <memory.h>
>>> | # endif
>>> | # include <string.h>
>>> | #endif
>>> | #ifdef HAVE_STRINGS_H
>>> | # include <strings.h>
>>> | #endif
>>> | #ifdef HAVE_INTTYPES_H
>>> | # include <inttypes.h>
>>> | #endif
>>> | #ifdef HAVE_STDINT_H
>>> | # include <stdint.h>
>>> | #endif
>>> | #ifdef HAVE_UNISTD_H
>>> | # include <unistd.h>
>>> | #endif
>>> | #include <lustre/lustre_user.h>
>>> configure:20962: result: no
>>> configure:20962: checking lustre/lustre_user.h presence
>>> configure:20962: icc -std=c99 -E
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
>>> -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include
>>> conftest.c
>>> configure:20962: $? = 0
>>> configure:20962: result: yes
>>> configure:20962: WARNING: lustre/lustre_user.h: present but cannot be
>>> compiled
>>> configure:20962: WARNING: lustre/lustre_user.h:     check for missing
>>> prerequisite headers?
>>> configure:20962: WARNING: lustre/lustre_user.h: see the Autoconf
>>> documentation
>>> configure:20962: WARNING: lustre/lustre_user.h:     section "Present But
>>> Cannot Be Compiled"
>>> configure:20962: WARNING: lustre/lustre_user.h: proceeding with the
>>> compiler's result
>>> configure:20962: checking for lustre/lustre_user.h
>>> configure:20962: result: no
>>> configure:20971: error: LUSTRE support requested but cannot find
>>> lustre/lustre_user.h header file
>>> ------------------------------------------------------------------------------
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>
>>
>> --
>> Rob Latham
>> Mathematics and Computer Science Division
>> Argonne National Lab, IL USA
>> _______________________________________________
>> devel mailing list
>> devel at open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/10/16109.php
>

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA