[mpich-discuss] MPICH3 and Nagfor: Corrupts writing/IO?

Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] matthew.thompson at nasa.gov
Tue Jan 17 12:18:43 CST 2017


Ken,

I just tried your configure (with --enable-cxx) and it seemed to work! Thanks!
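
(For anyone who wants to double-check a rebuilt stack the same way, the quickest sanity checks I know of are mpichversion, which prints the configure options the install was built with, and mpifort -show, which prints the underlying compiler command the wrapper would run. Roughly something like the below; the file names are just the test programs from earlier in the thread:)

    $ mpichversion | grep -i configure     # confirm which configure options this install was built with
    $ mpifort -show helloWorld.F90         # print the nagfor command line the wrapper would invoke
    $ mpifort -o helloWorld.exe helloWorld.F90 && mpirun -np 4 ./helloWorld.exe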

I suppose I should try to figure out which flag caused the corruption,
but it's working now and I don't really want to poke the bear.
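
(If I ever do poke it, the plan would be to start from your minimal configure and add my flags back one at a time until it breaks again. Something like the sketch below, with -abi=64 as the first suspect; the prefix is just a scratch path for illustration:)

    # known-good minimal configure, plus one re-added suspect flag per rebuild
    ./configure --prefix=$PWD/test-abi64 --disable-wrapper-rpath \
        CC=gcc CXX=g++ FC=nagfor F77=nagfor \
        FCFLAGS='-mismatch -abi=64' FFLAGS='-mismatch -abi=64' \
        --enable-fortran=all --enable-cxx
    make -j4 && make install
    mpifort test.F90 && mpirun -np 1 ./a.out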

Matt


On 01/12/2017 04:46 PM, Kenneth Raffenetti wrote:
> Can you check your compiler flags and make sure they are all necessary?
> I was able to reproduce the error locally with your settings, but a more
> default configuration works fine. I.e.
>
> ./configure --prefix=$PWD/i --disable-wrapper-rpath CC=gcc CXX=g++ \
>     FC=nagfor F77=nagfor FCFLAGS=-mismatch FFLAGS=-mismatch \
>     --enable-fortran=all
>
> Ken
>
> On 01/12/2017 10:54 AM, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND
> APPLICATIONS INC] wrote:
>> All,
>>
>> I've been having some "fun" recently trying to get an MPI stack built
>> with nagfor 6.1. I've tried Open MPI and MVAPICH2 and failed to even
>> build the MPI stack, SGI MPT doesn't like nagfor, and I'm guessing
>> Intel MPI wouldn't either.
>>
>> So, I figured I'd go for MPICH3. And, lo and behold, building with
>> nagfor 6.1 and gcc 5.3 (for CC and CXX) with:
>>
>>> ./configure \
>>>     --prefix=$SWDEV/MPI/mpich/3.2/nagfor_6.1-gcc_5.3-nomismatchall \
>>>     --disable-wrapper-rpath CC=gcc CXX=g++ FC=nagfor F77=nagfor \
>>>     CFLAGS='-fpic -m64' CXXFLAGS='-fpic -m64' \
>>>     FCFLAGS='-PIC -abi=64' FFLAGS='-PIC -abi=64 -mismatch' \
>>>     --enable-fortran=all --enable-cxx
>>
>> I got something to build. Huzzah!
>>
>> I then tried the cpi test, and it worked! According to mpirun -verbose,
>> it even detected that I was on Slurm.
>>
>> I then tried a simple Fortran 90 Hello world program and...crash:
>>
>>> (1211)(master) $ cat helloWorld.F90
>>> program hello_world
>>>
>>>    use mpi
>>>
>>>    implicit none
>>>
>>>    integer :: comm
>>>    integer :: myid, npes, ierror
>>>    integer :: name_length
>>>
>>>    character(len=MPI_MAX_PROCESSOR_NAME) :: processor_name
>>>
>>>    call mpi_init(ierror)
>>>
>>>    comm = MPI_COMM_WORLD
>>>
>>>    call MPI_Comm_Rank(comm,myid,ierror)
>>>    call MPI_Comm_Size(comm,npes,ierror)
>>>    call MPI_Get_Processor_Name(processor_name,name_length,ierror)
>>>
>>>    write (*,'(A,1X,I4,1X,A,1X,I4,1X,A,1X,A)') "Process", myid, "of", &
>>>         npes, "is on", trim(processor_name)
>>>
>>>    call MPI_Finalize(ierror)
>>>
>>> end program hello_world
>>> (1212)(master) $ mpifort -o helloWorld.exe helloWorld.F90
>>> NAG Fortran Compiler Release 6.1(Tozai) Build 6113
>>> [NAG Fortran Compiler normal termination]
>>> (1213)(master) $ mpirun -np 4 ./helloWorld.exe
>>> srun.slurm: cluster configuration lacks support for cpu binding
>>> Runtime Error: Buffer overflow on output
>>> Program terminated by I/O error on unit 6
>>> (Output_Unit,Unformatted,Direct)
>>> Runtime Error: Buffer overflow on output
>>> Program terminated by I/O error on unit 6
>>> (Output_Unit,Unformatted,Direct)
>>> Runtime Error: Buffer overflow on output
>>> Program terminated by I/O error on unit Runtime Error: Buffer overflow
>>> on output
>>> Program terminated by I/O error on unit 6
>>> (Output_Unit,Unformatted,Direct)
>>> 6 (Output_Unit,Unformatted,Direct)
>>>
>>> ===================================================================================
>>>
>>>
>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>> =   PID 7642 RUNNING AT borgl189
>>> =   EXIT CODE: 134
>>> =   CLEANING UP REMAINING PROCESSES
>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>> ===================================================================================
>>>
>>>
>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
>>> This typically refers to a problem with your application.
>>> Please see the FAQ page for debugging suggestions
>>
>> Weird. So, I decided to try something different, this program:
>>
>>> program main
>>>    implicit none
>>>    real :: a
>>>    a = 1240.0
>>>    write (*,*) "Hello world", a
>>> end program main
>>
>> It looks boring, it's standard-conforming, and nagfor likes it:
>>
>>> (1226) $ nagfor test.F90 && ./a.out
>>> NAG Fortran Compiler Release 6.1(Tozai) Build 6113
>>> [NAG Fortran Compiler normal termination]
>>>  Hello world   1.2400000E+03
>>
>> Looks correct. Now let's try mpifort:
>>
>>> (1232) $ mpifort test.F90 && ./a.out
>>> NAG Fortran Compiler Release 6.1(Tozai) Build 6113
>>> [NAG Fortran Compiler normal termination]
>>>  Hello world
>>> Segmentation fault (core dumped)
>>
>> You can't really see it here, but that "Hello world" is surrounded by LF
>> characters, like literal line feeds... and then it core dumps.
>>
>> Now let's try running with mpirun as well:
>>
>>> (1233) $ mpifort test.F90 && mpirun -np 1 ./a.out
>>> NAG Fortran Compiler Release 6.1(Tozai) Build 6113
>>> [NAG Fortran Compiler normal termination]
>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>
>>> ===================================================================================
>>>
>>>
>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>> =   PID 8520 RUNNING AT borgl189
>>> =   EXIT CODE: 139
>>> =   CLEANING UP REMAINING PROCESSES
>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>> ===================================================================================
>>>
>>>
>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
>>> (signal 11)
>>> This typically refers to a problem with your application.
>>> Please see the FAQ page for debugging suggestions
>>
>> All righty then.
>>
>> Does anyone have advice for this? I'll fully accept that I may have
>> configured MPICH3 wrong, since it's the first time in a while that I've
>> built MPICH (think MPICH2). But, still, I don't have any exciting flags.
>>
>> Matt
>>


-- 
Matt Thompson, SSAI, Sr Scientific Programmer/Analyst
NASA GSFC,    Global Modeling and Assimilation Office
Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771
Phone: 301-614-6712                 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

