[mpich-discuss] _get_addr error while running application using MPICH

Zhou, Hui zhouh at anl.gov
Mon Nov 19 11:14:33 CST 2018


Hi Zhifeng,

We just had a new MPICH release: mpich-3.3rc1. You may try that
release to see if you still have the same error.

That aside, does your code use the MPI_T_ interfaces? You may try
searching for the MPI_T_ prefix in your code base. In particular, I am
interested in any MPI_T_ calls made before the MPI_Init call.
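
For reference, here is a minimal, self-contained sketch (not code from
your application) of what an MPI_T_ call before MPI_Init can look like;
the tools information interface may legally be used before MPI_Init, so
this is the kind of pattern worth locating:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int provided;

        /* MPI_T usage before MPI_Init -- the kind of call to look for */
        MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
        /* ... MPI_T queries such as MPI_T_cvar_get_num() could go here ... */
        MPI_T_finalize();

        MPI_Init(&argc, &argv);
        /* normal application work would follow */
        MPI_Finalize();
        return 0;
    }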

-- 
Hui Zhou

On Mon, Nov 19, 2018 at 10:39:20AM -0500, Zhifeng Yang wrote:
>Hi Hui,
>Here are the outputs. I tried the following commands:
>mpirun --version
>./cpi
>mpirun ./cpi
>mpirun -np 1 ./cpi
>
>[vy57456@maya-usr1 em_real]$mpirun --version
>HYDRA build details:
>    Version:                                 3.2.1
>    Release Date:                            Fri Nov 10 20:21:01 CST 2017
>    CC:                              gcc
>    CXX:                             g++
>    F77:                             gfortran
>    F90:                             gfortran
>    Configure options:                       '--disable-option-checking'
>'--prefix=/umbc/xfs1/zzbatmos/users/vy57456/application/gfortran/mpich-3.2.1'
>'CC=gcc' 'CXX=g++' 'FC=gfortran' 'F77=gfortran' '--cache-file=/dev/null'
>'--srcdir=.' 'CFLAGS= -O2' 'LDFLAGS=' 'LIBS=-lpthread ' 'CPPFLAGS=
>-I/home/vy57456/zzbatmos_user/application/gfortran/source_code/mpich-3.2.1/src/mpl/include
>-I/home/vy57456/zzbatmos_user/application/gfortran/source_code/mpich-3.2.1/src/mpl/include
>-I/home/vy57456/zzbatmos_user/application/gfortran/source_code/mpich-3.2.1/src/openpa/src
>-I/home/vy57456/zzbatmos_user/application/gfortran/source_code/mpich-3.2.1/src/openpa/src
>-D_REENTRANT
>-I/home/vy57456/zzbatmos_user/application/gfortran/source_code/mpich-3.2.1/src/mpi/romio/include'
>'MPLLIBNAME=mpl'
>    Process Manager:                         pmi
>    Launchers available:                     ssh rsh fork slurm ll lsf sge
>manual persist
>    Topology libraries available:            hwloc
>    Resource management kernels available:   user slurm ll lsf sge pbs
>cobalt
>    Checkpointing libraries available:
>    Demux engines available:                 poll select
>
>
>[vy57456@maya-usr1 examples]$./cpi
>Process 0 of 1 is on maya-usr1
>pi is approximately 3.1415926544231341, Error is 0.0000000008333410
>wall clock time = 0.000066
>
>
>[vy57456@maya-usr1 examples]$mpirun ./cpi
>Process 0 of 1 is on maya-usr1
>pi is approximately 3.1415926544231341, Error is 0.0000000008333410
>wall clock time = 0.000095
>
>[vy57456@maya-usr1 examples]$mpirun -np 1 ./cpi
>Process 0 of 1 is on maya-usr1
>pi is approximately 3.1415926544231341, Error is 0.0000000008333410
>wall clock time = 0.000093
>
>There is no error.
>
>Zhifeng
>
>
>On Mon, Nov 19, 2018 at 10:33 AM Zhou, Hui <zhouh at anl.gov> wrote:
>
>> On Mon, Nov 19, 2018 at 10:14:54AM -0500, Zhifeng Yang wrote:
>> >Thank you for helping me with this error. Actually, real.exe is part of
>> >a very large weather model. It is very difficult to extract it or
>> >duplicate the error in a simple Fortran code, since I am not sure where
>> >the problem is. In fact, I can barely follow the discussion so far. I do
>> >not even know what "_get_addr" is. Is it related to MPI?
>>
>> It is difficult to pinpoint the problem without reproducing it.
>>
>> Anyway, let's start with mpirun. What is your output if you try:
>>
>>     mpirun --version
>>
>> Next, what is your MPICH version? If you built MPICH, locate the `cpi`
>> program in the examples folder and try `./cpi` and `mpirun ./cpi`. Do
>> you get any errors?
>>
>> --
>> Hui Zhou
>>


