[mpich-discuss] [petsc-dev] Is mpich/master:a8a2b30fd21 tested with Petsc?

Eric Chamberland Eric.Chamberland at giref.ulaval.ca
Tue Mar 27 21:09:39 CDT 2018


Hi Min and Matthew,

In fact, I just ran a "hello world" on 2 processes and it works. I do 
not have a more complicated example without PETSc, since my source 
code is PETSc-based...
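
For what it's worth, the "hello world" I ran is essentially the 
classic two-process check below (a minimal sketch of it; run with 
mpiexec -n 2):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
      int rank, size;
      MPI_Init(&argc, &argv);               /* start up MPI */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's rank */
      MPI_Comm_size(MPI_COMM_WORLD, &size); /* total process count */
      printf("Hello world from rank %d of %d\n", rank, size);
      MPI_Finalize();
      return 0;
    }

Both ranks printed their line and exited cleanly, so basic launch and 
communicator setup seem fine with this build.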

However, I just tried launching "make testing" in the mpich directory; 
it ended with some failed tests and took a very long time: about an 
hour. Is that normal?

Please see the attached file: summary.xml

Thanks,

Eric





On 27/03/18 05:04 PM, Matthew Knepley wrote:
> On Tue, Mar 27, 2018 at 4:59 PM, Min Si <msi at anl.gov> wrote:
>
>     Hi Eric,
>
>     It would be great if you could give us a simple MPI program (not
>     using PETSc) that reproduces this issue. If the problem happens
>     only when PETSc is involved, the PETSc team can give you more
>     suggestions.
>
>
> Hi Min,
>
> It is really easy to run PETSc at ANL. I am sure one of us can help if 
> you cannot reproduce this bug on your own.
>
>   Thanks,
>
>      Matt
>
>     Thanks,
>     Min
>
>
>     On 2018/03/27 15:38, Eric Chamberland wrote:
>
>         Hi,
>
>         for more than 2 weeks now, the master branch of mpich has
>         still been hanging, and it can be reproduced with a simple
>         "make test" after a fresh installation of PETSc...
>
>         Is anyone testing it?
>
>         Is it supposed to be working?
>
>         Please just tell me if I should "follow" another mpich branch.
>
>         Thanks,
>
>         Eric
>
>
>         On 14/03/18 03:35 AM, Eric Chamberland wrote:
>
>             Hi,
>
>             FWIW, the current mpich/master branch doesn't pass the
>             PETSc "make test" after a fresh installation... It hangs
>             just after the 1 MPI process test, meaning it is stuck
>             in the 2 process test:
>
>             make
>             PETSC_DIR=/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/mpich-3.x-debug/petsc-3.8.3-debug
>             PETSC_ARCH=arch-linux2-c-debug test
>             Running test examples to verify correct installation
>             Using
>             PETSC_DIR=/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/mpich-3.x-debug/petsc-3.8.3-debug
>             and PETSC_ARCH=arch-linux2-c-debug
>             C/C++ example src/snes/examples/tutorials/ex19 run
>             successfully with 1 MPI process
>
>
>
>
>             ^Cmakefile:151: recipe for target 'test' failed
>             make: [test] Interrupt (ignored)
>
>             thanks,
>
>             Eric
>
>             On 13/03/18 08:07 AM, Eric Chamberland wrote:
>
>
>                 Hi,
>
>                 each night we test mpich/master with our
>                 PETSc-based code. I don't know if the PETSc team is
>                 doing the same thing with mpich/master? (Maybe it
>                 would be a good idea?)
>
>                 Everything was fine (except the issue
>                 https://github.com/pmodels/mpich/issues/2892) up to
>                 commit 7b8d64debd, but since commit
>                 mpich:a8a2b30fd21, I get a segfault in any parallel
>                 nightly test.
>
>                 For example, in a 2 process test the two ranks end
>                 up at different execution points:
>
>                 rank 0:
>
>                 #003: /lib64/libpthread.so.0(+0xf870) [0x7f25bf908870]
>                 #004:
>                 /pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/BIB/bin/BIBMEFGD.opt()
>                 [0x64a788]
>                 #005: /lib64/libc.so.6(+0x35140) [0x7f25bca18140]
>                 #006: /lib64/libc.so.6(__poll+0x2d) [0x7f25bcabfbfd]
>                 #007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9)
>                 [0x7f25bd90ccc9]
>                 #008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c)
>                 [0x7f25bd91255c]
>                 #009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xba657)
>                 [0x7f25bd7e2657]
>                 #010:
>                 /opt/mpich-3.x_debug/lib/libmpi.so.0(PMPI_Waitall+0xe3)
>                 [0x7f25bd7e3343]
>                 #011:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscGatherMessageLengths+0x654)
>                 [0x7f25c4bb3193]
>                 #012:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x859)
>                 [0x7f25c4e82d7f]
>                 #013:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684)
>                 [0x7f25c4e4d055]
>                 #014:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688)
>                 [0x7f25c4e01a39]
>                 #015:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179)
>                 [0x7f25c4e020f6]
>
>                 rank 1:
>
>                 #002:
>                 /pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/GIREF/lib/libgiref_opt_Util.so(traitementSignal+0x2bd0)
>                 [0x7f62df8e7310]
>                 #003: /lib64/libc.so.6(+0x35140) [0x7f62d3bc9140]
>                 #004: /lib64/libc.so.6(__poll+0x2d) [0x7f62d3c70bfd]
>                 #005: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9)
>                 [0x7f62d4abdcc9]
>                 #006: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c)
>                 [0x7f62d4ac355c]
>                 #007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12c9c5)
>                 [0x7f62d4a059c5]
>                 #008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12e102)
>                 [0x7f62d4a07102]
>                 #009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf17a1)
>                 [0x7f62d49ca7a1]
>                 #010: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3facf)
>                 [0x7f62d4918acf]
>                 #011: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d)
>                 [0x7f62d4918c3d]
>                 #012: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf18d8)
>                 [0x7f62d49ca8d8]
>                 #013: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fb88)
>                 [0x7f62d4918b88]
>                 #014: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d)
>                 [0x7f62d4918c3d]
>                 #015:
>                 /opt/mpich-3.x_debug/lib/libmpi.so.0(MPI_Barrier+0x27b)
>                 [0x7f62d4918edb]
>                 #016:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscCommGetNewTag+0x3ff)
>                 [0x7f62dbceb055]
>                 #017:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscObjectGetNewTag+0x15d)
>                 [0x7f62dbceaadb]
>                 #018:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreateCommon_PtoS+0x1ee)
>                 [0x7f62dc03625c]
>                 #019:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x29c4)
>                 [0x7f62dc035eea]
>                 #020:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684)
>                 [0x7f62dbffe055]
>                 #021:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688)
>                 [0x7f62dbfb2a39]
>                 #022:
>                 /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179)
>                 [0x7f62dbfb30f6]
>
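>                 In case a standalone test would help: below is a
>                 sketch of the kind of pattern I could try to extract
>                 as a reproducer, judging from these traces (a guess
>                 at the shape of the exchange, not PETSc's actual
>                 code): every rank trades one "message length"
>                 integer with every other rank via Isend/Irecv +
>                 Waitall, then enters a Barrier, in a loop.
>
>                 #include <mpi.h>
>                 #include <stdio.h>
>                 #include <stdlib.h>
>
>                 int main(int argc, char **argv)
>                 {
>                   int rank, size, i, iter, nreq;
>                   MPI_Init(&argc, &argv);
>                   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>                   MPI_Comm_size(MPI_COMM_WORLD, &size);
>                   int *slen = malloc(size * sizeof(int));
>                   int *rlen = malloc(size * sizeof(int));
>                   MPI_Request *req =
>                     malloc(2 * size * sizeof(MPI_Request));
>                   /* repeat the exchange many times, as a long run
>                      of scatter-creation-like setups would */
>                   for (iter = 0; iter < 10000; iter++) {
>                     nreq = 0;
>                     for (i = 0; i < size; i++) {
>                       if (i == rank) continue;
>                       slen[i] = iter; /* dummy "length" payload */
>                       MPI_Irecv(&rlen[i], 1, MPI_INT, i, 0,
>                                 MPI_COMM_WORLD, &req[nreq++]);
>                       MPI_Isend(&slen[i], 1, MPI_INT, i, 0,
>                                 MPI_COMM_WORLD, &req[nreq++]);
>                     }
>                     MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
>                     MPI_Barrier(MPI_COMM_WORLD);
>                   }
>                   if (rank == 0)
>                     printf("done: %d iterations\n", iter);
>                   free(slen); free(rlen); free(req);
>                   MPI_Finalize();
>                   return 0;
>                 }
>
>                 If the regression is in the progress engine, a loop
>                 like this might hang the same way; if it runs clean,
>                 the problem is more specific to what PETSc does.
>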
>                 Have other users (PETSc users?) reported this
>                 problem?
>
>                 Thanks,
>
>                 Eric
>
>                 PS: the usual information:
>
>                 mpich logs:
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.system
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpich_version.txt
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_c.txt
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_m.txt
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mi.txt
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_openmpa_config.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpl_config.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_config.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_tools_topo_config.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpiexec_info.txt
>
>
>                 Petsc logs:
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_configure.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_make.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_default.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_RDict.log
>                 http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_CMakeLists.txt
>
>
>
>
>
> -- 
> What most experimenters take for granted before they begin their 
> experiments is infinitely more interesting than any results to which 
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: summary.xml
Type: text/xml
Size: 160177 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20180327/7d8a1ed4/attachment.xml>