<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Eric,<br>
<br>
    One hour is normal for running the entire MPICH test suite. It looks
    like your failures are in either dynamic-process (e.g., spawn) tests
    or fault-tolerance tests. They should be unrelated to the PETSc
    issue.<br>
<br>
Min<br>
<div class="moz-cite-prefix">On 2018/03/27 21:09, Eric Chamberland
wrote:<br>
</div>
<blockquote type="cite" cite="mid:7c2ecc9c-184f-5ae6-18bb-bf595a7df19f@giref.ulaval.ca">
<p>Hi Min and Matthew,</p>
      <p>In fact, I just ran a "hello world" program on 2 processes and
        it works. I do not have a more complicated example without PETSc,
        since our source code is PETSc-based...</p>
      <p>However, I just tried running "make testing" in the mpich
        directory. It ended with some failed tests and took a long
        time: about an hour. Is that normal?<br>
</p>
      <p>Please see the attached file: summary.xml</p>
<p>Thanks,</p>
<p>Eric</p>
<p><br>
</p>
<br>
<p><br>
</p>
<br>
<div class="moz-cite-prefix">On 27/03/18 05:04 PM, Matthew Knepley
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAMYG4G=_U5HVuPKNCyX2Vu+EpuLUWaRgQFKsNabpzZV2jJ97Zw@mail.gmail.com">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Tue, Mar 27, 2018 at 4:59 PM,
                Min Si <span dir="ltr">&lt;<a href="mailto:msi@anl.gov" target="_blank" moz-do-not-send="true">msi@anl.gov</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Hi
Eric,<br>
<br>
                  It would be great if you could give us a simple MPI
                  program (without PETSc) that reproduces this issue. If
                  the problem happens only when PETSc is involved,
                  the PETSc team can give you more suggestions.<br>
</blockquote>
<div><br>
</div>
<div>Hi Min,</div>
<div><br>
</div>
<div>It is really easy to run PETSc at ANL. I am sure one
of us can help if you cannot reproduce this bug on your
own.</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
Thanks,<br>
Min
<div>
<div class="h5"><br>
<br>
On 2018/03/27 15:38, Eric Chamberland wrote:<br>
</div>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div class="h5"> Hi,<br>
<br>
                        It has been more than 2 weeks and the master branch
                        of mpich still has this problem, which can be
                        reproduced with a simple "make test" after a fresh
                        installation of PETSc...<br>
<br>
Is anyone testing it?<br>
<br>
Is it supposed to be working?<br>
<br>
                        Please just tell me if I should "follow" another
                        mpich branch.<br>
<br>
Thanks,<br>
<br>
Eric<br>
<br>
<br>
On 14/03/18 03:35 AM, Eric Chamberland wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0
0 .8ex;border-left:1px #ccc
solid;padding-left:1ex"> Hi,<br>
<br>
                          FWIW, the current mpich/master branch doesn't
                          pass the PETSc "make test" after a fresh
                          installation... It hangs just after the 1 MPI
                          process test, meaning it is stuck in the
                          2-process test:<br>
<br>
make PETSC_DIR=/pmi/cmpbib/compilat<wbr>ion_BIB_dernier_mpich/COMPILE_<wbr>AUTO/mpich-3.x-debug/petsc-3.<wbr>8.3-debug
PETSC_ARCH=arch-linux2-c-debug test<br>
Running test examples to verify correct
installation<br>
Using PETSC_DIR=/pmi/cmpbib/compilat<wbr>ion_BIB_dernier_mpich/COMPILE_<wbr>AUTO/mpich-3.x-debug/petsc-3.<wbr>8.3-debug
and PETSC_ARCH=arch-linux2-c-debug<br>
C/C++ example src/snes/examples/tutorials/ex<wbr>19
run successfully with 1 MPI process<br>
<br>
<br>
<br>
<br>
^Cmakefile:151: recipe for target 'test' failed<br>
make: [test] Interrupt (ignored)<br>
<br>
thanks,<br>
<br>
Eric<br>
<br>
On 13/03/18 08:07 AM, Eric Chamberland wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex"> <br>
Hi,<br>
<br>
                            each night we test mpich/master with
                            our PETSc-based code. I don't know whether the
                            PETSc team does the same with
                            mpich/master. (Maybe it would be a good idea?)<br>
<br>
Everything was fine (except the issue <a href="https://github.com/pmodels/mpich/issues/2892" rel="noreferrer" target="_blank" moz-do-not-send="true">https://github.com/pmodels/mpi<wbr>ch/issues/2892</a>)
                            up to commit 7b8d64debd, but since commit
                            mpich:a8a2b30fd21, I have had a segfault on any
                            parallel nightly test.<br>
<br>
                            For example, a 2-process test ends at close
                            but different execution points on the two ranks:<br>
<br>
rank 0:<br>
<br>
#003: /lib64/libpthread.so.0(+0xf870<wbr>)
[0x7f25bf908870]<br>
#004: /pmi/cmpbib/compilation_BIB_de<wbr>rnier_mpich/COMPILE_AUTO/BIB/<wbr>bin/BIBMEFGD.opt()
[0x64a788]<br>
#005: /lib64/libc.so.6(+0x35140)
[0x7f25bca18140]<br>
#006: /lib64/libc.so.6(__poll+0x2d)
[0x7f25bcabfbfd]<br>
#007: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1e4cc9)
[0x7f25bd90ccc9]<br>
#008: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1ea55c)
[0x7f25bd91255c]<br>
#009: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0xba657)
[0x7f25bd7e2657]<br>
#010: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(PMPI_Waitall+0xe3)
[0x7f25bd7e3343]<br>
#011: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(P<wbr>etscGatherMessageLengths+0x654<wbr>)
[0x7f25c4bb3193]<br>
#012: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate_PtoS+0x859)
[0x7f25c4e82d7f]<br>
#013: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate+0x5684)
[0x7f25c4e4d055]<br>
#014: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhostWithArray+0x688)
[0x7f25c4e01a39]<br>
#015: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhost+0x179)
[0x7f25c4e020f6]<br>
<br>
rank 1:<br>
<br>
#002: /pmi/cmpbib/compilation_BIB_de<wbr>rnier_mpich/COMPILE_AUTO/GIREF<wbr>/lib/libgiref_opt_Util.so(<wbr>traitementSignal+0x2bd0)
[0x7f62df8e7310]<br>
#003: /lib64/libc.so.6(+0x35140)
[0x7f62d3bc9140]<br>
#004: /lib64/libc.so.6(__poll+0x2d)
[0x7f62d3c70bfd]<br>
#005: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1e4cc9)
[0x7f62d4abdcc9]<br>
#006: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1ea55c)
[0x7f62d4ac355c]<br>
#007: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x12c9c5)
[0x7f62d4a059c5]<br>
#008: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x12e102)
[0x7f62d4a07102]<br>
#009: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0xf17a1)
[0x7f62d49ca7a1]<br>
#010: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3facf)
[0x7f62d4918acf]<br>
#011: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3fc3d)
[0x7f62d4918c3d]<br>
#012: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0xf18d8)
[0x7f62d49ca8d8]<br>
#013: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3fb88)
[0x7f62d4918b88]<br>
#014: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3fc3d)
[0x7f62d4918c3d]<br>
#015: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(MPI_Barrier+0x27b)
[0x7f62d4918edb]<br>
#016: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(P<wbr>etscCommGetNewTag+0x3ff)
[0x7f62dbceb055]<br>
#017: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(P<wbr>etscObjectGetNewTag+0x15d)
[0x7f62dbceaadb]<br>
#018: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreateCommon_PtoS+0x1<wbr>ee)
[0x7f62dc03625c]<br>
#019: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate_PtoS+0x29c4)
[0x7f62dc035eea]<br>
#020: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate+0x5684)
[0x7f62dbffe055]<br>
#021: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhostWithArray+0x688)
[0x7f62dbfb2a39]<br>
#022: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhost+0x179)
[0x7f62dbfb30f6]<br>
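                            <br>
                            A minimal sketch, assuming each frame sits on
                            one line in the "#NNN: library(symbol+0xoffset)
                            [address]" form used in the two traces above
                            (the mail client wrapped them here), of how to
                            pick out where each rank is blocked. The names
                            FRAME_RE, named_symbols and blocking_mpi_call
                            are hypothetical, and the sample frames are a
                            subset of the traces above. It reports the
                            innermost named MPI call and the next named
                            frame above it:<br>
<pre>
```python
import re

# One frame: "#010: /path/libmpi.so.0(PMPI_Waitall+0xe3) [0x7f25bd7e3343]"
# Group 1 = frame number, group 2 = library path, group 3 = symbol (optional).
FRAME_RE = re.compile(r"#(\d+):\s+(\S+?)\(([A-Za-z_]\w*)?(?:\+0x[0-9a-f]+)?\)")

# Subset of the rank 0 frames quoted in this message, one frame per line.
RANK0 = """\
#006: /lib64/libc.so.6(__poll+0x2d) [0x7f25bcabfbfd]
#007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f25bd90ccc9]
#010: /opt/mpich-3.x_debug/lib/libmpi.so.0(PMPI_Waitall+0xe3) [0x7f25bd7e3343]
#011: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscGatherMessageLengths+0x654) [0x7f25c4bb3193]
"""

# Subset of the rank 1 frames quoted in this message, one frame per line.
RANK1 = """\
#004: /lib64/libc.so.6(__poll+0x2d) [0x7f62d3c70bfd]
#005: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f62d4abdcc9]
#015: /opt/mpich-3.x_debug/lib/libmpi.so.0(MPI_Barrier+0x27b) [0x7f62d4918edb]
#016: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscCommGetNewTag+0x3ff) [0x7f62dbceb055]
"""


def named_symbols(trace):
    """Return the symbol of every frame that has one, innermost frame first."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(3):
            names.append(match.group(3))
    return names


def blocking_mpi_call(trace):
    """Return (mpi_call, next_named_frame) for the innermost MPI frame, or None."""
    names = named_symbols(trace)
    for index, name in enumerate(names):
        if name.startswith(("MPI_", "PMPI_")):
            caller = names[index + 1] if index + 1 != len(names) else None
            return name, caller
    return None


if __name__ == "__main__":
    for rank, trace in (("rank 0", RANK0), ("rank 1", RANK1)):
        print(rank, blocking_mpi_call(trace))
```
</pre>
                            On these frames, rank 0 is reported inside
                            PMPI_Waitall below PetscGatherMessageLengths,
                            and rank 1 inside MPI_Barrier below
                            PetscCommGetNewTag, so the two ranks are waiting
                            in different MPI calls. Frames without a symbol
                            name (the bare libmpi.so offsets) are skipped,
                            so the second name is the next named frame, not
                            necessarily the direct caller.<br>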
<br>
                            Have any other users (PETSc users?) reported
                            this problem?<br>
<br>
Thanks,<br>
<br>
Eric<br>
<br>
                            PS: the usual information:<br>
<br>
mpich logs:<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_config.log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.system" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_config.system</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpich_version.txt" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mpich_version.txt</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_c.txt" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_c.txt</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_m.txt" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_m.txt</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_mi.txt" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mi.txt</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_openmpa_config.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_openmpa_config.<wbr>log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpl_config.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mpl_config.log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_config.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_pm_hydra_config.<wbr>log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_tools_topo_config.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_pm_hydra_tools_<wbr>topo_config.log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpiexec_info.txt" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mpiexec_info.txt</a>
<br>
<br>
                            PETSc logs:<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_configure.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_configure.log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_make.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_make.log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_default.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_default.log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_RDict.log" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_RDict.log</a>
<br>
<a href="http://www.giref.ulaval.ca/%7Ecmpgiref/dernier_mpich/2018.03.12.05h39m54s_CMakeLists.txt" rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_CMakeLists.txt</a>
<br>
<br>
<br>
</blockquote>
<br>
</blockquote>
</div>
</div>
______________________________<wbr>_________________<br>
discuss mailing list <a href="mailto:discuss@mpich.org" target="_blank" moz-do-not-send="true">discuss@mpich.org</a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss" rel="noreferrer" target="_blank" moz-do-not-send="true">https://lists.mpich.org/mailma<wbr>n/listinfo/discuss</a><br>
</blockquote>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div class="gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters take for granted before
they begin their experiments is infinitely more
interesting than any results to which their
experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a href="http://www.caam.rice.edu/%7Emk51/" target="_blank" moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>