[mpich-discuss] [petsc-dev] Is mpich/master:a8a2b30fd21 tested with Petsc?
Min Si
msi at anl.gov
Thu Apr 19 09:51:34 CDT 2018
Hi Junchao,
This is a great idea. We will add large-tag tests to our test suite!
Min
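
A minimal sketch, in plain C with no MPI calls, of why tests that only use small tags can miss this kind of bug. It assumes a hypothetical allocator that hands out tags from the advertised upper bound downwards, as the quoted message below describes for PETSc. The names `tag_pool`, `tag_pool_init`, `tag_pool_next` and `tag_is_valid` are made up for illustration and are not PETSc or MPICH code. The failure mode it models (a library advertising a larger tag bound than it can really handle) is an assumption; the thread does not spell out the exact mechanism.

```c
#include <assert.h>

/* Hypothetical descending tag allocator (illustration only, not PETSc code). */
typedef struct {
    int next;   /* next tag to hand out */
    int floor;  /* lowest tag this pool may return */
} tag_pool;

/* Start handing out tags at the advertised upper bound. */
void tag_pool_init(tag_pool *p, int advertised_tag_ub, int floor)
{
    assert(p != 0);
    p->next = advertised_tag_ub;
    p->floor = floor;
}

/* Return the next tag, counting downwards; -1 once the pool is exhausted. */
int tag_pool_next(tag_pool *p)
{
    if (p->next < p->floor)
        return -1;
    return p->next--;
}

/* A tag is usable only if it lies in [0, usable_tag_ub]. */
int tag_is_valid(int tag, int usable_tag_ub)
{
    return tag >= 0 && tag <= usable_tag_ub;
}
```

With an assumed advertised bound of 0x7fffffff and an assumed truly usable bound of 0x0fffffff, the very first tag returned by `tag_pool_next` is out of range, while a small tag such as 5 is still valid. A test suite that only sends with small tags would therefore pass, and a descending allocator would hit the problem on its first message.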
On 2018/04/17 18:17, Junchao Zhang wrote:
> Min,
> I suggest that MPICH add tests that exercise the maximum MPI tag
> (obtained through the MPI_TAG_UB attribute).
> PETSc allocates tags starting from the maximum and working downwards.
> I guess the MPICH tests use small tags, which is why the bug only
> showed up with PETSc.
>
> --Junchao Zhang
>
> On Tue, Apr 17, 2018 at 3:58 PM, Min Si <msi at anl.gov> wrote:
>
> Hi all,
>
> Thanks for narrowing down the problem. I checked the MPICH code
> and believe this is a bug in MPICH. I just created a PR to fix it:
> https://github.com/pmodels/mpich/pull/3097
>
> It should be merged into the MPICH master branch soon.
>
> Thanks,
> Min
>
>
> On 2018/04/17 14:10, Eric Chamberland wrote:
>
> Hi,
>
> Are we talking about the "tag" passed to MPI_Isend, for example?
>
> Does that mean something has to change for every MPI call that
> takes a tag, or is this only a "bad" use of tags in PETSc?
>
> thanks Satish for your finding!
>
> Eric
>
> On 16/04/18 11:31 PM, Satish Balay wrote:
>
> On Tue, 13 Mar 2018, Eric Chamberland wrote:
>
> Hi,
>
> Each night we test mpich/master with our PETSc-based code. I don't
> know whether the PETSc team does the same with mpich/master
> (maybe it would be a good idea?).
>
> Everything was fine (except for the issue
> https://github.com/pmodels/mpich/issues/2892) up to
> commit 7b8d64debd, but since commit mpich:a8a2b30fd21 I get a
> segfault on any parallel nightly test.
>
>
> I attempted a bisect of the above range of commits - and
> narrowed down to:
>
>
> db11d4c4a70e39a28be88ed32f00542301699e08 is the first bad commit
> <<<<<<<
>
>
> balay at asterix /home/balay/soft/build/mpich ((db11d4c4a...)|BISECTING)
> $ git show db11d4c4a70e39a28be88ed32f00542301699e08
> commit db11d4c4a70e39a28be88ed32f00542301699e08 (HEAD, refs/bisect/bad)
> Author: Ken Raffenetti <raffenet at mcs.anl.gov>
> Date: Thu Feb 15 11:37:59 2018 -0600
>
> init: Fix tag upper limit initialization
> The starting point for this value is equivalent to the usable tag bits
> macro. This value should be set before device initialization,
> otherwise devices will assume they have more bits than are actually
> available.
> Signed-off-by: Wesley Bland <wesley.bland at intel.com>
>
> diff --git a/src/mpi/init/initthread.c b/src/mpi/init/initthread.c
> index cbc41f4d5..b31ae2f07 100644
> --- a/src/mpi/init/initthread.c
> +++ b/src/mpi/init/initthread.c
> @@ -403,7 +403,7 @@ int MPIR_Init_thread(int *argc, char ***argv, int required, int *provided)
>      MPIR_Process.attrs.host = MPI_PROC_NULL;
>      MPIR_Process.attrs.io = MPI_PROC_NULL;
>      MPIR_Process.attrs.lastusedcode = MPI_ERR_LASTCODE;
> -    MPIR_Process.attrs.tag_ub = 0;
> +    MPIR_Process.attrs.tag_ub = MPIR_TAG_USABLE_BITS;
>      MPIR_Process.attrs.universe = MPIR_UNIVERSE_SIZE_NOT_SET;
>      MPIR_Process.attrs.wtime_is_global = 0;
> @@ -531,13 +531,6 @@ int MPIR_Init_thread(int *argc, char ***argv, int required, int *provided)
>      MPIR_Assert(((unsigned) MPIR_Process.attrs.tag_ub & ((unsigned) MPIR_Process.attrs.tag_ub + 1)) == 0);
> -    /* Set aside tag space for tagged collectives and failure notification */
> -#ifdef HAVE_TAG_ERROR_BITS
> -    MPIR_Process.attrs.tag_ub >>= 3;
> -#else
> -    MPIR_Process.attrs.tag_ub >>= 1;
> -#endif
> -
>      /* Assert: tag_ub is at least the minimum asked for in the MPI spec */
>      MPIR_Assert(MPIR_Process.attrs.tag_ub >= 32767);
> <<<<<<<<<<<<<<<<<
>
> Reverting this patch gets mpich-3.3b2 working with PETSc.
>
> Satish
>
>
>
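
The arithmetic in the quoted diff can be sketched in standalone C. The helper names `is_low_bit_mask` and `reserve_tag_bits` are hypothetical, and the starting value 0x7fffffff used in the note after the code is an assumption chosen for illustration; the thread does not state the actual value of MPIR_TAG_USABLE_BITS.

```c
#include <assert.h>

/* True when v has the form 2^n - 1 (all low bits set). This mirrors the
 * check (tag_ub & (tag_ub + 1)) == 0 in the quoted MPIR_Assert. */
int is_low_bit_mask(unsigned v)
{
    return (v & (v + 1u)) == 0;
}

/* Set aside the top `reserved_bits` bits of a non-negative tag mask by
 * shifting right, as the removed lines did with ">>= 3" or ">>= 1". */
int reserve_tag_bits(int tag_ub, int reserved_bits)
{
    assert(tag_ub >= 0 && reserved_bits >= 0 && reserved_bits < 31);
    return tag_ub >> reserved_bits;
}
```

Starting from the assumed mask 0x7fffffff, `reserve_tag_bits(0x7fffffff, 3)` gives 0x0fffffff and `reserve_tag_bits(0x7fffffff, 1)` gives 0x3fffffff. Both results still pass `is_low_bit_mask` and stay above the 32767 minimum checked by the last assertion in the diff, so the assertions alone cannot tell whether the reserved bits were actually removed from the advertised bound.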