[mpich-discuss] Error with mpich-3.3.2 and ucx-1.8.0
Raffenetti, Kenneth J.
raffenet at mcs.anl.gov
Thu Mar 4 11:37:24 CST 2021
Hi Junchao,
We do not have any plans to provide another MPICH release in the 3.3.x series. It is good to know this bug is fixed in the latest version of MPICH.
Ken
On 3/2/21, 8:17 PM, "Junchao Zhang via discuss" <discuss at mpich.org> wrote:
Hello,
We encountered an error with mpich-3.3.2 and ucx-1.8.0. See the attached example, which uses a user-defined datatype in MPI_Startall. MPICH was configured with --with-device=ch4:ucx --with-ucx=/path/to/ucx-1.8.0. The error stack is:
$ mpirun -n 2 ./dtype
Assertion failed in file src/mpi/datatype/type_free.c at line 38: (((datatype_ptr)))->ref_count >= 0
/home/jczhang/soft/lib/libmpi.so.12(+0x48fa1f) [0x7f4019b6ba1f]
/home/jczhang/soft/lib/libmpi.so.12(MPL_backtrace_show+0x18) [0x7f4019b6baff]
/home/jczhang/soft/lib/libmpi.so.12(+0x441e27) [0x7f4019b1de27]
/home/jczhang/soft/lib/libmpi.so.12(+0xfebc2) [0x7f40197dabc2]
/home/jczhang/soft/lib/libmpi.so.12(MPI_Type_free+0x5fd) [0x7f40197db239]
The error does not occur with mpich-3.4.1.
--Junchao Zhang
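For anyone trying to reproduce the setup described in the report, the build configuration can be sketched roughly as below. Only the two --with-* options are taken from the message; the source directory name and the install prefix are assumptions made for illustration. The mpichversion and ucx_info calls at the end are just a way to confirm which versions a given installation actually uses before testing against the affected release.

```shell
# Sketch only: the source directory and the install prefix are assumed;
# the two --with-* options are the ones quoted in the report.
cd mpich-3.3.2
./configure --prefix="$HOME/soft" \
    --with-device=ch4:ucx \
    --with-ucx=/path/to/ucx-1.8.0
make && make install

# Confirm which versions are in use before running the test program.
"$HOME/soft/bin/mpichversion"          # MPICH version and configure options
/path/to/ucx-1.8.0/bin/ucx_info -v     # UCX version
```

Because no further 3.3.x release is planned, checking the reported version this way is mainly useful for deciding whether an installation needs to move to a newer MPICH release.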