[mpich-discuss] Error with mpich-3.3.2 and ucx-1.8.0
Raffenetti, Kenneth J. raffenet at mcs.anl.gov
Thu Mar  4 11:37:24 CST 2021

Hi Junchao,

We do not have any plans to provide another MPICH release in the 3.3.x series. It is good to know this bug is fixed in the latest version of MPICH.

Ken

On 3/2/21, 8:17 PM, "Junchao Zhang via discuss" <discuss at mpich.org> wrote:
    Hello, we encountered an error with mpich-3.3.2 and ucx-1.8.0. See the attached example, which uses a user-defined datatype with MPI_Startall. MPICH was configured with --with-device=ch4:ucx --with-ucx=/path/to/ucx-1.8.0. The error stack is:
    
    $ mpirun -n 2 ./dtype
    Assertion failed in file src/mpi/datatype/type_free.c at line 38: (((datatype_ptr)))->ref_count >= 0
    /home/jczhang/soft/lib/libmpi.so.12(+0x48fa1f) [0x7f4019b6ba1f]
    /home/jczhang/soft/lib/libmpi.so.12(MPL_backtrace_show+0x18) [0x7f4019b6baff]
    /home/jczhang/soft/lib/libmpi.so.12(+0x441e27) [0x7f4019b1de27]
    /home/jczhang/soft/lib/libmpi.so.12(+0xfebc2) [0x7f40197dabc2]
    /home/jczhang/soft/lib/libmpi.so.12(MPI_Type_free+0x5fd) [0x7f40197db239]
    
    
    
    It does not happen with mpich-3.4.1.
     
    --Junchao Zhang
    
    