[mpich-discuss] Recommended configure options for MPICH 4.3.x with Valgrind or address-sanitizer

Eric Chamberland Eric.Chamberland at giref.ulaval.ca
Thu Oct 2 08:33:20 CDT 2025


Hi,

I have been building MPICH with the following configure options for a 
long time, mainly to keep my code “Valgrind-clean”:

===
./configure \
   --enable-g=dbg,meminit \
   --with-device=ch3:sock \
   --enable-romio

===


This setup worked reasonably well in the past, but recently I have been 
seeing occasional errors under AddressSanitizer or Valgrind (with 4.3.0 
on a single node), such as:


===

Fatal error in internal_Allreduce_c: Unknown error class, error stack:

internal_Allreduce_c(347)...................: 
MPI_Allreduce_c(sendbuf=0x7ffdeb0b8e90, recvbuf=0x7ffdeb0b8e98, count=1, 
dtype=0x4c00083a, MPI_SUM, comm=0x84000003) failed

MPIR_Allreduce_impl(4826)...................:

MPIR_Allreduce_allcomm_auto(4732)...........:

MPIR_Allreduce_intra_recursive_doubling(115):

MPIC_Sendrecv(266)..........................:

MPIC_Wait(90)...............................:

MPIR_Wait(751)..............................:

MPIR_Wait_state(708)........................:

MPIDI_CH3i_Progress_wait(187)...............: an error occurred while 
handling an event returned by MPIDI_CH3I_Sock_Wait()

MPIDI_CH3I_Progress_handle_sock_event(385)..:

MPIDI_CH3I_Socki_handle_read(3647)..........: connection failure 
(set=0,sock=1,errno=104:Connection reset by peer)

===


Is CH3 considered legacy?


I would also like to ask:

  1. What are the recommended configure options in 2025 for building 
MPICH in a way that works well with Valgrind?

  2. Is it preferable now to move to CH4 (e.g. ch4:ofi or ch4:shm) when 
debugging with Valgrind?

  3. Are there any other options (besides --enable-g=dbg,meminit) that 
you would suggest for catching memory errors while keeping Valgrind 
reports as clean as possible?


  4. Is 
https://github.com/pmodels/mpich/blob/main/doc/wiki/design/Support_for_Debugging_Memory_Allocation.md 
still up to date?


Any guidance on the “best practice” configuration for this use case 
would be greatly appreciated.
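
For context, the way I run the code under Valgrind is roughly the 
following (the suppression file name is purely illustrative, and I am 
not attached to these particular flags):

===
# Launch 2 ranks, each wrapped in Valgrind (sketch only)
mpiexec -n 2 \
  valgrind --leak-check=full \
           --track-origins=yes \
           --suppressions=mpich-valgrind.supp \
           ./my_app
===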


The PETSc folks have some debug-related configure options 
(https://gitlab.com/petsc/petsc/-/blob/main/config/BuildSystem/config/packages/MPICH.py#L94) 
but still use CH3 by default.  However, Satish uses the configuration 
quoted below, at least for the Valgrind CI.
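
If CH4 is indeed the recommended route, I would probably start from 
something like the configure line below. This is only a guess on my 
part: ch4:ofi is taken from my question 2, and --enable-fast=no / 
--enable-error-messages=all are borrowed from Satish's configuration 
quoted below.

===
./configure \
   --enable-g=dbg,meminit \
   --enable-error-messages=all \
   --enable-fast=no \
   --with-device=ch4:ofi \
   --enable-romio
===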


Thanks a lot,


Eric

-- 

Eric Chamberland, ing., M. Ing

Professionnel de recherche

GIREF/Université Laval


On 2025-10-02 08:13, Balay, Satish wrote:
> We currently use:
>
> balay at petsc-gpu-02:~$ 
> /nfs/gce/projects/petsc/soft/u22.04/mpich-4.3.0-p2-ucx/bin/mpichversion
> MPICH Version:      4.3.0
> MPICH Release date: Mon Feb  3 09:09:47 AM CST 2025
> MPICH ABI:          17:0:5
> MPICH Device:       ch4:ucx
> MPICH configure: 
>  --prefix=/nfs/gce/projects/petsc/soft/u22.04/mpich-4.3.0-p2-ucx 
> --enable-shared --with-device=ch4:ucx --with-pm=hydra --enable-fast=no 
> --enable-error-messages=all --enable-g=meminit --disable-java 
> --without-hwloc --disable-opencl --without-cuda --without-hip
> MPICH CC:           gcc     -O0
> MPICH CXX:          g++   -O0
> MPICH F77:          gfortran   -O0
> MPICH FC:           gfortran   -O0
> MPICH features:     threadcomm
>
> With:
>
>     #MPICH OFI/UCX/valgrind
>     export FI_PROVIDER=^psm3
>     export UCX_SYSV_HUGETLB_MODE=n
>     export UCX_LOG_LEVEL=error
>
> Satish