[mpich-discuss] Building CUDA-aware MPICH fails on Ubuntu 20.04

Raffenetti, Kenneth J. raffenet at mcs.anl.gov
Wed Jun 2 16:04:28 CDT 2021


Hi Omlin,

This is a known issue that has been fixed, but the fix was unfortunately not included in MPICH 3.4.2. On Ubuntu, /bin/sh is dash, which rejects a bash-style substitution used in the script. As a workaround, you could make /bin/sh link to /bin/bash on your system, or you could apply this patch to /home/omlins/soft/mpich-3.4.2/modules/yaksa/src/backend/cuda/cudalt.sh:

  https://github.com/pmodels/yaksa/pull/181/commits/eed193d9775dd0f33cbd8caa0dd946647b751b18
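
To check whether a given system is affected before changing anything, a minimal sketch along these lines may help. It reports what /bin/sh resolves to and probes whether it accepts a bash-style pattern substitution. The probe expression and the variable names sh_target and sh_status are illustrative only and are not taken from cudalt.sh.

```shell
#!/bin/sh
# Report which binary /bin/sh resolves to (dash on a default Ubuntu install).
sh_target=$(readlink -f /bin/sh)
echo "/bin/sh -> $sh_target"

# Probe /bin/sh with a bash-style pattern substitution.
# dash exits non-zero with "Bad substitution"; bash accepts it.
# The probe string is an illustration, not the actual line from cudalt.sh.
if /bin/sh -c 'x=a.b; : "${x//./_}"' 2>/dev/null; then
    sh_status="accepts bash-style substitution"
else
    sh_status="rejects bash-style substitution"
fi
echo "/bin/sh $sh_status"

# Workaround from this message (run manually, needs root, affects the whole system):
#   sudo ln -sf /bin/bash /bin/sh
```

If the probe reports "rejects", the build is likely to hit the same error until /bin/sh points to bash or the patch is applied. On Ubuntu, "sudo dpkg-reconfigure dash" is another way to switch /bin/sh between dash and bash.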

Ken

On 6/2/21, 12:54 PM, "Omlin Samuel via discuss" <discuss at mpich.org> wrote:

    Dear all,
    
    Building CUDA-aware MPICH fails for me after a few seconds on Ubuntu 20.04 when I follow the MPICH installation guide (CUDA Driver Version: 465.19.01,
    CUDA Version: 11.3).
    
    Reproducer:
    mkdir ~/soft
    cd ~/soft
    tar -xzf mpich-3.4.2.tar.gz
    mkdir /tmp/mpich-3.4.2
    cd /tmp/mpich-3.4.2
    ~/soft/mpich-3.4.2/configure --with-cuda=/usr/local/cuda --with-device=ch4:ucx 2>&1 | tee configure.log
    make 2>&1 | tee make.log
    
    
    Error:
    /home/omlins/soft/mpich-3.4.2/modules/yaksa/src/backend/cuda/cudalt.sh: 35: Bad substitution
    make[2]: *** [Makefile:8697: src/backend/cuda/pup/yaksuri_cudai_pup_hvector__Bool.lo] Error 2
    make[2]: Leaving directory '/tmp/mpich-3.4.2/modules/yaksa'
    make[1]: *** [Makefile:43560: all-recursive] Error 1
    make[1]: Leaving directory '/tmp/mpich-3.4.2'
    make: *** [Makefile:11141: all] Error 2
    
    
    
    The log files created with tee, 'configure.log' and 'make.log', as well as 'config.log', are attached.
    
    Am I missing something or is there an issue with MPICH here?
    
    
    
    Thanks a lot in advance for helping me solve this issue!
    
    Cheers,
    
    Sam
    
    
    --
    Samuel Omlin, PhD
    Computational Scientist
    CSCS - Swiss National Supercomputing Centre
    ETH Zurich
    Via Trevano 131
    CH-6900 Lugano
    Switzerland


