[mpich-discuss] Why stuck in MPI_Finalize?

Erik Schnetter schnetter at gmail.com
Mon Nov 29 12:30:49 CST 2021


I have a Julia test case on macOS where MPICH randomly gets stuck in
MPI_Finalize (with about a 5% chance). See e.g.
https://github.com/JuliaParallel/MPI.jl/runs/4357341818

Can you advise under what circumstances MPICH could get stuck there?
The run in question uses 3 processes, and all 3 processes call
MPI_Finalize, but none of them returns.

I assume that MPI_Finalize internally performs the equivalent of an
MPI_Barrier, but that should succeed here. Are there other actions
taken in MPI_Finalize that require some kind of consistent state
across the application? For example, if a communicator was created on
all processes but freed only on some of them, could that cause such
a deadlock?
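
To make that last question concrete, here is a minimal hypothetical C
sketch of the pattern I mean (it is not the actual Julia test case,
just the kind of inconsistency I am asking about):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Communicator created on all ranks (collective call). */
    MPI_Comm dup;
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);

    /* ... but freed only on some ranks before finalizing. */
    if (rank == 0)
        MPI_Comm_free(&dup);

    printf("rank %d entering MPI_Finalize\n", rank);
    fflush(stdout);

    MPI_Finalize();   /* could this kind of inconsistency make it hang? */

    printf("rank %d returned from MPI_Finalize\n", rank);
    return 0;
}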

-erik

-- 
Erik Schnetter <schnetter at gmail.com>
http://www.perimeterinstitute.ca/personal/eschnetter/
