[mpich-discuss] possible thread bug with MPI_Reduce/MPI_Allreduce

Burlen Loring burlen.loring at gmail.com
Wed Jul 12 13:29:04 CDT 2023


Hi All,

I'm using MPICH 4.0.2 from the Fedora 37 package manager for
development. From an MPI-parallel simulation I spawn a thread that
performs a number of reductions (MPI_Allreduce and MPI_Reduce) using the
MPI_IN_PLACE option. The results are written with POSIX I/O from rank 0.
The simulation continues, and can launch the next set of reductions before
the previous ones have completed. I called MPI_Init_thread, requesting
MPI_THREAD_MULTIPLE, and received that level of support.

However, when multiple threads overlap (in test runs, 3-4 threads run
concurrently), both MPI_Allreduce and MPI_Reduce calls can produce
incorrect results. If instead I serialize the threads by waiting on them
before returning to the simulation, the results are correct. The results
are also correct if I put a mutex around my MPI_Allreduce/MPI_Reduce
sections. I think that MPI_Reduce/MPI_Allreduce is not thread safe.

Is this a known issue? Could it be an MPICH build/configure setting that
was not set correctly by the Fedora package maintainer?

Thanks
Burlen