[mpich-discuss] possible thread bug with MPI_Reduce/MPI_Allreduce
Burlen Loring
burlen.loring at gmail.com
Wed Jul 12 13:29:04 CDT 2023
Hi All,
I'm using MPICH 4.0.2 from the Fedora 37 package manager for
development. From an MPI-parallel simulation I spawn a thread that
performs a number of reductions (MPI_Allreduce and MPI_Reduce) with the
MPI_IN_PLACE option. The results are written with POSIX I/O from rank 0.
The simulation continues and can launch the next set of reductions before
the previous ones have completed. I call MPI_Init_thread requesting
MPI_THREAD_MULTIPLE, and that level is reported as provided.
However, when multiple threads overlap (in test runs, 3-4 threads run
concurrently), both MPI_Allreduce and MPI_Reduce calls can produce
incorrect results. If I instead serialize the threads by waiting on each
one before returning to the simulation, the results are correct. The
results are also correct if I put a mutex around my
MPI_Allreduce/MPI_Reduce sections. This makes me think that
MPI_Reduce/MPI_Allreduce are not thread safe.
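
A minimal sketch of the mutex workaround described above, not the actual
simulation code. The names reduce_stand_in, locked_reduce, worker, and
run_overlapping are hypothetical, and the real MPI call is replaced by a
stand-in (the call it represents is shown in a comment) so that the sketch
builds with pthreads alone. The stand-in only counts how many threads are
inside the reduction section at once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 16

/* One lock per process, shared by every thread that performs reductions.
 * It is local to the process and does not by itself order reductions
 * across ranks. */
static pthread_mutex_t reduce_lock = PTHREAD_MUTEX_INITIALIZER;

static atomic_int in_flight = 0;     /* threads currently inside a reduction */
static atomic_int max_in_flight = 0; /* largest value in_flight has reached  */

/* Hypothetical stand-in for the real collective, which would be e.g.
 *   MPI_Allreduce(MPI_IN_PLACE, buf, count, MPI_DOUBLE, MPI_SUM, comm);
 * It leaves buf untouched and only records the concurrency it observes. */
static void reduce_stand_in(double *buf, int count)
{
    (void)buf;
    (void)count;
    int now = atomic_fetch_add(&in_flight, 1) + 1;
    int seen = atomic_load(&max_in_flight);
    while (now > seen &&
           !atomic_compare_exchange_weak(&max_in_flight, &seen, now)) {
        /* a failed exchange refreshed 'seen'; retry if still larger */
    }
    atomic_fetch_sub(&in_flight, 1);
}

/* The workaround: hold the lock for the whole reduction section. */
void locked_reduce(double *buf, int count)
{
    pthread_mutex_lock(&reduce_lock);
    reduce_stand_in(buf, count);
    pthread_mutex_unlock(&reduce_lock);
}

/* Each worker plays the role of one spawned reduction thread. */
static void *worker(void *arg)
{
    double buf[4] = {0.0, 0.0, 0.0, 0.0};
    (void)arg;
    for (int i = 0; i < 1000; ++i)
        locked_reduce(buf, 4);
    return NULL;
}

/* Start n_threads overlapping workers, wait for all of them, and return
 * the largest number of threads seen inside a reduction together since
 * the program started. */
int run_overlapping(int n_threads)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    assert(n_threads >= 1 && n_threads <= MAX_THREADS);
    for (int i = 0; i < n_threads; ++i)
        if (pthread_create(&tid[started], NULL, worker, NULL) == 0)
            ++started;
    for (int i = 0; i < started; ++i)
        pthread_join(tid[i], NULL);
    return atomic_load(&max_in_flight);
}
```

Calling run_overlapping(4) mimics the 3-4 overlapping threads from the test
runs. With the lock in place, no more than one thread should be inside the
reduction section at any time.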
Is this a known issue? Could it be an MPICH build/configure setting that
the Fedora package maintainer did not set correctly?
Thanks
Burlen