<!DOCTYPE html><html><head><title></title><style type="text/css">p.MsoNormal,p.MsoNoSpacing{margin:0}</style></head><body><div>On Tue, May 5, 2020, at 8:38 AM, hritikesh semwal via discuss wrote:<br></div><blockquote type="cite" id="qt" style=""><div dir="ltr"><div>Hello all,<br></div><div><br></div><div>I am developing a parallel CFD solver and use MPI_Allreduce to sum the local errors computed on all processes of a group; every process needs the resulting sum. My concern is that MPI_Allreduce accounts for roughly 27-30% of the total run time, which is a significant amount. Can anyone suggest a better alternative to MPI_Allreduce that would reduce this cost?<br></div><div><br></div><div>Thank you.<br></div></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Hi Hritikesh,<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">What hardware are you running on, and what is the interconnect?<br></div><div style="font-family:Arial;">Have you tried changing any of the MPI settings?<br></div><div style="font-family:Arial;">Can the reduction be done asynchronously, for example with a nonblocking MPI_Iallreduce overlapped with computation?<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Regards,<br></div><div style="font-family:Arial;">Benson</div></body></html>