[mpich-discuss] MPI_Allreduce is slow for 7 processes or more

Junchao Zhang jczhang at mcs.anl.gov
Mon May 4 11:28:12 CDT 2015


Could you try MPICH 3.2b2?  I tested your code with it on my laptop. My
timings are:

np    time(s)
 2    102.57
 4     51.285
 8     25.6425
16     12.8212

--Junchao Zhang

On Mon, May 4, 2015 at 10:14 AM, David Froger <david.froger.ml at mailoo.org>
wrote:

> Thanks Junchao.
>
> > I don't see you measure MPI_Allreduce.
>
> You're right, let's call my code "a simple example to reproduce a bug"
> rather
> than a benchmark.
>
> > Basically you only measured some random numbers across processes.
>
> The usleep simulates the time spent on computation in my real code (a
> Computational Fluid Dynamics software). bench_mpi.cxx only does a
> usleep(microseconds) and then calls MPI_Allreduce. microseconds is a
> constant base time divided by mpi_size (plus a random overhead between 0%
> and 5%, so that MPI_Allreduce is not called at the same wall clock time on
> all processes; though I think I should have used a different seed on each
> process).
>
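> A minimal sketch of that structure (illustrative only; the real
> bench_mpi.cxx may differ, and names and constants such as base_time_us and
> n_iterations are assumptions):
>
>     #include <mpi.h>
>     #include <unistd.h>
>     #include <cstdlib>
>     #include <cstdio>
>
>     int main(int argc, char **argv) {
>         MPI_Init(&argc, &argv);
>         int rank, size;
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>         MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>         const useconds_t base_time_us = 100000;  // constant base time (assumed)
>         const int n_iterations = 1000;           // assumed iteration count
>         double local = 1.0, global = 0.0;
>
>         double t0 = MPI_Wtime();
>         for (int i = 0; i < n_iterations; ++i) {
>             // Simulated computation: base time divided by the number of
>             // processes, plus a random overhead between 0% and 5%.
>             double overhead = 1.0 + 0.05 * (std::rand() / (double)RAND_MAX);
>             usleep((useconds_t)(base_time_us / size * overhead));
>             MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
>                           MPI_COMM_WORLD);
>         }
>         double t1 = MPI_Wtime();
>
>         if (rank == 0)
>             std::printf("np=%d  time=%g s\n", size, t1 - t0);
>         MPI_Finalize();
>         return 0;
>     }
>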
> So because all the code does is usleep(base_time / mpi_size), I expect the
> wall clock time to be halved when the number of processes is doubled.
>
> With MPICH 3.1.4, the wall clock time increases with 7 or more processes.
> MPI_Allreduce becomes very slow for no apparent reason. I'm trying to
> understand why.
>