[mpich-discuss] MPI_Allreduce is slow for 7 processes or more
Junchao Zhang
jczhang at mcs.anl.gov
Mon May 4 11:28:12 CDT 2015
Could you try mpich3.2b2? I tested your code with it on my laptop. My
timing is
np   time(s)
 2   102.57
 4    51.285
 8    25.6425
16    12.8212
--Junchao Zhang
On Mon, May 4, 2015 at 10:14 AM, David Froger <david.froger.ml at mailoo.org>
wrote:
> Thanks Junchao.
>
> > I don't see you measure MPI_Allreduce.
>
> You're right, let's call my code "a simple example to reproduce a bug"
> rather
> than a benchmark.
>
> > Basically you only measured some random numbers across processes.
>
> The usleep simulates the time spent on computation in my real code
> (Computational Fluid Dynamics software). bench_mpi.cxx only does a
> usleep(microseconds) and then calls MPI_Allreduce. microseconds is a
> constant base time divided by mpi_size (plus a random overhead between 0%
> and 5%, so that MPI_Allreduce is not called at the same wall-clock time on
> all processes, though I think I should have used a different seed on each
> proc).
>
> So because all the code does is usleep(base_time / mpi_size), I expect the
> wall clock time to halve with twice as many processes.
>
> With MPICH 3.1.4, the wall clock time increases with 7 or more processes:
> MPI_Allreduce becomes very slow for no apparent reason. I'm trying to
> understand why.
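
For reference, here is a minimal sketch of the kind of benchmark described
above: each rank sleeps for base_time / mpi_size microseconds plus a small
random overhead, then calls MPI_Allreduce in a loop, and rank 0 reports the
wall-clock time. The file name bench_mpi.cxx comes from the thread, but the
constants, the iteration count, and the per-rank seeding are assumptions, not
the original code.

// bench_mpi.cxx -- hypothetical reconstruction of the benchmark in this thread
#include <mpi.h>
#include <unistd.h>
#include <cstdlib>
#include <cstdio>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long base_time = 100000;  // assumed base "computation" time (microseconds)
    const int  n_iter    = 1000;    // assumed number of iterations

    std::srand(12345 + rank);       // different seed per rank, as David suggests

    double t0 = MPI_Wtime();
    for (int i = 0; i < n_iter; ++i) {
        // Simulated computation: base time divided by the number of processes,
        // plus a random 0-5% overhead so ranks do not reach MPI_Allreduce at
        // exactly the same moment.
        long us = base_time / size;
        us += static_cast<long>(us * 0.05 * (std::rand() / (double)RAND_MAX));
        usleep(us);

        double local = static_cast<double>(rank), global = 0.0;
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        std::printf("np %d  time(s) %g\n", size, t1 - t0);

    MPI_Finalize();
    return 0;
}

With a sketch like this one would run, e.g., mpiexec -n 8 ./bench_mpi and
compare the reported wall-clock time as np grows; ideal behavior is the
roughly halving-per-doubling seen in the mpich3.2b2 timings above.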
>