[mpich-discuss] MPI_Allreduce is slow for 7 processes or more
David Froger
david.froger.ml at mailoo.org
Mon May 4 10:14:42 CDT 2015
Thanks Junchao.
> I don't see you measure MPI_Allreduce.
You're right, let's call my code "a simple example to reproduce a bug" rather
than a benchmark.
> Basically you only measured some random numbers across processes.
The usleep simulates the time spent on computation in my real code
(computational fluid dynamics software). bench_mpi.cxx only does a
usleep(microseconds) and then calls MPI_Allreduce. microseconds is a constant
base time divided by mpi_size (plus a random overhead between 0% and 5%, so
that MPI_Allreduce is not called at the same wall-clock time on all processes,
but I think I should have used a different seed on each proc).
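
A minimal sketch of the sleep-time computation described above, with the
generator seeded differently on each process. It is not the actual
bench_mpi.cxx: the JitteredSleep name, the base seed 12345 and the use of
std::mt19937 are assumptions made for illustration, and rank is assumed to
come from MPI_Comm_rank.

```cpp
#include <random>

// Hypothetical helper: per-process source of sleep durations.
// Seeding with the rank gives every process its own random sequence.
struct JitteredSleep {
    std::mt19937 gen;
    std::uniform_real_distribution<double> overhead{0.0, 0.05};
    double base_us;   // constant base time in microseconds
    int mpi_size;     // number of processes

    JitteredSleep(double base_us_, int mpi_size_, int rank)
        : gen(12345u + static_cast<unsigned>(rank)),  // assumed base seed
          base_us(base_us_),
          mpi_size(mpi_size_) {}

    // Microseconds to sleep this iteration:
    // base_us / mpi_size plus a random overhead between 0% and 5%.
    double next_us() {
        const double share = base_us / mpi_size;
        return share * (1.0 + overhead(gen));
    }
};
```

In the benchmark loop this could be used as
usleep(static_cast<useconds_t>(sleeper.next_us())) just before the
MPI_Allreduce call.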
So because all the code does is usleep(base_time / mpi_size), I expect the
wall-clock time to halve when the number of processes doubles.
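
The expectation above can be written down as a small calculation. This is
only a sketch: the function names and the iterations parameter are
assumptions (the number of loop iterations is not stated here), and the
random 0% to 5% overhead is ignored.

```cpp
// Ideal total sleep time in microseconds if each of `iterations` rounds
// sleeps base_us / mpi_size and MPI_Allreduce cost nothing.
double ideal_wall_clock_us(double base_us, int iterations, int mpi_size) {
    return iterations * (base_us / mpi_size);
}

// Ratio of ideal to measured time: 1.0 means perfect scaling,
// values well below 1.0 mean time is being lost outside the sleep.
double scaling_efficiency(double measured_us, double base_us,
                          int iterations, int mpi_size) {
    return ideal_wall_clock_us(base_us, iterations, mpi_size) / measured_us;
}
```

Doubling mpi_size halves ideal_wall_clock_us, so a measured time that grows
with the process count shows up as an efficiency that drops sharply.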
With MPICH 3.1.4, the wall-clock time increases with 7 or more processes.
MPI_Allreduce becomes very slow for no apparent reason. I'm trying to
understand why.
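
One possible way to check where the time goes is to time the sleep and the
reduction separately. The sketch below is not taken from bench_mpi.cxx:
PhaseTimes and time_phases are hypothetical names, and the two phases are
passed in as callables so the block compiles without MPI. In the benchmark,
compute would wrap the usleep call and reduce would wrap MPI_Allreduce.

```cpp
#include <chrono>

// Accumulated wall-clock seconds for each phase (hypothetical struct).
struct PhaseTimes {
    double compute_s = 0.0;
    double reduce_s = 0.0;
};

// Runs `iterations` rounds of compute() followed by reduce(),
// timing each phase separately with a monotonic clock.
template <typename Compute, typename Reduce>
PhaseTimes time_phases(int iterations, Compute compute, Reduce reduce) {
    using clock = std::chrono::steady_clock;
    PhaseTimes t;
    for (int i = 0; i < iterations; ++i) {
        const auto t0 = clock::now();
        compute();  // e.g. the usleep standing in for the real computation
        const auto t1 = clock::now();
        reduce();   // e.g. a lambda wrapping the MPI_Allreduce call
        const auto t2 = clock::now();
        t.compute_s += std::chrono::duration<double>(t1 - t0).count();
        t.reduce_s += std::chrono::duration<double>(t2 - t1).count();
    }
    return t;
}
```

Because MPI_Allreduce cannot complete on a process before every process has
contributed, reduce_s also includes the time spent waiting for the slowest
process, not only the cost of the collective itself.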