<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr"><div><div>Jeff, thanks a lot for your detailed answer. It makes much more sense now. <br><br></div>Best regards,<br></div>Khalid<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 3, 2015 at 4:11 AM, Jeff Hammond <span dir="ltr"><<a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Those operations were never specially optimized on Blue Gene.  The<br>


operations that were optimized heavily on Blue Gene were/are Barrier,<br>


Bcast, Allreduce, and Alltoall(v).  Most of the other collective<br>


optimizations were derivatives of that.  For example, one can<br>


implement Allgather as a series of Bcasts.  And Reduce was optimized<br>


as a side effect of Allreduce; in some cases, Allreduce was faster<br>


than Reduce, which is counter-intuitive.<br>
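<br>
To illustrate the Allgather-as-Bcasts idea, here is a minimal sketch in plain Python. It only simulates the data movement between ranks; it does not call MPI and is not the Blue Gene implementation. The name allgather_via_bcasts is hypothetical.<br>

```python
def allgather_via_bcasts(contributions):
    """Simulate Allgather over len(contributions) ranks as one Bcast per root.

    contributions[r] is the send buffer of rank r (an assumption of this sketch).
    Returns recv, where recv[rank] is the gathered buffer seen by that rank.
    """
    nproc = len(contributions)
    recv = [[None] * nproc for _ in range(nproc)]
    for root in range(nproc):
        # Simulated Bcast rooted at `root`: every rank receives root's buffer
        # and stores it in slot `root` of its receive buffer.
        for rank in range(nproc):
            recv[rank][root] = contributions[root]
    return recv


gathered = allgather_via_bcasts(["a", "b", "c"])
```

After nproc broadcasts, every simulated rank holds the same gathered buffer, which is the defining property of Allgather.<br>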


<br>


One optimization that Blue Gene had for Gather was to use Reduce with<br>


BOR (on BG/P) and SUM (on BG/Q, for float types at least).  This<br>


turned Gather of count=1 into Reduce of count=nproc (size of<br>


communicator), but it was much faster for short messages.<br>
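<br>
The Gather-via-Reduce trick can be sketched as follows, again as a plain-Python simulation rather than real MPI code. Each rank pads its single element into a vector of length nproc that is zero everywhere except its own slot, and an elementwise reduction with BOR (or SUM) then reassembles the gathered vector. The name gather_via_reduce and its parameters are hypothetical.<br>

```python
from functools import reduce
from operator import add, or_


def gather_via_reduce(values, op, zero):
    """Simulate Gather of count=1 as a Reduce of count=nproc.

    values[r] is the single element contributed by rank r (sketch assumption).
    op is the reduction operator and zero is its identity element.
    """
    nproc = len(values)
    # Each rank builds a length-nproc vector: its value in its own slot,
    # the identity element everywhere else.
    padded = [
        [values[rank] if slot == rank else zero for slot in range(nproc)]
        for rank in range(nproc)
    ]
    # Simulated Reduce: combine the vectors elementwise across ranks.
    return [reduce(op, column) for column in zip(*padded)]


ints_gathered = gather_via_reduce([5, 9, 12], or_, 0)           # BOR variant
floats_gathered = gather_via_reduce([1.5, 2.5, 3.5], add, 0.0)  # SUM variant
```

Because every slot has exactly one non-identity contribution, OR-ing with 0 or adding 0.0 leaves that contribution unchanged, so the result at the root equals the gathered data.<br>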


<br>


There aren't many good optimizations for Scatter.  MPICH has most of<br>


the generic ones, as you might expect.  It is possible that Scatter as<br>


Alltoallv with only one non-zero in the count vector was faster than<br>


MPICH, but this would surprise me.<br>
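<br>
For concreteness, Scatter expressed through Alltoallv might look like the sketch below, which simulates the exchange in plain Python without MPI. Only the root has non-zero send counts, so each rank's receive-count vector has a single non-zero entry (the one for the root). The helper names alltoallv and scatter_via_alltoallv, and the equal per-rank count, are assumptions of this sketch.<br>

```python
def alltoallv(sendbufs, sendcounts):
    """Simulate Alltoallv with contiguous packing.

    sendbufs[src] is the flat send buffer of rank src, and sendcounts[src][dst]
    is how many elements src sends to dst (layout assumed for this sketch).
    """
    nproc = len(sendbufs)
    recvbufs = [[] for _ in range(nproc)]
    for src in range(nproc):
        offset = 0
        for dst in range(nproc):
            count = sendcounts[src][dst]
            recvbufs[dst].extend(sendbufs[src][offset:offset + count])
            offset += count
    return recvbufs


def scatter_via_alltoallv(root_data, root, nproc, count):
    """Scatter `count` elements per rank from `root` using the simulated Alltoallv."""
    # Non-root ranks send nothing: empty buffers and all-zero send counts.
    sendbufs = [root_data if rank == root else [] for rank in range(nproc)]
    sendcounts = [
        [count] * nproc if rank == root else [0] * nproc
        for rank in range(nproc)
    ]
    return alltoallv(sendbufs, sendcounts)


scattered = scatter_via_alltoallv([10, 11, 20, 21, 30, 31], root=1, nproc=3, count=2)
```

Each simulated rank ends up with its own slice of the root's buffer, which matches what Scatter delivers; whether this route is ever faster than a dedicated Scatter algorithm is a separate question.<br>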


<br>


In any case, all MPI collectives on BG benefit from very good<br>


network-to-processor balance, minimal rendezvous (connectionless HW<br>


and SW), trivial virtual-to-physical translation, good bisection<br>


bandwidth of _electrically isolated_ torus networks, etc.<br>


<br>


If you have a more specific question, I might have a better answer.<br>


<br>


Best,<br>


<br>


Jeff<br>


<div class="HOEnZb"><div class="h5"><br>


On Sun, Mar 1, 2015 at 4:36 PM, Khalid Hasanov <<a href="mailto:xalid.h@gmail.com">xalid.h@gmail.com</a>> wrote:<br>


> Hello,<br>


><br>


> First of all, I am not sure if this group is the right place for this<br>


> question. If not, I apologize for asking an unrelated question.<br>


><br>


> I read two papers about optimizing MPI collective communications on BG/L and<br>


> BG/P.<br>


> (Optimization of MPI Collective Communication on BlueGene/L Systems<br>


> and MPI Collective Communications on The Blue Gene/P Supercomputer:<br>


> Algorithms and Optimizations respectively). However, these two papers do not<br>


> mention anything about the MPI scatter and gather operations. I wonder whether these<br>


> two collective operations have been optimized for BlueGene or whether they use<br>


> exactly the same algorithms as MPICH. Any reference would be appreciated.<br>


><br>


><br>


> Best regards,<br>


> Khalid<br>


><br>


</div></div><span class="im HOEnZb">> _______________________________________________<br>


> discuss mailing list     <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>


> To manage subscription options or unsubscribe:<br>


> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>


<br>


<br>


<br>


--<br>


</span><span class="HOEnZb"><font color="#888888">Jeff Hammond<br>


<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a><br>


<a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a><br>


</font></span><div class="HOEnZb"><div class="h5">


</div></div></blockquote></div><br></div>