<div dir="ltr">Good advice.<div>I now find that the performance drops back to the sequential level when I move all the MPI_Put sources into threads within one MPI rank, even when using MPI_Win_allocate.</div><div>Have you seen similar behavior before?</div><div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Thanks<div>Kun</div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Sat, Dec 29, 2018 at 9:36 PM Jeff Hammond &lt;<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div dir="auto">I don’t know why you are timing win_allocate. I’d only time lock-put-unlock or put-flush. </div></div><div dir="auto"><br></div><div dir="auto">Jeff</div><div><br><div class="gmail_quote"><div dir="ltr">On Sat, Dec 29, 2018 at 9:11 AM Kun Feng &lt;<a href="mailto:kfeng1@hawk.iit.edu" target="_blank">kfeng1@hawk.iit.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thank you for the replies. <div>MPI_Win_allocate gives me much better performance. It is even faster than what I got from a pure memory bandwidth test.</div><div>I'm putting the same memory block from the source rank to the same memory address on the destination rank, followed by MPI_Win_flush to synchronize.</div><div>Am I doing it correctly? 
The source code is attached.</div><div><br clear="all"><div><div dir="ltr" class="gmail-m_-3837287681581863645m_-5416205269617513052gmail_signature"><div dir="ltr">Thanks</div></div></div></div></div><div dir="ltr"><div><div><div dir="ltr" class="gmail-m_-3837287681581863645m_-5416205269617513052gmail_signature"><div dir="ltr"><div>Kun</div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Dec 21, 2018 at 11:15 AM Jeff Hammond &lt;<a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Use MPI_Win_allocate instead of MPI_Win_create.  MPI_Win_create cannot allocate shared memory, so you will not get good performance within a node.</div><div><br></div><div>Jeff</div><br><div class="gmail_quote"><div dir="ltr">On Fri, Dec 21, 2018 at 8:18 AM Kun Feng via discuss &lt;<a href="mailto:discuss@mpich.org" target="_blank">discuss@mpich.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi all,<div><br></div><div>I'm working on a project in which, on each node, one half of the processes needs to send data to the other half.</div><div>I'm using the passive target mode of one-sided communication: the receivers expose memory using MPI_Win_create and wait on MPI_Win_free, and the senders send the data using MPI_Put.</div><div>The code works. However, I get strange performance from this concurrent MPI_Put communication. The peak aggregate bandwidth is only around 5 GB/s. 
That does not make sense as aggregate performance within a single node.<br></div><div>I thought node-local communication was implemented as a local memcpy.</div><div>But concurrent memcpy on the same testbed has 4x to 5x higher aggregate bandwidth.</div><div>Even concurrent memcpy using Linux shared memory across processes is 3x faster than my code.</div><div>I'm using CH3 in MPICH 3.2.1. CH4 in MPICH 3.3 is even 2x slower.</div><div>Does this performance make sense? Does MPICH have some queue for all one-sided communication within a node? Or am I misunderstanding something?</div><div><br clear="all"><div><div dir="ltr" class="gmail-m_-3837287681581863645m_-5416205269617513052gmail-m_-3839866249653342293gmail-m_-4839675696708010966gmail-m_-7134994236507619519gmail_signature"><div dir="ltr">Thanks<div>Kun</div></div></div></div></div></div>


</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail-m_-3837287681581863645m_-5416205269617513052gmail-m_-3839866249653342293gmail_signature">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div></div>

</blockquote></div>

</blockquote></div></div>-- <br><div dir="ltr" class="gmail-m_-3837287681581863645gmail_signature">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>

</blockquote></div>
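<div>For comparing the numbers discussed in this thread (aggregate MPI_Put bandwidth versus concurrent memcpy), here is a minimal sketch of one way to turn per-sender timings into an aggregate figure. It is not taken from the attached source. The function name <code>aggregate_bandwidth_gbs</code> and its parameters are hypothetical, and it assumes that all senders start the timed region together and that each timing covers only the data movement (for example put followed by flush), not window allocation or teardown.</div>

```c
#include <stddef.h>

/* Hypothetical helper, not part of MPI or MPICH.
 *
 * Computes aggregate bandwidth in GB/s (1 GB = 1e9 bytes) for n_senders
 * concurrent senders that each moved bytes_per_sender bytes.
 *
 * elapsed_s[i] is the time sender i spent in the timed region only.
 * Assumption: all senders start together, so the slowest sender defines
 * the wall-clock duration of the whole concurrent phase.
 *
 * Returns 0.0 if the input is empty or no positive time was recorded. */
double aggregate_bandwidth_gbs(size_t bytes_per_sender,
                               int n_senders,
                               const double *elapsed_s)
{
    if (n_senders <= 0 || elapsed_s == NULL) {
        return 0.0;
    }

    /* Find the slowest sender. */
    double slowest = 0.0;
    for (int i = 0; i < n_senders; i++) {
        if (elapsed_s[i] > slowest) {
            slowest = elapsed_s[i];
        }
    }
    if (slowest <= 0.0) {
        return 0.0;
    }

    /* Total bytes moved by all senders, divided by the wall-clock time. */
    double total_bytes = (double)bytes_per_sender * (double)n_senders;
    return total_bytes / slowest / 1e9;
}
```

<div>In an MPI program the per-sender timings could come from MPI_Wtime calls placed directly around the put and flush, gathered onto one rank before calling the helper. Dividing by the slowest time rather than summing per-sender bandwidths avoids overstating the aggregate when senders finish at different times.</div>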