<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Hi Abu,<br>
    <br>
    I think the results are stable enough. Perhaps you could also try
    the following tests and see whether a similar trend appears:<br>
    - MPICH/socket (set `--with-device=ch3:sock` at configure)<br>
    - A socket-based pingpong test without MPI. <br>
    <br>
    At this point, I cannot think of any MPI-specific design that would
    treat 2k/8k messages differently. My guess is that it is related to
    your network connection. <br>
    <br>
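    A socket-based ping-pong without MPI could look roughly like the
    sketch below. It is only an illustration under stated assumptions:
    the names recv_exact, echo_server and pingpong_latency_us are
    hypothetical, and both ends run on the loopback interface so that
    the example is self-contained. To exercise the real network, the
    echo side would have to run on the second node. The sketch reports
    half of the average round-trip time, as osu_latency does, and sets
    TCP_NODELAY so that small messages are not held back by Nagle's
    algorithm. Unlike osu_latency, it has no warm-up iterations.<br>
    <br>
<pre>
```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (TCP may deliver a message in pieces)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(listener, nbytes, iterations):
    """Accept one client and echo each fixed-size message back."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(iterations):
            conn.sendall(recv_exact(conn, nbytes))


def pingpong_latency_us(nbytes, iterations=1000, host="127.0.0.1"):
    """Return the average one-way latency in microseconds for nbytes messages."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=echo_server, args=(listener, nbytes, iterations)
    )
    server.start()

    payload = b"a" * nbytes
    with socket.create_connection((host, port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(iterations):
            client.sendall(payload)
            recv_exact(client, nbytes)
        elapsed = time.perf_counter() - start

    server.join()
    listener.close()
    # Halve the round-trip time to get the one-way latency.
    return elapsed / (2 * iterations) * 1e6


if __name__ == "__main__":
    for size in (1024, 2048, 4096, 8192, 16384):
        print(f"{size:6d} bytes  {pingpong_latency_us(size):10.2f} us")
```
</pre>
    If the same 2k/8k pattern shows up here across the two nodes, that
    would point at the network path rather than at MPI.<br>
    <br>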
    Min<br>
    <br>
    <div class="moz-cite-prefix">On 2018/06/24 11:09, Abu Naser wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:BLUPR0501MB20037286A10BAAD88A18EF8D974B0@BLUPR0501MB2003.namprd05.prod.outlook.com">
      <div id="divtagdefaultwrapper" dir="ltr">
        <div id="divtagdefaultwrapper" dir="ltr">
          <p>Hello Min and Jeff,</p>
          <p><br>
          </p>
          <p>Here are my experiment results. The default number of
            iterations in osu_latency for 0B – 8KB messages is 10,000.
            With that setting I ran osu_latency 100 times and found a
            standard deviation of 33 for the 8KB message size.</p>
          <p><br>
          </p>
          <p>So I then set the iteration count to 50,000 and 100,000 for
            the 1KB – 16KB message sizes, ran osu_latency 100 times
            for each setting, and took the average and standard
            deviation.</p>
          <p><br>
          </p>
          <table width="665">
            <colgroup><col width="99"><col width="112"><col width="118"><col width="154"><col width="140"></colgroup>
            <tbody>
              <tr>
                <td width="99">
                  <p><b>Msg Size in Bytes</b></p>
                </td>
                <td width="112">
                  <p><b>Avg time in us (50K iterations)</b></p>
                </td>
                <td width="118">
                  <p><b>Avg time in us (100K iterations)</b></p>
                </td>
                <td width="154">
                  <p><b>Standard deviation in us (50K iterations)</b></p>
                </td>
                <td width="140">
                  <p><b>Standard deviation in us (100K iterations)</b></p>
                </td>
              </tr>
              <tr>
                <td width="99">
                  <p>1k</p>
                </td>
                <td width="112">
                  <p>85.10</p>
                </td>
                <td width="118">
                  <p>84.9</p>
                </td>
                <td width="154">
                  <p>0.55</p>
                </td>
                <td width="140">
                  <p>0.45</p>
                </td>
              </tr>
              <tr>
                <td width="99">
                  <p>2k</p>
                </td>
                <td width="112">
                  <p>75.79</p>
                </td>
                <td width="118">
                  <p>74.63</p>
                </td>
                <td width="154">
                  <p>5.09</p>
                </td>
                <td width="140">
                  <p>4.44</p>
                </td>
              </tr>
              <tr>
                <td width="99">
                  <p>4k</p>
                </td>
                <td width="112">
                  <p>273.80</p>
                </td>
                <td width="118">
                  <p>274.71</p>
                </td>
                <td width="154">
                  <p>4.18</p>
                </td>
                <td width="140">
                  <p>2.45</p>
                </td>
              </tr>
              <tr>
                <td width="99">
                  <p>8k</p>
                </td>
                <td width="112">
                  <p>258.56</p>
                </td>
                <td width="118">
                  <p>249.83</p>
                </td>
                <td width="154">
                  <p>21.14</p>
                </td>
                <td width="140">
                  <p>28</p>
                </td>
              </tr>
              <tr>
                <td height="24" width="99">
                  <p>16k</p>
                </td>
                <td width="112">
                  <p>281.31</p>
                </td>
                <td width="118">
                  <p>281.02</p>
                </td>
                <td width="154">
                  <p>3.22</p>
                </td>
                <td width="140">
                  <p>4.10</p>
                </td>
              </tr>
            </tbody>
          </table>
          <p><br>
          </p>
          <p><br>
          </p>
          <p>The standard deviation for the 8K message is high, which
            implies that it is not producing a consistent latency.
            That looks like the reason 8K appears to take less
            time than 4K.</p>
          <p><br>
          </p>
          <p>Meanwhile, 2K has a standard deviation of around 5, but the
            1K latency timings are much more tightly clustered than the
            2K ones. So this is probably the explanation for the 2K
            message showing lower latency.</p>
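          <p>As a rough check of the table against Min's 5% guideline,
            the sketch below divides each standard deviation by its
            mean, using the 50K-iteration columns. Treating the 5% as
            standard deviation relative to the mean is an assumption,
            and the helper names are hypothetical.</p>
<pre>
```python
# (mean latency in us, standard deviation) from the 50K-iteration columns
results_50k = {
    "1k": (85.10, 0.55),
    "2k": (75.79, 5.09),
    "4k": (273.80, 4.18),
    "8k": (258.56, 21.14),
    "16k": (281.31, 3.22),
}


def relative_stddev_percent(mean, stddev):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * stddev / mean


def unstable_sizes(results, threshold_percent=5.0):
    """Message sizes whose relative standard deviation exceeds the threshold."""
    return [
        size
        for size, (mean, stddev) in results.items()
        if relative_stddev_percent(mean, stddev) > threshold_percent
    ]


if __name__ == "__main__":
    for size, (mean, stddev) in results_50k.items():
        print(f"{size:>4}  {relative_stddev_percent(mean, stddev):5.2f} %")
    print("above 5%:", unstable_sizes(results_50k))
```
</pre>
          <p>With these numbers, 2k and 8k come out above 5% (about 6.7%
            and 8.2%), while 1k, 4k and 16k stay below 2%.</p>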
          <p><br>
          </p>
          <p>Thank you for your suggestions.</p>
          <br>
          <p><br>
          </p>
          <div id="Signature">
            <div id="divtagdefaultwrapper" dir="ltr">
              <p><br>
              </p>
              <p><span>Best Regards,</span></p>
              <p><span>Abu Naser</span><br>
              </p>
            </div>
          </div>
        </div>
        <hr tabindex="-1">
        <div id="divRplyFwdMsg" dir="ltr"><b>From:</b> Abu Naser<br>
          <b>Sent:</b> Wednesday, June 20, 2018 1:48:53 PM<br>
          <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
          <b>Subject:</b> Re: [mpich-discuss] osu_latency test: why 8KB
          takes less time than 4KB and 2KB takes less time than 1KB?
          <div> </div>
        </div>
        <div dir="ltr">
          <div id="x_divtagdefaultwrapper" dir="ltr">
            <div id="x_divtagdefaultwrapper" dir="ltr">
              <p>Hello Min,</p>
              <p><br>
              </p>
              <p>Thanks for the clarification.  I will do the
                experiment.<br>
              </p>
              <p><br>
              </p>
              <div id="x_Signature">
                <div id="x_divtagdefaultwrapper" dir="ltr">
                  <p>Thanks.</p>
                  <p><span>Best Regards,</span></p>
                  <p><span>Abu Naser</span><br>
                  </p>
                </div>
              </div>
            </div>
            <hr tabindex="-1">
            <div id="x_divRplyFwdMsg" dir="ltr"><b>From:</b> Min Si
              <a class="moz-txt-link-rfc2396E" href="mailto:msi@anl.gov">&lt;msi@anl.gov&gt;</a><br>
              <b>Sent:</b> Wednesday, June 20, 2018 1:39:30 PM<br>
              <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
              <b>Subject:</b> Re: [mpich-discuss] osu_latency test: why
              8KB takes less time than 4KB and 2KB takes less time than
              1KB?
              <div> </div>
            </div>
            <div>Hi Abu,<br>
              <br>
              I think Jeff means that you should run your experiment
              with more iterations in order to get stable results.<br>
              - Increase the number of loop iterations in each execution
              (I think the OSU benchmark allows you to set it)<br>
              - Run the experiment 10 or 100 times, and take the
              average and standard deviation.<br>
              <br>
              If you see a very small standard deviation (e.g.,
              &lt;=5%), then the trend is stable and you might not see
              such gaps.<br>
              <br>
              Best regards,<br>
              Min<br>
              <div class="x_x_moz-cite-prefix">On 2018/06/20 12:14, Abu
                Naser wrote:<br>
              </div>
              <blockquote type="cite">
                <div id="x_x_divtagdefaultwrapper" dir="ltr">
                  <p>Hello Jeff,</p>
                  <p><br>
                  </p>
                  <p>Yes, I am using a switch, and other machines are
                    also connected to that switch.
                    <br>
                  </p>
                  <p>If I remove the other machines and use just my two
                    nodes with the switch, will it improve the
                    performance by 200 ~ 400 iterations?</p>
                  <p>Meanwhile, I will try a single dedicated
                    cable.
                    <br>
                  </p>
                  <p><br>
                  </p>
                  <p>Thank you.<br>
                  </p>
                  <div id="x_x_Signature">
                    <div id="x_x_divtagdefaultwrapper" dir="ltr">
                      <p><br>
                      </p>
                      <p><span>Best Regards,</span></p>
                      <p><span>Abu Naser</span><br>
                      </p>
                    </div>
                  </div>
                </div>
                <hr tabindex="-1">
                <div id="x_x_divRplyFwdMsg" dir="ltr"><b>From:</b> Jeff
                  Hammond
                  <a class="x_x_moz-txt-link-rfc2396E x_OWAAutoLink" href="mailto:jeff.science@gmail.com" id="LPlnk983157" moz-do-not-send="true">
                    &lt;jeff.science@gmail.com&gt;</a><br>
                  <b>Sent:</b> Wednesday, June 20, 2018 12:52:06 PM<br>
                  <b>To:</b> MPICH<br>
                  <b>Subject:</b> Re: [mpich-discuss] osu_latency test:
                  why 8KB takes less time than 4KB and 2KB takes less
                  time than 1KB?
                  <div> </div>
                </div>
                <div>
                  <div dir="ltr">Is the ethernet connection a single
                    dedicated cable between the two machines or are you
                    running through a switch that handles other traffic?
                    <div><br>
                    </div>
                    <div>My best guess is that this is noise and that
                      you may be able to avoid it by running a very long
                      time, e.g. 10000 iterations.</div>
                    <div><br>
                    </div>
                    <div>Jeff</div>
                  </div>
                  <div class="x_x_x_gmail_extra"><br>
                    <div class="x_x_x_gmail_quote">On Wed, Jun 20, 2018
                      at 6:53 AM, Abu Naser <span dir="ltr">
                        &lt;<a href="mailto:an16e@my.fsu.edu" target="_blank" id="LPlnk305789" class="x_OWAAutoLink" moz-do-not-send="true">an16e@my.fsu.edu</a>&gt;</span>
                      wrote:<br>
                      <blockquote class="x_x_x_gmail_quote">
                        <div dir="ltr">
                          <div id="x_x_x_m_6077755676379859201divtagdefaultwrapper" dir="ltr">
                            <p><br>
                            </p>
                            <p>Good day to all,</p>
                            <p><br>
                            </p>
                            <p>I ran the point-to-point osu_latency test
                              on two nodes 200 times. The following
                              are the average times in microseconds for
                              various message sizes:</p>
                            <div>1KB    84.8514 us<br>
                              <span>2KB    73.52535</span> us<br>
                              4KB    272.55275 us<br>
                              <span>8KB    234.86385</span> us<br>
                              16KB    288.88 us<br>
                              32KB    523.3725 us<br>
                              64KB    910.4025 us</div>
                            <p><br>
                            </p>
                            <p>From the above, it looks like the 2KB
                              message has lower latency than 1KB, and
                              8KB has lower latency than 4KB.
                              <br>
                            </p>
                            <p>I looked for an explanation of this
                              behavior but did not find one.</p>
                            <p><br>
                            </p>
                            <ol>
                              <li><span>MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE</span><span>
                                  is set to 128KB, so none of the above
                                  message sizes uses the rendezvous
                                  protocol. Is there any partitioning
                                  inside the eager protocol (e.g. 0 - 512
                                  bytes, 1KB - 8KB, 16KB - 64KB)? If yes,
                                  what are the boundaries?
                                  Can I log them with
                                  debug-event-logging? </span><br>
                              </li>
                            </ol>
                            <p><br>
                            </p>
                            <p>Setup I am using:</p>
                            <p>- two nodes, each with an Intel Core i7;
                              one has 16 GB of memory, the other 8 GB</p>
                            <p>- MPICH 3.2.1, configured and built to
                              use nemesis tcp</p>
                            <p>- 1 Gb Ethernet connection</p>
                            <p>- NFS is used for sharing<br>
                            </p>
                            <p>- osu_latency : uses MPI_Send and
                              MPI_Recv</p>
                            <p>- <span>MPIR_CVAR_CH3_EAGER_MAX_MSG_<wbr>SIZE</span>=
                              <span>131072</span> (128KB)<br>
                            </p>
                            <p><br>
                            </p>
                            <p>Can anyone help me with that? Thanks in
                              advance.<br>
                            </p>
                            <p><br>
                            </p>
                            <p><br>
                            </p>
                            <div id="x_x_x_m_6077755676379859201Signature">
                              <div id="x_x_x_m_6077755676379859201divtagdefaultwrapper" dir="ltr">
                                <p><br>
                                </p>
                                <p><span>Best Regards,</span></p>
                                <p><span>Abu Naser</span><br>
                                </p>
                              </div>
                            </div>
                          </div>
                        </div>
                        <br>
                        ______________________________<wbr>_________________<br>
                        discuss mailing list     <a href="mailto:discuss@mpich.org" id="LPlnk816471" class="x_OWAAutoLink" moz-do-not-send="true">discuss@mpich.org</a><br>
                        To manage subscription options or unsubscribe:<br>
                        <a href="https://lists.mpich.org/mailman/listinfo/discuss" rel="noreferrer" target="_blank" id="LPlnk624595" class="x_OWAAutoLink" moz-do-not-send="true">https://lists.mpich.org/<wbr>mailman/listinfo/discuss</a><br>
                        <br>
                      </blockquote>
                    </div>
                    <br>
                    <br>
                    <div><br>
                    </div>
                    -- <br>
                    <div class="x_x_x_gmail_signature">Jeff Hammond<br>
                      <a href="mailto:jeff.science@gmail.com" target="_blank" id="LPlnk314993" class="x_OWAAutoLink" moz-do-not-send="true">jeff.science@gmail.com</a><br>
                      <a href="http://jeffhammond.github.io/" target="_blank" id="LPlnk861434" class="x_OWAAutoLink" moz-do-not-send="true">http://jeffhammond.github.io/</a></div>
                  </div>
                </div>
                <br>
              </blockquote>
              <br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>