<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    I'm currently collecting performance numbers for MPICH 3.1.3 on
    the Tilera TILE-Gx platform. Although MPICH doesn't indicate direct
    support for this platform, I wanted to see what the performance
    profile would be for some of the core MPI functions. The barrier
    times, however, show poor performance and high variance as the
    number of PEs scales up on this 36-core SMP device.<br>
    <div class="moz-forward-container"> <br>
      <blockquote>
<pre>
PEs    Trial 1    Trial 2   (times in microseconds)
  2       7.28       7.54
  3      11.63   13010.92
  4      15.01     268
  5   21073.14      18.11
  6      19.63      18.97
  7      17.83   22698.7
  8      23.82    4099.2
  9      24.72    9334.79
 10   39859.54    1591.71
 11   39586.55   38599.86
 12   30229.56   39799.5
 13   28860.67    5684.71
 14   38742.38   39956.55
 15   43117.49   39952.83
 16   26762.15   26230.78
 17   39458.78   59911.38
 18   46232.86   45327.22
 19   45462.67   54762.76
 20   59744.57   39657.49
 21   54023.2    72209.07
 22   78213.27   49108.57
 23   69221.4    79534.28
 24   71599.46   81920.28
 25   45443.61   76059.46
 26   50649.56   72402.17
 27   39936.37   66906.83
 28   64324.64   44803.16
 29   59913.22   53071.9
 30   44569.65   70844.15
 31   55428.5    47399.51
 32   54708.82   64102.45
 33   34413.99   52384.35
 34   66390.67   97134.79
 35   68839.46   62719.52
 36   43243.98   59132.83   (59.1 ms)
</pre>
      </blockquote>
      Each entry is the average of 100 iterations. The benchmark is
      osu_barrier.c from the OSU MPI micro-benchmarks.<br>
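      <br>
      To put a rough number on the trial-to-trial spread, here is a
      minimal Python sketch that compares the two trials for the 2 to 9
      PE rows of the table. The 2x cutoff and the helper names
      (trial_ratio, unstable_rows) are my own choices for illustration;
      they are not part of the OSU benchmark or of MPICH.<br>
      <br>

```python
# Barrier latencies in microseconds for 2-9 PEs, copied from the table
# above as (PEs, trial 1, trial 2).
TIMES_US = [
    (2, 7.28, 7.54),
    (3, 11.63, 13010.92),
    (4, 15.01, 268.0),
    (5, 21073.14, 18.11),
    (6, 19.63, 18.97),
    (7, 17.83, 22698.7),
    (8, 23.82, 4099.2),
    (9, 24.72, 9334.79),
]


def trial_ratio(t1, t2):
    """Return slower/faster for one row; 1.0 means the trials agree."""
    return max(t1, t2) / min(t1, t2)


def unstable_rows(rows, max_ratio=2.0):
    """Return the PE counts whose two trials differ by more than max_ratio.

    max_ratio is an arbitrary cutoff chosen for illustration only.
    """
    return [pes for pes, t1, t2 in rows if trial_ratio(t1, t2) > max_ratio]


if __name__ == "__main__":
    for pes, t1, t2 in TIMES_US:
        print(f"{pes:2d} PEs: slower/faster = {trial_ratio(t1, t2):8.1f}x")
    print("unstable PE counts:", unstable_rows(TIMES_US))
```

      <br>
      For these eight rows, only 2 and 6 PEs have trials that agree
      within a factor of two. Every other PE count has one trial in the
      tens of microseconds and the other in the hundreds of microseconds
      or worse.<br>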
      <br>
      These are the configure options I used to cross-compile and build
      MPICH for TILE-Gx:<br>
      <blockquote><tt>mkdir build && cd build</tt><br>
        <tt>PREFIXROOT=/opt/tilegx</tt><br>
        <tt>../configure CC=tile-gcc CXX=tile-g++ FC=tile-gfortran
          F77=tile-gfortran \</tt><br>
        <tt>    --prefix=$PREFIXROOT/mpich --host=tile \</tt><br>
        <tt>    --disable-fortran \</tt><br>
        <tt>    --enable-fast=yes --with-thread-package=posix</tt><br>
      </blockquote>
      <br>
      Are there any build options I should be using but am not? I'm
      trying to tune this build specifically for SMP performance.<br>
      <br>
      Thanks,<br>
      <br>
      Bryant Lam<br>
    </div>
    <br>
  </body>
</html>