<div dir="ltr">You may want to consider AMPI (Adaptive MPI, <a href="http://charm.cs.uiuc.edu/ppl_research/ampi/">http://charm.cs.uiuc.edu/ppl_research/ampi/</a>), which supports <i>automatic</i> dynamic load balancing, a very cool idea.</div><div class="gmail_extra">

<br clear="all"><div><div dir="ltr">--Junchao Zhang</div></div>

<br><br><div class="gmail_quote">On Thu, Jun 5, 2014 at 8:51 PM, Ron Palmer <span dir="ltr">&lt;<a href="mailto:ron.palmer@pgcgroup.com.au" target="_blank">ron.palmer@pgcgroup.com.au</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000">

    Pavan,<br>

    thanks for your reply and comments. Unfortunately, the actual

    application I am running is outside my sphere of influence (though

    as I am running a beta code, I will forward your thoughts in hope

    they will consider them).<br>

    <br>

    Is there another way that does not include looking at the actual

    application doing the work?<br>

    <br>

    Regards,<br>

    Ron<div><div class="h5"><br>

    <br>

    <div>On 6/06/2014 11:48, Balaji, Pavan

      wrote:<br>

    </div>

    <blockquote type="cite">

      <pre>Ron,

In general, oversubscribing the cores of a node is a bad idea.  MPI is optimized for the common case, which most applications use, where each MPI process has at least one core to itself.  That optimization comes at a cost when you oversubscribe, so oversubscription is not recommended.

To deal with cores that operate at different speeds, the only good way is to restructure your algorithm to be more asynchronous in nature.  For example, if a master-worker model is possible, that might work great: workers running on faster cores simply end up doing more work than the others.  However, not all algorithms can be expressed in this model.  Other asynchronous models are possible too.
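
To illustrate why a master-worker model copes with uneven core speeds, here is a minimal Python sketch. It is a simulation only, not MPI code, and it is not taken from the application under discussion. The function names, the equal-cost independent tasks, and the zero communication cost are all assumptions made for the illustration.

```python
import heapq


def static_makespan(num_tasks, task_cost, speeds):
    """Equal split: every worker is given the same share of tasks up front."""
    base, extra = divmod(num_tasks, len(speeds))
    finish_times = []
    for i, speed in enumerate(speeds):
        # The first `extra` workers take one leftover task each.
        count = base + (1 if i in range(extra) else 0)
        finish_times.append(count * task_cost / speed)
    # The run ends when the slowest worker finishes.
    return max(finish_times)


def master_worker_makespan(num_tasks, task_cost, speeds):
    """Dynamic: the master hands the next task to whichever worker is free first."""
    # Heap entries are (time the worker becomes free, worker index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    done = [0] * len(speeds)
    makespan = 0.0
    for _ in range(num_tasks):
        start, i = heapq.heappop(free_at)
        end = start + task_cost / speeds[i]
        done[i] += 1
        makespan = max(makespan, end)
        heapq.heappush(free_at, (end, i))
    return makespan, done


if __name__ == "__main__":
    # Hypothetical cluster: four cores at twice the speed of four others.
    speeds = [2.0] * 4 + [1.0] * 4
    print(static_makespan(120, 1.0, speeds))
    print(master_worker_makespan(120, 1.0, speeds))
```

With these assumed numbers the equal split finishes at time 15.0, because the fast cores sit idle from 7.5 onward, while the master-worker schedule finishes at 10.0 with each fast core completing 20 tasks and each slow core 10. Real applications add communication and task-granularity costs that this sketch ignores.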

In short, I think it’s time to go back to the whiteboard and see if the algorithm used by the application is appropriate or not.

  — Pavan

On Jun 5, 2014, at 7:42 PM, Ron Palmer <a href="mailto:ron.palmer@pgcgroup.com.au" target="_blank">&lt;ron.palmer@pgcgroup.com.au&gt;</a> wrote:

</pre>

      <blockquote type="cite">

        <pre>I have a small cluster of computers with uneven clock speed CPUs and currently I am running with "-np" == total number of cores. However, it appears as if the fastest computer has to wait for the slower ones to finish at the end (at least I believe so). The most recent process took 65 hours so I am interested in finding ways to optimise the process.

Is it possible to, say, use a larger "-np" and then increase the process count for the faster CPUs in the machinefile so that the faster computers do more work and, ideally, they all finish at about the same time? Will each computer finish off its first batch and then start on the next batch? Or will the faster computers just get more concurrent processes, possibly slowing down the processing?

E.g., if the single CPU of PC_A has twice the clock rate of the single CPU of PC_B, and both are quad-core, then use "-np 12" with the following in the machinefile:

PC_A:8

PC_B:4

Perhaps this is something better addressed with job scheduling software like GridEngine? Reuti?

Thanks,

Ron

-- 

Ron Palmer MSc MBA.

Principal Geophysicist

<a href="mailto:ron.palmer@pgcgroup.com.au" target="_blank">ron.palmer@pgcgroup.com.au</a>

0413 579 099

07 3103 4963

_______________________________________________

discuss mailing list     <a href="mailto:discuss@mpich.org" target="_blank">discuss@mpich.org</a>

To manage subscription options or unsubscribe:

<a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a>

</pre>

      </blockquote>


    </blockquote>

    <br>

    <div>-- <br>

      <p style="margin-bottom:0cm;line-height:100%"><font color="#0000a2"><font face="Times New Roman"><font size="3"><span lang="en"><b>Ron

                  Palmer</b></span></font></font></font><font color="#000000"> </font><font color="#000000"><font><span lang="en">MSc

              MBA</span></font></font><font color="#000000"><span lang="en">. </span></font>

      </p>

      <p style="margin-bottom:0cm;line-height:100%" lang="en">

        <font color="#000000"><font face="Times New Roman"><font size="3">Principal

              Geophysicist</font></font></font></p>

      <p style="margin-bottom:0cm;line-height:100%"><a href="mailto:ron.palmer@pgcgroup.com.au" target="_blank"><font color="#0000a2"><font face="Times New Roman"><font size="3"><span lang="en">ron.palmer@pgcgroup.com.au</span></font></font></font></a></p>

      <p style="margin-bottom:0cm;line-height:100%" lang="en">

        <font color="#000000"><font face="Times New Roman"><font size="3">0413

              579 099</font></font></font></p>

      <p style="line-height:100%" lang="en"><font color="#000000"><font face="Times New Roman"><font size="3">07

              3103 4963</font></font></font></p>

      <p style="margin-bottom:0cm"><br>

      </p>

    </div>

  </div></div></div>

</blockquote></div><br></div>