<div dir="ltr">Jeff:<div><br></div><div style>I should have mentioned that the codes I am using to test performance are the latency tests at <a href="http://www.mcs.anl.gov/~thakur/thread-tests/">http://www.mcs.anl.gov/~thakur/thread-tests/</a>, not my AMR codes per se.</div>

<div style><br></div><div style>Bobby</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jul 2, 2013 at 2:41 AM, Jeff Hammond <span dir="ltr">&lt;<a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I have done my own studies of this, but in detail only on Blue Gene/Q,<br>

which offers both fine-grained and fat locking to implement<br>

MPI_THREAD_MULTIPLE.<br>

<br>

I believe that CrayMPI takes a noticeable hit in MPI_THREAD_MULTIPLE<br>

because their network hardware has very low latency, so the software<br>

overhead associated with locking (which is fat in their case, AFAIK) is<br>

significant by comparison.  Is that the vendor in question?<br>

<br>

Do you have the option to aggregate communication and/or otherwise use<br>

MPI_THREAD_SERIALIZED instead?  If not, then there really isn't an<br>

alternative, so a comparative study will only make you a sad panda.<br>

However, if you can use MPI_THREAD_SERIALIZED, perhaps with some<br>

overhead, then you can compare the two implementations.<br>

<br>

It would be helpful if you could share code and system details.<br>

<br>

Best,<br>

<br>

Jeff<br>

<div><div class="h5"><br>

On Mon, Jul 1, 2013 at 2:05 AM, Bobby Philip<br>

&lt;<a href="mailto:bphilip.kondekeril@gmail.com">bphilip.kondekeril@gmail.com</a>&gt; wrote:<br>

> Hi:<br>

><br>

> Are there any studies that have been done on the effect of turning on<br>

> MPI_THREAD_MULTIPLE with the latest versions of MPICH? I have an AMR<br>

> application where halo or ghost updates require lots of small messages to be<br>

> sent/received, and I am currently seeing a performance hit with a<br>

> vendor-specific implementation based on MPICH2. I am trying to see whether there<br>

> are any implementations out there that might deliver better performance.<br>

><br>

> Thanks,<br>

> Bobby<br>

><br>

</div></div>> _______________________________________________<br>

> discuss mailing list     <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>

> To manage subscription options or unsubscribe:<br>

> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>

<span class="HOEnZb"><font color="#888888"><br>

<br>

<br>

--<br>

Jeff Hammond<br>

<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a><br>


</font></span></blockquote></div><br></div>