<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr">Hi Martin,<div><br></div><div>Thank you for reporting your use of MPICH to us. </div><div><br></div><div>If you could send us a simple test case that reproduces the crash in your workload, we can use it to fix our code and make it better.</div><div><br></div><div>As for your question about 3.2b2, the answer is yes! We have been improving MXM support since 3.1.4 (thanks to contributions from Mellanox). It is worth trying the new release to see if it fixes the crash in your workload.</div><div><br></div><div>Thanks,</div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr">--</div><div dir="ltr">Huiwei Lu</div><div dir="ltr">Postdoc Appointee</div><div dir="ltr">Mathematics and Computer Science Division</div><div dir="ltr">Argonne National Laboratory</div><div dir="ltr"><a href="http://www.mcs.anl.gov/~huiweilu/" target="_blank">http://www.mcs.anl.gov/~huiweilu/</a></div></div></div></div></div></div></div></div>
<br><div class="gmail_quote">On Fri, Apr 24, 2015 at 4:38 PM, Martin Cuma <span dir="ltr">&lt;<a href="mailto:martin.cuma@utah.edu" target="_blank">martin.cuma@utah.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>
<br>
I am getting errors when building with -with-device=ch3:nemesis:ib, using the GNU, Intel or PGI compilers, as:<br>
<br>
I am wondering what's going on with this, since Intel and GNU worked in version 3.1.2. I configure as:<br>
../../../srcdir/mpich/3.1.4/configure --prefix=/uufs/chpc.utah.edu/sys/installdir/mpich/3.1.4 --enable-romio --with-file-system=nfs+ufs --with-mpe -with-device=ch3:nemesis:ib --enable-threads=runtime --enable-fast=all<br>
<br>
and use RHEL6.6 with stock gcc 4.4.7.<br>
<br>
When I reported a similar issue earlier, Pavan suggested using MXM. I tried that and it seems to work. However, since we run a relatively old OFED (stock RHEL6-like) that does not come with MXM, I used the one from the latest Mellanox HPC-X. I am not sure that's the best idea, since I see communication-related crashes at certain workloads, which I don't see with other MPIs or when using the ib netmod in MPICH.<br>
<br>
Would you mind also commenting on this? Would you expect the just-released 3.2b2 to fare better with MXM than 3.1.4?<br>
<br>
Thanks,<br>
MC<span class="HOEnZb"><font color="#888888"><br>
<br>
-- <br>
Martin Cuma<br>
Center for High Performance Computing<br>
Department of Geology and Geophysics<br>
University of Utah<br>
</font></span></blockquote></div><br></div></div>