<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Mar 2, 2015 at 9:01 PM, Roy, Hirak <span dir="ltr"><<a href="mailto:Hirak_Roy@mentor.com" target="_blank">Hirak_Roy@mentor.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal">Hi Wesley,<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">As I mentioned in my email, MPI_disconnect hangs.<u></u><u></u></p>
<p class="MsoNormal">Here is a short program which you can run.<u></u><u></u></p>
<p class="MsoNormal">Please note that there is an “assert” in client.c<u></u><u></u></p>
<p class="MsoNormal">Compile:</p>
<p class="MsoNormal">>> mpicc server.c -o server</p>
<p class="MsoNormal">>> mpicc client.c -o client</p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">To run, use two shells/terminals:</p>
<p class="MsoNormal">Term1>> mpiexec -n 1 ./server</p>
<p class="MsoNormal">Term2>> mpiexec -n 1 ./client</p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Please press any key on the server terminal.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">I earlier filed a bug related to this : <a href="http://trac.mpich.org/projects/mpich/ticket/2205" target="_blank">
http://trac.mpich.org/projects/mpich/ticket/2205</a><u></u><u></u></p>
<p class="MsoNormal">My questions:</p>
<p>1. What happens if we don’t call MPI_Finalize and call exit(0) instead?</p></div></div></blockquote><div>Mostly nothing. In theory, some things might not get cleaned up, but the worst you will probably see is a nasty error message.<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div>
<p>2. Is there any way I can forcefully complete MPI_disconnect from the server side?</p></div></div></blockquote><div>I don't know of anything that will allow you to do that.<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple"><div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Thanks,<u></u><u></u></p>
<p class="MsoNormal">Hirak<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">PS: The reason for using sock is a nemesis bug; see <a href="http://trac.mpich.org/projects/mpich/ticket/1103" target="_blank">http://trac.mpich.org/projects/mpich/ticket/1103</a> and <a href="http://trac.mpich.org/projects/mpich/ticket/79" target="_blank">http://trac.mpich.org/projects/mpich/ticket/79</a></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><b><span style="font-size:13.5pt;color:black">Wesley Bland</span></b><span><span style="font-size:13.5pt;color:black;background:white"> </span></span><a href="mailto:discuss%40mpich.org?Subject=Re%3A%20%5Bmpich-discuss%5D%20MPI_Finalize%20hangs%20in%20dynamic%20connection%20in%0A%09case%20of%20failed%20process&In-Reply-To=%3CF4B1C63E-062B-4036-9DE5-A4C93096F32C%40anl.gov%3E" title="[mpich-discuss] MPI_Finalize hangs in dynamic connection in case of failed process" target="_blank"><span style="font-size:13.5pt">wbland
at anl.gov</span><span><span style="font-size:13.5pt;color:blue;text-decoration:none"> </span></span></a><span style="font-size:13.5pt;color:black"><br>
<i>Thu Feb 26 10:13:42 CST 2015</i></span><u></u><u></u></p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="3" width="100%" noshade="" style="color:black" align="center">
</div><span class="">
<pre style="white-space:pre-wrap;text-align:start;word-spacing:0px"><span style="color:black">First, I believe the sock device is untested with most of the MPICH fault tolerance features, so YMMV here.<u></u><u></u></span></pre>
<pre><span style="color:black"><u></u> <u></u></span></pre>
<pre><span style="color:black">Is there a reason that you aren’t calling MPI_Disconnect for the failed process? Did you try it and something bad happened? That seems like the most straightforward way of doing things.</span></pre>
<pre><span style="color:black"><u></u> <u></u></span></pre>
<pre><span style="color:black">Otherwise, this sounds like a known issue that we’re seeing from time to time with MPI_Finalize and the FT work. It’s something I’m trying to figure out now. If you can reduce your code down to the minimum and send it to me, I can use it as a test case to try to fix the problem.<u></u><u></u></span></pre>
<pre><span style="color:black"><u></u> <u></u></span></pre>
<pre><span style="color:black">Thanks,<u></u><u></u></span></pre>
<pre><span style="color:black">Wesley<u></u><u></u></span></pre>
<pre><span style="color:black"><u></u> <u></u></span></pre>
</span><span class=""><pre><span style="color:black">><i> On Feb 19, 2015, at 5:15 AM, Roy, Hirak <<a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">Hirak_Roy at mentor.com</a>> wrote:<u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> <u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> Hi All,<u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> <u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> I am using MPICH with sock connection.<u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> I also set up processes using the dynamic connection method (MPI_Comm_connect/MPI_Comm_accept). It’s a master-slave architecture where the master accepts connections from the slaves.</i></span></pre>
<pre><span style="color:black">><i> <u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> Now if one of the processes dies (or gets killed), I can still recover from this (without using a checkpoint/restore method).</i></span></pre>
<pre><span style="color:black">><i> For that particular process, I do not call MPI_disconnect in the master (it hangs and does not complete).</i></span></pre>
<pre><span style="color:black">><i> As a result, my MPI_Finalize in master hangs and does not complete.<u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> Do you have a workaround to forcefully complete MPI_Finalize or MPI_disconnect?<u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> I tried MPI_Comm_free on the failed connection. However, it does not solve the hang in finalize.<u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> <u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> Thanks,<u></u><u></u></i></span></pre>
<pre><span style="color:black">><i> Hirak<u></u><u></u></i></span></pre>
</span></div>
</div>
</blockquote></div><br></div></div>