<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Courier New";}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<pre><span style="font-family:'Calibri','sans-serif';color:black">Hi Pavan,<o:p></o:p></span></pre>
<pre><span style="font-family:'Calibri','sans-serif';color:black"><o:p> </o:p></span></pre>
<pre><span style="font-family:'Calibri','sans-serif';color:black">With the current release, is there any way to notify the server that a client has terminated unexpectedly?<o:p></o:p></span></pre>
<pre><span style="font-family:'Calibri','sans-serif';color:black">Another point: when do we expect the MPI-4 release to be out?<o:p></o:p></span></pre>
<pre><span style="font-family:'Calibri','sans-serif';color:black"><o:p> </o:p></span></pre>
<pre><span style="font-family:'Calibri','sans-serif';color:black">Thanks,<o:p></o:p></span></pre>
<pre><span style="font-family:'Calibri','sans-serif';color:black">Hirak<o:p></o:p></span></pre>
<div style="mso-element:para-border-div;border:none;border-bottom:solid windowtext 1.0pt;padding:0in 0in 1.0pt 0in">
<pre style="border:none;padding:0in"><span style="color:black"><o:p> </o:p></span></pre>
</div>
<pre><span style="color:black"><o:p> </o:p></span></pre>
<pre><span style="color:black">Please don’t rely on this feature. We are preparing for MPI-4 Fault Tolerance and are in the process of reworking a bunch of this stuff. This might or might not exist in the future if you are planning to use this for production code.<o:p></o:p></span></pre>
<pre><span style="color:black"><o:p> </o:p></span></pre>
<pre><span style="color:black"> — Pavan<o:p></o:p></span></pre>
<pre><span style="color:black"><o:p> </o:p></span></pre>
<pre><span style="color:black">On Oct 9, 2014, at 10:57 AM, Roy, Hirak &lt;<a href="https://lists.mpich.org/mailman/listinfo/discuss">Hirak_Roy at mentor.com</a>&gt; wrote:<o:p></o:p></span></pre>
<pre><span style="color:black"><o:p> </o:p></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Hi Sangmin,<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> The readme of mpich says the following :<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> FAILURE NOTIFICATION: THIS IS AN UNSUPPORTED FEATURE AND WILL<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> ALMOST CERTAINLY CHANGE IN THE FUTURE!<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> In the current release, hydra notifies the MPICH library of failed<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> processes by sending a SIGUSR1 signal. The application can catch<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> this signal to be notified of failed processes. If the application<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> replaces the library's signal handler with its own, the application<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> must be sure to call the library's handler from its own<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> handler. Note that you cannot call any MPI function from inside a<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> signal handler.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> If this is true, shouldn't I expect to receive SIGUSR1?<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Thanks,<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Hirak<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> First of all, MPI functions are not signal safe. So, if you try to use signals within your MPI program, things might break.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> — Sangmin<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> On Oct 9, 2014, at 7:37 AM, Roy, Hirak &lt;Hirak_Roy at mentor.com&gt; wrote:<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Hi,<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> I have two MPI processes (server and client) launched independently by two different mpiexec commands (mpich-3.0.4, sock device):<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> 1> mpiexec -disable-auto-cleanup -n 1 ./server<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> 2> mpiexec -disable-auto-cleanup -n 1 ./client<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> The server opens a port and does MPI_Comm_accept.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> The client gets the port information and does MPI_Comm_connect and hence we get a new intercommunicator.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> I don’t call MPI_Intercomm_merge.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> I have installed my own signal handler for SIGUSR1 before I even call MPI_Init (I guess this will automatically chain the signal handler).<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> >> signal (SIGUSR1, mysignalhandler);<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Now suppose the ‘client’ process gets killed (I forcefully kill it with signal 9). I expected the ‘server’ process to receive SIGUSR1.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> However, the ‘server’ process does not receive any signal.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Am I doing something wrong?<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> I have noticed that if I start 4 client processes with a single mpiexec command and one client gets killed, the remaining 3 clients receive SIGUSR1.<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Does this mean that SIGUSR1 is not forwarded across processes connected through an inter-communicator?<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> <o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Thanks,<o:p></o:p></i></span></pre>
<pre><span style="color:black">><i> Hirak</i><o:p></o:p></span></pre>
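<pre><span style="color:black"> </span></pre>
<pre><span style="color:black">For reference, the handler chaining that the README excerpt describes could look roughly like the sketch below. It is a minimal illustration in plain C using POSIX sigaction, not MPICH's own code. The names peer_failure_seen, previous_action, app_sigusr1_handler and install_chained_sigusr1_handler are hypothetical, and the sketch assumes that the library's handler is already installed when the application installs its own (for example, after MPI_Init), which this thread does not confirm. The handler only sets a flag, since MPI functions cannot be called from inside a signal handler.</span></pre>

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Set by the application's handler; poll it from the main loop. */
static volatile sig_atomic_t peer_failure_seen = 0;

/* Whatever disposition SIGUSR1 had before ours was installed
 * (assumed here to be the library's handler; that is an assumption). */
static struct sigaction previous_action;

static void app_sigusr1_handler(int signo, siginfo_t *info, void *context)
{
    /* Only async-signal-safe work here: no MPI calls, no printf, no malloc. */
    peer_failure_seen = 1;

    /* Chain to the previously installed handler, if there was a real one. */
    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction != NULL)
            previous_action.sa_sigaction(signo, info, context);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(signo);
    }
}

/* Install the application's handler and remember the old disposition.
 * Call this once. Returns 0 on success, -1 on failure (as sigaction does). */
static int install_chained_sigusr1_handler(void)
{
    struct sigaction action;

    memset(&action, 0, sizeof action);
    action.sa_sigaction = app_sigusr1_handler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_SIGINFO | SA_RESTART;

    return sigaction(SIGUSR1, &action, &previous_action);
}
```

<pre><span style="color:black">The main loop would then check peer_failure_seen between MPI calls and react outside the handler. As Pavan's reply points out, this notification mechanism is unsupported and may change, so treat this only as a sketch.</span></pre>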
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>