<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
Unfortunately, you’re correct that there isn’t a solution yet. Until we fix that ticket, checkpointing does not work in MPICH. A fix is on the roadmap, along with some new fault tolerance features, but it’s not there
yet.
<div><br>
</div>
<div>Thanks,</div>
<div>Wesley<br>
<div><br>
<div>
<blockquote type="cite">
<div>On Aug 18, 2014, at 8:51 AM, myself &lt;<a href="mailto:chcdlf@126.com">chcdlf@126.com</a>&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div>
<div style="line-height: 1.7; font-size: 14px; font-family: Arial;">
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">I tried to use BLCR with MPICH 3, but it does not seem to work. I compiled BLCR on CentOS, and `make test` reported no failed tests. Then I compiled MPICH
with BLCR. The build information is as follows:</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;"><br>
</span></font></div>
<blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;">
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">$ mpichversion </span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH Version: <span class="Apple-tab-span" style="white-space:pre">
</span>3.1.2</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH Release date:<span class="Apple-tab-span" style="white-space:pre">
</span>Mon Jul 21 16:00:21 CDT 2014</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH Device: <span class="Apple-tab-span" style="white-space:pre">
</span>ch3:nemesis</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH configure:
<span class="Apple-tab-span" style="white-space:pre"></span>--prefix=/home/test/develop/mpich3-blcr --with-device=ch3:nemesis CFLAGS=-fPIC --enable-checkpointing --with-blcr=/home/test/develop/blcr-0.8.5 --with-hydra-ckpointlib=blcr</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH CC:
<span class="Apple-tab-span" style="white-space:pre"></span>gcc -fPIC -O2</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH CXX:
<span class="Apple-tab-span" style="white-space:pre"></span>g++ -O2</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH F77:
<span class="Apple-tab-span" style="white-space:pre"></span>gfortran -O2</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">MPICH FC:
<span class="Apple-tab-span" style="white-space:pre"></span>gfortran -O2</span></font></div>
</blockquote>
<div><font color="#555555" face="Microsoft Yahei, verdana">
<div style="font-size: 12px; line-height: 19px;"><br>
</div>
<div style="font-size: 12px; line-height: 19px;">After that, I compiled my application like this:</div>
<div style="font-size: 12px; line-height: 19px;"><br>
</div>
</font></div>
<blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;">
<div><font color="#555555" face="Microsoft Yahei, verdana">
<div style="font-size: 12px; line-height: 19px;">$ mpicc mpiblcr.c -o mpiblcr -lcr</div>
<div style="font-size: 12px; line-height: 19px;"><br>
</div>
</font></div>
</blockquote>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">When I first run the application, it appears to create the checkpoint files correctly, such as context-num0-0-0.</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;"><br>
</span></font></div>
<blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;">
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">$ mpiexec -ckpointlib blcr -ckpoint-prefix `pwd` -ckpoint-interval 2 -n 2 ./mpiblcr</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">5411) Step 0</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">5410) Step 0</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">5410) Step 1</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">5411) Step 1</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">[proxy:0:0@node1] requesting checkpoint</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">[proxy:0:0@node1] checkpoint completed</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">5410) Step 2</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">
<div>5411) Step 2</div>
<div><br>
</div>
</span></font></div>
</blockquote>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">However, when I try to restart the process from the checkpoint, it hangs and nothing is printed.</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;"><br>
</span></font></div>
<blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;">
<div><font color="#555555" face="Microsoft Yahei, verdana">
<div><span style="font-size: 12px; line-height: 19px;">$ mpiexec -ckpointlib blcr -ckpoint-prefix `pwd` -n 2 -ckpoint-num 1</span></div>
<div><br>
</div>
</font></div>
</blockquote>
<div><font color="#555555" face="Microsoft Yahei, verdana">pstree shows that the PMI proxy has started the application process:</font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana"><br>
</font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana">
<div> ├─sshd─┬─3*[sshd───sshd───bash]</div>
<div> │ ├─sshd───sshd───bash───mpiexec───hydra_pmi_proxy───mpiblcr</div>
<div> │ └─sshd───sshd───bash───pstree</div>
<div><br>
</div>
</font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana">
<div style="font-size: 12px; line-height: 19px;">and `ps aux` shows the process is defunct</div>
<div style="font-size: 12px; line-height: 19px;"><br>
</div>
</font></div>
<blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;">
<div><font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">$ ps aux | grep osu_bw</span></font></div>
<div><font color="#555555" face="Microsoft Yahei, verdana">
<div><span style="font-size: 12px; line-height: 19px;">test 15290 0.0 0.0 0 0 ? Z 21:44 0:00 [mpiblcr] &lt;defunct&gt;</span></div>
<div><span style="font-size: 12px; line-height: 19px;"><br>
</span></div>
</font></div>
</blockquote>
<font color="#555555" face="Microsoft Yahei, verdana"><span style="font-size: 12px; line-height: 19px;">I don't know how to diagnose this problem. I also see that someone reported the same problem several years ago
(<a href="http://trac.mpich.org/projects/mpich/ticket/1144">#1144</a>), but no solution was posted.<br>
</span></font>
<div><font color="#555555" face="Microsoft Yahei, verdana">
<div style="font-size: 12px; line-height: 19px;"><br>
</div>
</font></div>
</div>
<br>
<br>
<span title="neteasefooter"><span id="netease_mail_footer"></span></span>_______________________________________________<br>
discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss">https://lists.mpich.org/mailman/listinfo/discuss</a></div>
</blockquote>
</div>
<br>
</div>
</div>
</body>
</html>