[mpich-discuss] mpich2 - checkpointing error
Marcelo Paiva Ramos
marcelo.paiva at cptec.inpe.br
Mon Apr 7 06:26:37 CDT 2014
Hi,
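Could you help me solve the following problem? Checkpointing with BLCR fails under MPICH 3.1, even though the BLCR kernel modules are loaded. The details of my setup follow.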
cat /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m
uname -a
Linux server 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 19:59:55 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
*INSTALL: blcr-0.8.5*
tar xzvf blcr-0.8.5.tar.gz
cd blcr-0.8.5
mkdir builddir
cd builddir
../configure --prefix=/opt/blcr
make
make install
/sbin/insmod /opt/blcr/lib/blcr/2.6.32-431.11.2.el6.x86_64/blcr_imports.ko
/sbin/insmod /opt/blcr/lib/blcr/2.6.32-431.11.2.el6.x86_64/blcr.ko
uname -r
2.6.32-431.11.2.el6.x86_64
lsmod | grep blcr
blcr 115465 0
blcr_imports 10715 1 blcr
ldconfig -p | grep blcr
libcr_run.so.0 (libc6,x86-64) => /opt/blcr/lib/libcr_run.so.0
libcr_run.so (libc6,x86-64) => /opt/blcr/lib/libcr_run.so
libcr_omit.so.0 (libc6,x86-64) => /opt/blcr/lib/libcr_omit.so.0
libcr_omit.so (libc6,x86-64) => /opt/blcr/lib/libcr_omit.so
libcr.so.0 (libc6,x86-64) => /opt/blcr/lib/libcr.so.0
libcr.so (libc6,x86-64) => /opt/blcr/lib/libcr.so
chkconfig --list | grep blcr
blcr 0:off 1:off 2:on 3:on 4:on 5:on 6:off
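The module check above can also be scripted. The following is only a minimal sketch: `blcr_modules_loaded` is a hypothetical helper (it is not part of BLCR or MPICH) that reads `lsmod`-style text on standard input and succeeds only if both `blcr` and `blcr_imports` appear as module names.

```shell
# Hypothetical helper, not shipped with BLCR or MPICH.
# Reads `lsmod`-style text on stdin and returns 0 only if both
# BLCR modules (blcr and blcr_imports) appear in the first column.
blcr_modules_loaded() {
    awk '
        $1 == "blcr"         { core = 1 }
        $1 == "blcr_imports" { imports = 1 }
        END { exit !(core && imports) }
    '
}

# Example use on a live system:
#   lsmod | blcr_modules_loaded && echo "BLCR modules loaded"
```

Matching on the first column avoids a false positive from the "used by" column, where `blcr` also appears on the `blcr_imports` line.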
*INSTALL: mpich-3.1*
tar xzvf mpich-3.1.tar.gz
cd mpich-3.1
./configure --disable-fast CFLAGS=-O2 FFLAGS=-O2 CXXFLAGS=-O2 \
    FCFLAGS=-O2 --prefix=/opt/mpich2/ CC=/opt/intel/bin/icc \
    FC=/opt/intel/bin/ifort F77=/opt/intel/bin/ifort --enable-checkpointing \
    --with-hydra-ckpointlib=blcr --with-blcr=/opt/blcr \
    --with-blcr-include=/opt/blcr/include --with-blcr-lib=/opt/blcr/lib
make
make install
*.bashrc*
export PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/blcr/bin:/opt/mpich2/bin
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64:/opt/intel/lib/intel64:/opt/blcr/lib:/opt/mpich2/lib
*mpiexec -info*
HYDRA build details:
Version: 3.1
Release Date: Thu Feb 20 11:41:13 CST 2014
CC: /opt/intel/bin/icc -O2
CXX: g++ -O2
F77: /opt/intel/bin/ifort -O2
F90: /opt/intel/bin/ifort -O2
Configure options: '--disable-option-checking'
'--prefix=/opt/mpich2' '--disable-fast' 'CFLAGS=-O2 -O0' 'FFLAGS=-O2
-O0' 'CXXFLAGS=-O2 ' 'FCFLAGS=-O2 ' 'CC=/opt/intel/bin/icc'
'FC=/opt/intel/bin/ifort' 'F77=/opt/intel/bin/ifort'
'--enable-checkpointing' '--with-hydra-ckpointlib=blcr'
'--with-blcr=/opt/blcr' '--with-blcr-include=/opt/blcr/include'
'--with-blcr-lib=/opt/blcr/lib' '--cache-file=/dev/null' '--srcdir=.'
'LDFLAGS= -L/opt/blcr/lib' 'LIBS=-lrt -lcr -lpthread ' 'CPPFLAGS=
-I/root/mpich-3.1/src/mpl/include -I/root/mpich-3.1/src/mpl/include
-I/root/mpich-3.1/src/openpa/src -I/root/mpich-3.1/src/openpa/src
-I/root/mpich-3.1/src/mpi/romio/include -I/opt/blcr/include'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf
sge pbs manual persist
Topology libraries available: hwloc
Resource management kernels available: user slurm ll lsf sge pbs
cobalt
Checkpointing libraries available: blcr
Demux engines available: poll select
*ERROR*
mpiexec -n 1 -ckpointlib blcr -ckpoint-interval 20 \
    -ckpoint-prefix /home/marcelo/TESTE/ ./teste
[proxy:0:0@server] requesting checkpoint
[proxy:0:0@server] checkpoint completed
[proxy:0:0@server] HYDT_ckpoint_blcr_checkpoint (tools/ckpoint/blcr/ckpoint_blcr.c:241): Checkpointing failed. Make sure BLCR kernel module is loaded. Unknown error 2356
[proxy:0:0@server] ckpoint_thread (tools/ckpoint/ckpoint.c:76): blcr checkpoint returned error
[proxy:0:0@server] requesting checkpoint
[proxy:0:0@server] checkpoint completed
[proxy:0:0@server] HYDT_ckpoint_blcr_checkpoint (tools/ckpoint/blcr/ckpoint_blcr.c:241): Checkpointing failed. Make sure BLCR kernel module is loaded. Unknown error 2356
[proxy:0:0@server] ckpoint_thread (tools/ckpoint/ckpoint.c:76): blcr checkpoint returned error
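One thing that can be ruled out quickly is the directory passed to -ckpoint-prefix. The sketch below is an assumption about what is worth checking, not a known cause of the error above: `ckpoint_prefix_ok` is a hypothetical helper that only tests whether the given path is an existing directory writable by the current user.

```shell
# Hypothetical helper, not part of MPICH or BLCR.
# Succeeds only if the given path is an existing directory
# that the current user can write to.
ckpoint_prefix_ok() {
    dir=$1
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Example use with the prefix from the mpiexec command above:
#   ckpoint_prefix_ok /home/marcelo/TESTE/ || echo "prefix not usable"
```

If the helper reports the prefix as usable, the directory itself is unlikely to be the problem and the failure lies elsewhere.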
Best regards,
Marcelo.