[mpich-discuss] Client hangs if server dies in dynamic process management

Roy, Hirak Hirak_Roy at mentor.com
Tue Nov 18 12:49:36 CST 2014


Hi Huiwei,


1. Did you start your nameserver?

2. Did the server program crash?

I see the same hang (MPI_Finalize does not complete in the client).

Here is my command line:

> hydra_nameserver &
> mpiexec -n 1 -nameserver <hostname> ./server
> mpiexec -n 1 -nameserver <hostname> ./client



MPICH Version:                 3.2a2
MPICH Release date:       Sun Nov 16 11:09:31 CST 2014
MPICH Device:                  ch3:sock
MPICH configure:             --prefix /home/hroy/local//mpich-3.2a2/linux_x86_64 --disable-f77 --disable-fc --disable-f90modules --disable-cxx --enable-fast=nochkmsg --enable-fast=notiming --enable-fast=ndebug --enable-fast=O3 --with-device=ch3:sock --enable-g=dbg --disable-fortran --without-valgrind CFLAGS=-O3 -fPIC CXXFLAGS=-O3 -fPIC
MPICH CC:          /u/prod/gnu/gcc/20121129/gcc-4.5.0-linux_x86_64/bin/gcc -O3 -fPIC   -g -O3
MPICH CXX:        no -O3 -fPIC  -g
MPICH F77:         no   -g
MPICH FC:           no   -g


Thanks,
Hirak

________________________________

Could you try with the latest mpich-3.2a2?

The client exits successfully on my MacBook with the sock channel.

--
Huiwei



> On Nov 16, 2014, at 10:44 PM, Hirak Roy <hirak_roy at mentor.com> wrote:
>
> Hi All,
>
> Here is my sample program. I am using the sock channel of mpich-3.0.4.
>
> I am running it as
> > mpiexec -n 1 ./server.out
> > mpiexec -n 1 ./client.out
>
> My client program (client.c) hangs in MPI_Finalize.
> There is an assert in server.c where the server exits.
>
> The client has no way to detect that.
> Even if we detect it with some timeout strategy, the client still hangs in the finalize step.
> Could you please suggest what is going wrong here? Or is this a bug in the sock channel?
>
> Thanks,
> Hirak
> <client.c><server.c>
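
For readers following along, the "timeout strategy" mentioned in the quoted message can be sketched in plain C as a bounded polling loop. This is only an illustration and not the attached client.c: the names poll_fn, wait_with_timeout, and count_down are hypothetical, and count_down stands in for a callback that, in a real client, would presumably call MPI_Test on a pending MPI_Irecv request. The sketch only bounds the wait for a message; it does not by itself avoid the MPI_Finalize hang described in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns true once the awaited event happened.
 * In an MPI client this could wrap MPI_Test on a pending request (assumption). */
typedef bool (*poll_fn)(void *ctx);

/* Current time in seconds from a monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Poll until poll(ctx) returns true or timeout_s seconds have elapsed.
 * Returns true if the event was observed, false on timeout. */
static bool wait_with_timeout(poll_fn poll, void *ctx, double timeout_s)
{
    const struct timespec nap = { 0, 1000000 }; /* sleep 1 ms between polls */
    const double deadline = now_seconds() + timeout_s;

    assert(poll != NULL);

    do {
        if (poll(ctx))
            return true;
        nanosleep(&nap, NULL);
    } while (now_seconds() < deadline);

    return false;
}

/* Stand-in poll function for illustration: ctx points to an int counter,
 * and the "event" happens once the counter has been decremented to zero. */
static bool count_down(void *ctx)
{
    int *remaining = (int *)ctx;
    if (*remaining > 0)
        --*remaining;
    return *remaining == 0;
}
```

A caller would pass its own poll function and treat a false return as "the peer is probably gone", then decide how to shut down.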
