[mpich-discuss] mpich hangs

Gustavo Correa gus at ldeo.columbia.edu
Thu Jun 27 23:45:54 CDT 2013


Although these may not really be MPI or MPICH suggestions, here they go.
Check whether you are running out of memory while HPL is running.
The maximum problem size (N) you can solve depends on the memory size.
There are many HPL formulas (and even calculators) on the web for N(RAM).
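
As a rough sketch of the kind of N(RAM) estimate mentioned above: HPL
factors a dense N x N matrix of 8-byte doubles, so N is bounded by roughly
sqrt(usable_bytes / 8). The 80% usable fraction, the function name
hpl_max_n, and the two-node 1 GiB example are assumptions made for
illustration; they are not taken from this thread.

```python
import math


def hpl_max_n(total_ram_bytes, fraction=0.8, nb=None):
    """Estimate the largest HPL problem size N that fits in memory.

    total_ram_bytes: RAM summed over all nodes taking part in the run.
    fraction: share of RAM given to the matrix (0.8 is an assumed rule of
              thumb; the rest is left for the OS and MPI buffers).
    nb: optional HPL block size; if given, N is rounded down to a multiple.
    """
    # The matrix holds N * N double-precision values, 8 bytes each.
    n = int(math.sqrt(fraction * total_ram_bytes / 8))
    if nb:
        n -= n % nb  # round down to a multiple of the block size NB
    return n


if __name__ == "__main__":
    # Hypothetical cluster: two nodes with 1 GiB of RAM each.
    total = 2 * 1024**3
    print("N estimate:", hpl_max_n(total))
    print("N estimate, multiple of NB=128:", hpl_max_n(total, nb=128))
```

If the N in HPL.dat is well above such an estimate, the nodes are likely
to swap or have processes killed, which can look like a communication
failure from the other ranks' point of view.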
Also, set the stack size limit to a large number (or to unlimited).
Most Linux distributions come with a low default value.
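
A minimal sketch for inspecting the stack limit, assuming a Linux node
with Python available. In an interactive shell the usual check is
"ulimit -s", and "ulimit -s unlimited" raises it; the helper names below
are hypothetical. The limit has to be in effect on every node where MPI
processes start, not only on the node where mpiexec is launched.

```python
import resource


def describe_stack_limit():
    """Return the (soft, hard) stack limits as human-readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    return fmt(soft), fmt(hard)


def raise_stack_limit():
    """Raise the soft stack limit to the hard limit.

    This affects the current process and any children it starts later;
    going beyond the hard limit would require extra privileges.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    print("stack limit (soft, hard):", describe_stack_limit())
```

Running this through the same launcher used for HPL shows what limit the
remote processes actually see, which may differ from the login shell.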
I hope this helps,
Gus Correa

On Jun 27, 2013, at 11:49 PM, Syed. Jahanzeb Maqbool Hashmi wrote:

> Yes I agree. Thanks for the help :)
> 
> On Friday, June 28, 2013, Jeff Hammond wrote:
> If CPI runs and your code doesn't, it's an app issue. You said this was HPL? Ask UTK for support with this. It's their code. HPL is dirt simple so I guess you are running it incorrectly. 
> 
> Jeff
> 
> Sent from my iPhone
> 
> On Jun 27, 2013, at 10:36 PM, "Syed. Jahanzeb Maqbool Hashmi" <jahanzeb.maqbool at gmail.com> wrote:
> 
>> and here is that output:
>> 
>> Process 0 of 8 is on weiser1
>> Process 1 of 8 is on weiser1
>> Process 2 of 8 is on weiser1
>> Process 3 of 8 is on weiser1
>> Process 4 of 8 is on weiser2
>> Process 5 of 8 is on weiser2
>> Process 6 of 8 is on weiser2
>> Process 7 of 8 is on weiser2
>> pi is approximately 3.1415926544231247, Error is 0.0000000008333316
>> wall clock time = 0.018203
>> 
>> ---------------
>> 
>> 
>> On Fri, Jun 28, 2013 at 12:35 PM, Syed. Jahanzeb Maqbool Hashmi <jahanzeb.maqbool at gmail.com> wrote:
>> Yes I am successfully able to run cpi program. No such error at all. 
>> 
>> 
>> 
>> On Fri, Jun 28, 2013 at 12:31 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>> Can you run the cpi program?  If that doesn't run, something is wrong,
>> because that program is trivial and correct.
>> 
>> Jeff
>> 
>> On Thu, Jun 27, 2013 at 10:29 PM, Syed. Jahanzeb Maqbool Hashmi
>> <jahanzeb.maqbool at gmail.com> wrote:
>> > again that same error:
>> > Fatal error in PMPI_Wait: A process has failed, error stack:
>> > PMPI_Wait(180)............: MPI_Wait(request=0xbebb9a1c, status=0xbebb99f0) failed
>> > MPIR_Wait_impl(77)........:
>> > dequeue_and_set_error(888): Communication error with rank 4
>> >
>> > here is the verbose output:
>> >
>> > --------------START------------------
>> >
>> > host: weiser1
>> > host: weiser2
>> >
>> > ==================================================================================================
>> > mpiexec options:
>> > ----------------
>> >   Base path: /mnt/nfs/install/mpich-install/bin/
>> >   Launcher: (null)
>> >   Debug level: 1
>> >   Enable X: -1
>> >
>> >   Global environment:
>> >   -------------------
>> >     TERM=xterm
>> >     SHELL=/bin/bash
>> >     XDG_SESSION_COOKIE=218a1dd8e20ea6d6ec61475b00000019-1372384778.679329-1845893422
>> >     SSH_CLIENT=192.168.0.3 57311 22
>> >     OLDPWD=/mnt/nfs/jahanzeb/bench/hpl/hpl-2.1
>> >     SSH_TTY=/dev/pts/0
>> >     USER=linaro
>> >     LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
>> >     LD_LIBRARY_PATH=:/mnt/nfs/install/mpich-install/lib
>> >     MAIL=/var/mail/linaro
>> >     PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/mnt/nfs/install/mpich-install/bin
>> >     PWD=/mnt/nfs/jahanzeb/bench/hpl/hpl-2.1/bin/armv7-a
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss



