[mpich-discuss] Debugging / Stopping on error

Joseph Schuchart schuchart at hlrs.de
Mon Apr 23 02:25:37 CDT 2018


Florian,

I just came across your question, and it seems you haven't received 
an answer yet, so I thought I'd throw in my 2 cents. I usually use a 
combination of xterm and gdb for local debugging, which starts each 
rank under its own gdb in a separate terminal window:

mpirun -n 4 xterm -e gdb -ex r --args ./app -arg1 ...

To break at MPI errors you have to add a breakpoint for the 
implementation-specific abort routine. IIRC in MPICH that is 
MPID_Abort (it's been a while since I last used MPICH for debugging, though).
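
A minimal sketch of both steps combined, assuming MPID_Abort really is 
the routine your MPICH build goes through on abort (this is from memory 
and not checked against a current MPICH). The helper name mpi_gdb and 
the ABORT_SYMBOL variable are made up for illustration. 
"set breakpoint pending on" tells gdb to accept the breakpoint even if 
the MPI library's symbols are not loaded yet when the command is read.

```shell
# Hypothetical helper: run every rank under gdb in its own xterm and stop
# in the abort routine. MPID_Abort is an assumption (see above); override
# it via ABORT_SYMBOL if your MPI library uses a different routine.
mpi_gdb() {
    nprocs=$1
    shift
    mpirun -n "$nprocs" xterm -e gdb \
        -ex 'set breakpoint pending on' \
        -ex "break ${ABORT_SYMBOL:-MPID_Abort}" \
        -ex run \
        --args "$@"
}

# Usage (not executed here): mpi_gdb 4 ./app -arg1
```

Once a rank stops at the breakpoint, typing bt in that rank's xterm 
prints the backtrace up into the application code.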

HTH,
Joseph

On 04/12/2018 10:05 AM, Florian Lindner wrote:
> Hello,
> 
> I am using mpich on Arch on my own desktop computer. Unfortunately, I don't have Totalview or DDT at hand.
> 
> OpenMPI has a useful option to pause on an MPI error:
> 
> mpirun --mca opal_abort_delay 600 -n 4 ...
> 
> so that I can attach with a debugger and backtrace up into my app. Is there something like that for mpich?
> 
> What other recommendations could you give for debugging MPI applications (not mpich itself), especially for MPI Ports? I
> have re-compiled with --enable-g=all.
> 
>   setenv MPICH_DBG FILE
>   setenv MPICH_DBG_LEVEL VERBOSE
> 
> works, but the output is too detailed, even when $MPICH_DBG_TYPICAL is defined, and too far removed from my
> application code.
> 
> Thanks,
> Florian
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
> 


-- 
Dipl.-Inf. Joseph Schuchart
High Performance Computing Center Stuttgart (HLRS)
Nobelstr. 19
D-70569 Stuttgart

Tel.: +49(0)711-68565890
Fax: +49(0)711-6856832
E-Mail: schuchart at hlrs.de

