<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div>Can you run dmesg on the node that hosts rank 1, the process killed by signal 9, after you execute your application? The reason the process was killed, e.g., out of memory, should appear near the end of the dmesg output.</div>
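<div>A minimal sketch of that check, assuming a Linux kernel whose out-of-memory messages use the usual wording. The helper name find_oom_lines and the sample log lines are illustrative only; they are not output from this cluster.</div>
<pre>
```shell
#!/bin/sh
# Keep only kernel log lines that usually accompany an OOM kill.
# The patterns are an assumption about typical Linux kernel wording,
# not an exhaustive list.
find_oom_lines() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# On the node that hosted rank 1 (reading the kernel log may need root):
#   dmesg | find_oom_lines | tail -n 20

# Illustrative input only; a.out and PID 4242 are made up.
sample='eth0: link up
a.out invoked oom-killer: gfp_mask=0x201da, order=0
Out of memory: Kill process 4242 (a.out) score 905 or sacrifice child
Killed process 4242 (a.out)'

# Prints the three OOM-related lines and drops the unrelated one.
printf '%s\n' "$sample" | find_oom_lines
```
</pre>
<div>If the filter prints nothing on the real node, the kill may have come from somewhere other than the kernel OOM killer, so the full tail of dmesg is still worth reading.</div>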
<div><br>
</div>
<div> Sangmin</div>
<div><br>
</div>
<br>
<div>
<div>On Sep 14, 2014, at 12:37 PM, Abhishek Bhat <<a href="mailto:abhat@trinityconsultants.com">abhat@trinityconsultants.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div lang="EN-US" link="blue" vlink="purple" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<div class="WordSection1" style="page: WordSection1;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Because the application works for less intensive runs and fails for more intensive ones, it is likely that the application is requesting too many resources. When and where should I run ulimit -a and
dmesg: after I get the error? If resources are the problem, is there any way to change the MPI environment to raise the limits so that the larger runs can be accommodated?<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">If I run ulimit -a in a new terminal, here is what I get:<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">core file size (blocks, -c) 0<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">data seg size (kbytes, -d) unlimited<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">scheduling priority (-e) 0<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">file size (blocks, -f) unlimited<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">pending signals (-i) 250598<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">max locked memory (kbytes, -l) 64<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">max memory size (kbytes, -m) unlimited<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">open files (-n) 1024<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">pipe size (512 bytes, -p) 8<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">POSIX message queues (bytes, -q) 819200<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">real-time priority (-r) 0<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">stack size (kbytes, -s) 10240<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">cpu time (seconds, -t) unlimited<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">max user processes (-u) 1024<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">virtual memory (kbytes, -v) unlimited<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">file locks (-x) unlimited<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">In my job, I try to set the stack size to unlimited, but I guess it is not working.<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Let me know. Thank you for all the help.<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Abhishek<o:p></o:p></span></div>
<div>
<div style="margin: 0in 0in 0.0001pt 0.75pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">
.<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt 0.75pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<b><span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">Abhishek Bhat, PhD, EPI,<br>
</span></b><span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">Senior Consultant<o:p></o:p></span></div>
<div style="margin: 0in 0in 0.0001pt 0.75pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(0, 64, 128);"> </span></div>
</div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span></div>
<div>
<div style="border-style: solid none none; border-top-color: rgb(225, 225, 225); border-top-width: 1pt; padding: 3pt 0in 0in;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<b><span style="font-size: 11pt; font-family: Calibri, sans-serif;">From:</span></b><span style="font-size: 11pt; font-family: Calibri, sans-serif;"><span class="Apple-converted-space"> </span>Seo, Sangmin [<a href="mailto:sseo@anl.gov" style="color: purple; text-decoration: underline;">mailto:sseo@anl.gov</a>]<span class="Apple-converted-space"> </span><br>
<b>Sent:</b><span class="Apple-converted-space"> </span>Sunday, September 14, 2014 11:16 AM<br>
<b>To:</b><span class="Apple-converted-space"> </span><<a href="mailto:discuss@mpich.org" style="color: purple; text-decoration: underline;">discuss@mpich.org</a>><br>
<b>Subject:</b><span class="Apple-converted-space"> </span>Re: [mpich-discuss] Error Running MPICH for Photochemical Modeling<o:p></o:p></span></div>
</div>
</div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p> </o:p></div>
<div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Abhishek,<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p> </o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Signal 9 (SIGKILL) can have many causes, e.g., exceeding a CPU time limit or running out of memory, but most often it means the application requested too many resources. You can check the limits in your environment with ulimit -a, and you may find more information about the error in the output of dmesg.<o:p></o:p></div>
</div>
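<div>Limits seen in an interactive terminal can differ from the ones inherited by the processes the launcher starts on each node, so one way to check is to run a tiny script through mpiexec itself. A hedged sketch: the script name limits.sh and the helper print_limits are hypothetical, and the commented mpiexec line simply reuses the options from the job file quoted in this thread.</div>
<pre>
```shell
#!/bin/sh
# Print the limits this process actually inherited, tagged with the node
# name so that output from several nodes can be told apart.
print_limits() {
    printf '%s stack_kb=%s vmem_kb=%s open_files=%s\n' \
        "$(uname -n)" "$(ulimit -s)" "$(ulimit -v)" "$(ulimit -n)"
}

print_limits

# Hypothetical usage: save this as limits.sh somewhere every node can read,
# then launch it the same way as the real application:
#   mpiexec -machinefile nodes -np 8 sh ./limits.sh
```
</pre>
<div>Note that in a csh job script the builtin for changing limits is limit (for example, limit stacksize unlimited), not ulimit. Whether a value set in the job script reaches ranks on remote nodes depends on how the process manager starts them, which is what the per-node output above is meant to reveal.</div>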
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p> </o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Thanks,<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Sangmin<o:p></o:p></div>
</div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p> </o:p></div>
</div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p> </o:p></div>
<div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
On Sep 12, 2014, at 5:51 PM, Abhishek Bhat <<a href="mailto:abhat@trinityconsultants.com" style="color: purple; text-decoration: underline;">abhat@trinityconsultants.com</a>> wrote:<o:p></o:p></div>
</div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<br>
<br>
<o:p></o:p></div>
<blockquote style="margin-top: 5pt; margin-bottom: 5pt;">
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Sangmin.</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">I updated to MPICH 3 and am getting the following error:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Fatal error in MPI_Recv: A process has failed, error stack:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">MPI_Recv(187).............: MPI_Recv(buf=0x7fff93840c30, count=644490, MPI_REAL, src=1, tag=14131, MPI_COMM_WORLD, status=0x7fff94444f20) failed</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dequeue_and_set_error(865): Communication error with rank 1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">rank 1 in job 1 dfw-camx_55000 caused collective abort of all ranks</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> exit status of rank 1: killed by signal 9</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">The situation is the same: runs succeed for less resource-intensive cases and with up to 7 processes, and fail with more than 7. Here is the MPICH command I am using in my job file:
</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">cat << ieof > nodes</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx-n1:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx-n2:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx-n3:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx-n4:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx-n5:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx-n6:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">dfw-camx-n7:1</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">ieof</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">set NUMPROCS =<span class="apple-converted-space"> </span><span style="color: red;">8</span></span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">set RING = `wc -l nodes | awk '{print $1}'`</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">mpdboot -n $RING -f nodes verbose</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">if( ! { mpiexec -machinefile nodes -np $NUMPROCS $EXEC } ) then</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> mpdallexit</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> exit</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">endif</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">For a successful run, NUMPROCS has to be 7 or fewer.</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Any help is much appreciated.</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Thank You</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Abhishek</span><o:p></o:p></div>
</div>
<div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">
.</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<b><span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">Abhishek Bhat, PhD, EPI,<br>
</span></b><span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">Senior Consultant</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(0, 64, 128);"> </span><o:p></o:p></div>
</div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="border-style: solid none none; border-top-color: rgb(225, 225, 225); border-top-width: 1pt; padding: 3pt 0in 0in;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<b><span style="font-size: 11pt; font-family: Calibri, sans-serif;">From:</span></b><span class="apple-converted-space"><span style="font-size: 11pt; font-family: Calibri, sans-serif;"> </span></span><span style="font-size: 11pt; font-family: Calibri, sans-serif;">Seo,
Sangmin [<a href="mailto:sseo@anl.gov" style="color: purple; text-decoration: underline;">mailto:sseo@anl.gov</a>]<span class="apple-converted-space"> </span><br>
<b>Sent:</b><span class="apple-converted-space"> </span>Friday, September 12, 2014 1:11 PM<br>
<b>To:</b><span class="apple-converted-space"> </span><<a href="mailto:discuss@mpich.org" style="color: purple; text-decoration: underline;">discuss@mpich.org</a>><br>
<b>Subject:</b><span class="apple-converted-space"> </span>Re: [mpich-discuss] Error Running MPICH for Photochemical Modeling</span><o:p></o:p></div>
</div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Hi Abhishek,<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Can you try with the recent MPICH release to see if the same error happens? You can download the recent release, 3.1.2, from <a href="http://www.mpich.org/downloads/" style="color: purple; text-decoration: underline;"><span style="color: purple;">http://www.mpich.org/downloads/</span></a>.<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Thanks,<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
Sangmin<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p></o:p></div>
</div>
<div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p></o:p></div>
</div>
<div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
On Sep 12, 2014, at 12:59 PM, Abhishek Bhat <<a href="mailto:abhat@trinityconsultants.com" style="color: purple; text-decoration: underline;"><span style="color: purple;">abhat@trinityconsultants.com</span></a>> wrote:<o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<br>
<br>
<br>
<o:p></o:p></div>
</div>
<blockquote style="margin-top: 5pt; margin-bottom: 5pt;">
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">I am running a photochemical model on a Linux cluster (CentOS, 64-bit) with 1 master and 8 slave nodes, each with a quad-core Intel i7. I have two scenarios. In the first, I run
a less data-intensive case on all 8 nodes (NUMPROCS = 9) and the run goes fine. When I run the same configuration for a more intensive case, I get the following error:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Fatal error in MPI_Recv: Other MPI error, error stack:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">MPI_Recv(187).....................: MPI_Recv(buf=0x7fff989d53b0, count=644490, MPI_REAL, src=1, tag=14131, MPI_COMM_WORLD, status=0x7fff995d96a0) failed</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">MPIDI_CH3I_Progress(150)..........:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">MPID_nem_mpich2_blocking_recv(948):</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">MPID_nem_tcp_connpoll(1720).......:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">state_commrdy_handler(1556).......:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">MPID_nem_tcp_recv_handler(1446)...: socket closed</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">rank 1 in job 1 dfw-camx_55000 caused collective abort of all ranks</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> exit status of rank 1: killed by signal 9</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">If I run the program with fewer processes (NUMPROCS smaller than 7), the run goes fine.</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">It appears that rank 1 (my first node) is causing the collective abort of all ranks, but I could not identify why. I tried the following solutions:</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div style="margin-left: 0.5in;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif; text-indent: -0.25in;">
<span style="font-size: 11pt; font-family: Cambria, serif;">1.</span><span style="font-size: 7pt;"> <span class="apple-converted-space"> </span></span><span style="font-size: 11pt; font-family: Cambria, serif;">Increased master memory to 32 gb</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.5in;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif; text-indent: -0.25in;">
<span style="font-size: 11pt; font-family: Cambria, serif;">2.</span><span style="font-size: 7pt;"> <span class="apple-converted-space"> </span></span><span style="font-size: 11pt; font-family: Cambria, serif;">Increased all nodes memory to 32 gb</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.5in;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif; text-indent: -0.25in;">
<span style="font-size: 11pt; font-family: Cambria, serif;">3.</span><span style="font-size: 7pt;">     <span class="apple-converted-space"> </span></span><span style="font-size: 11pt; font-family: Cambria, serif;">Moved rank 1 to a different node in
the parallel run.</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">In all situations, I get this error. Surprisingly, when I run smaller (less data-intensive) runs, I do not get this error even if I increase NUMPROCS to 32 processes.</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Any help will be highly appreciated.</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">I am running mpich 1.4</span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Thank You<br>
Abhishek</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">
.</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<b><span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">Abhishek Bhat, PhD, EPI,<br>
</span></b><span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">Senior Consultant</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);"> </span><o:p></o:p></div>
</div>
<div style="margin-left: 0.7pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<b><span style="font-size: 11pt; font-family: Cambria, serif; color: rgb(0, 64, 128);">Trinity Consultants</span></b><o:p></o:p></div>
</div>
<p class="MsoNormal" style="margin: 0in 0in 6pt 0.7pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">12770 Merit Drive, Suite 900 | Dallas, Texas 75251</span><o:p></o:p></p>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Office: <span class="apple-converted-space"> </span><b><span style="color: rgb(194, 0, 0);">972-661-8100</span></b> | Mobile: 806-281-7617</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Email: <span class="apple-converted-space"> </span><a href="mailto:abhat@trinityconsultants.com" style="color: purple; text-decoration: underline;"><span style="color: rgb(5, 99, 193);">abhat@trinityconsultants.com</span></a><u><span style="color: rgb(0, 64, 128);"> </span></u> |
LinkedIn: <a href="http://www.linkedin.com/in/abhattrinityconsultants" style="color: purple; text-decoration: underline;"><span style="color: rgb(5, 99, 193);">www.linkedin.com/in/abhattrinityconsultants</span></a></span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Stay current on environmental issues. <span class="apple-converted-space"> </span><a href="http://www.trinityconsultants.com/Subscribe/" style="color: purple; text-decoration: underline;"><span style="color: rgb(0, 64, 128);">Subscribe</span></a><span class="apple-converted-space"> </span>today
to receive Trinity's free<span class="apple-converted-space"> </span><a href="http://www.trinityconsultants.com/EnvironmentalQuarterly/" style="color: purple; text-decoration: underline;"><i><span style="color: rgb(0, 64, 128);">Environmental Quarterly</span></i></a>.</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;">Learn about Trinity's<span class="apple-converted-space"> </span><a href="http://www.trinityconsultants.com/Training/" style="color: purple; text-decoration: underline;"><span style="color: rgb(0, 64, 128);">courses</span></a><span class="apple-converted-space"> </span>for
environmental professionals.</span><o:p></o:p></div>
</div>
<div style="margin-left: 0.75pt;">
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Cambria, serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 11pt; font-family: Calibri, sans-serif;"> </span><o:p></o:p></div>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<span style="font-size: 9pt; font-family: Helvetica, sans-serif;"><br>
_________________________________________________________________________<br>
<br>
The information transmitted is intended only for the person or entity to<br>
which it is addressed and may contain confidential and/or privileged<br>
material. Any review, retransmission, dissemination or other use of, or<br>
taking of any action in reliance upon, this information by persons or<br>
entities other than the intended recipient is prohibited. If you received<br>
this in error, please contact the sender and delete the material from any<br>
computer.<br>
_________________________________________________________________________<br>
_______________________________________________<br>
discuss mailing list <a href="mailto:discuss@mpich.org" style="color: purple; text-decoration: underline;"><span style="color: rgb(149, 79, 114);">discuss@mpich.org</span></a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss" style="color: purple; text-decoration: underline;"><span style="color: rgb(149, 79, 114);">https://lists.mpich.org/mailman/listinfo/discuss</span></a></span><o:p></o:p></div>
</div>
</blockquote>
</div>
<div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p></o:p></div>
</div>
</div>
</blockquote>
</div>
<div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">
<o:p> </o:p></div>
</div>
</div>
</blockquote>
</div>
<br>
</body>
</html>