[mpich-discuss] Slave hosts panic after mpirun

Kenneth Raffenetti raffenet at mcs.anl.gov
Fri Mar 27 11:10:52 CDT 2015


Are you able to capture the debug or panic output from the nodes before 
they reboot? It is difficult to diagnose the issue without that information.
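
As a rough sketch of one way to look for that output once a node is back up: the snippet below searches saved kernel logs for common panic signatures. It assumes the logs live in /var/log/kern.log* (the usual rsyslog location on Ubuntu, but check your setup), and `find_panic_lines` is a hypothetical helper name, not part of MPICH or Ubuntu.

```shell
#!/bin/sh
# Sketch only: search saved kernel logs for panic/oops traces after a reboot.
# Assumption: kernel messages are logged to /var/log/kern.log* on the nodes.

# find_panic_lines is a hypothetical helper name used for illustration.
find_panic_lines() {
    # -H prints the file name, -n the line number, -E enables alternation.
    grep -H -n -E 'Kernel panic|Oops|BUG:|Call Trace' "$@"
}

# Example use on a slave node (reading the logs may require sudo):
#   find_panic_lines /var/log/kern.log /var/log/kern.log.1
```

Panic messages often never reach the disk, so an empty result does not rule out a panic. In that case a network console (the kernel's netconsole module), a serial console, or simply a photo of the screen may be the only record.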

On 03/27/2015 10:52 AM, Vaibhav Rekhate wrote:
> I have set up a cluster on Ubuntu 12.04 machines (using the steps
> outlined here: https://help.ubuntu.com/community/MpichCluster)
>
> The PATH environment variable is set. The code is compiled and run on
> the master node. After the run, the slave nodes crash: they start
> rebooting, and debug messages on the display indicate a kernel panic.
> The slaves then have to be restarted. This is not reliably reproducible;
> it happens only some of the time.
>
>
> Regards,
> Vaibhav Rekhate
> B. Tech, Computer Engineering
> College of Engineering, Pune
> ------------------------------------------------------------------------
> *From:* Huiwei Lu <huiweilu at mcs.anl.gov>
> *Sent:* 27 March 2015 07:41 PM
> *To:* discuss at mpich.org
> *Subject:* Re: [mpich-discuss] Slave hosts panic after mpirun
> Hi Vaibhav,
>
> Can you give us some details of how the slaves panic?
>
> --
> Huiwei Lu
>
> On Fri, Mar 27, 2015 at 2:19 AM, Vaibhav Rekhate
> <rekhatevm11.comp at coep.ac.in> wrote:
>
>     Hello there!
>     I have set up a small cluster (only 3 nodes - 1 master, 2 slaves).
>     I was trying to run a sample program by using this command:
>
>          $MPI_INSTALL_DIR/bin/mpirun -n 9 --machinefile machinefile ./a.out
>
>
>     Contents of machine file:
>
>          host1:3
>          host2:3
>          host3:3
>
>
>     Please help.
>
>     Regards,
>     Vaibhav Rekhate
>     _______________________________________________
>     discuss mailing list     discuss at mpich.org
>     To manage subscription options or unsubscribe:
>     https://lists.mpich.org/mailman/listinfo/discuss

