[mpich-discuss] Slave hosts panic after mpirun
Vaibhav Rekhate
rekhatevm11.comp at coep.ac.in
Sat Mar 28 03:09:36 CDT 2015
I'll try to capture a photo next time. Since a screenshot won't work, I'll have to photograph the display with a camera.
Regards,
Vaibhav Rekhate
College of Engineering, Pune
On Fri, Mar 27, 2015 at 9:10 AM -0700, "Kenneth Raffenetti" <raffenet at mcs.anl.gov> wrote:
Are you able to capture the debug or panic output from the nodes before
they reboot? It is difficult to diagnose the issue without that information.
On 03/27/2015 10:52 AM, Vaibhav Rekhate wrote:
> I have set up a cluster on Ubuntu 12.04 machines (using the steps
> outlined here: https://help.ubuntu.com/community/MpichCluster)
>
> The PATH environment variable is set. The code is compiled and launched
> on the master node. After the code runs, the slave nodes crash: debug
> output on their displays indicates a kernel panic, and they start
> rebooting. The slaves have to be restarted. This is not reliably
> reproducible; it happens only some of the time.
>
>
> Regards,
> Vaibhav Rekhate
> B. Tech, Computer Engineering
> College of Engineering, Pune
> ------------------------------------------------------------------------
> *From:* Huiwei Lu <huiweilu at mcs.anl.gov>
> *Sent:* 27 March 2015 07:41 PM
> *To:* discuss at mpich.org
> *Subject:* Re: [mpich-discuss] Slave hosts panic after mpirun
> Hi Vaibhav,
>
> Can you give us some details of how the slaves panic?
>
> --
> Huiwei Lu
>
> On Fri, Mar 27, 2015 at 2:19 AM, Vaibhav Rekhate
> <rekhatevm11.comp at coep.ac.in> wrote:
>
> Hello there!
> I have set up a small cluster (only 3 nodes: 1 master, 2 slaves).
> I was trying to run a sample program by using this command:
>
> $MPI_INSTALL_DIR/bin/mpirun -n 9 --machinefile machinefile ./a.out
>
>
> Contents of machine file:
>
> host1:3
> host2:3
> host3:3
>
>
> Please help.
>
> Regards,
> Vaibhav Rekhate
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>
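
As a quick sanity check on the quoted command, the following Python sketch sums the per-host process counts in a machinefile and compares the total with the value passed to -n. It assumes the simple host:count layout shown in the quoted message. The function names are hypothetical, and treating "#" as a comment marker and a missing count as 1 are assumptions made for illustration. It only confirms that the launch request matches the machinefile; it says nothing about the cause of the kernel panic.

```python
def parse_machinefile(text):
    """Return a list of (host, count) pairs from 'host[:count]' lines."""
    entries = []
    for raw in text.splitlines():
        # Assumption: anything after '#' is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        host, _, count = line.partition(":")
        # Assumption: a host listed without a count gets one process.
        entries.append((host.strip(), int(count) if count.strip() else 1))
    return entries


def total_slots(entries):
    """Sum the per-host process counts."""
    return sum(count for _, count in entries)


# Machinefile contents as quoted in the thread.
MACHINEFILE = """\
host1:3
host2:3
host3:3
"""

entries = parse_machinefile(MACHINEFILE)
requested = 9  # the value given to 'mpirun -n' in the quoted command

if total_slots(entries) == requested:
    print("machinefile provides exactly the requested number of processes")
else:
    print("mismatch:", total_slots(entries), "listed vs", requested, "requested")
```

With the three hosts at three processes each, the listed total equals the nine processes requested, so the command line and the machinefile are at least consistent with each other.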