[mpich-discuss] How does MPICH spawn processes, and is there a way to configure it?

Scott O'Malley scottyomalley at gmail.com
Mon Jun 2 16:29:51 CDT 2014


After extensive research, it turns out that my slave nodes weren't
configured correctly. Only the master node was able to access the slaves, so
when mpiexec told a slave to spawn a new process, an error was thrown. The
error didn't appear with the original parameter I was passing because that
parameter forced the master node to distribute all tasks. It was an odd one
to track down, but hopefully anyone else who hits this problem can find this
information here.
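
For anyone wanting to check their own cluster for the same problem, here is a
minimal sketch in Python (not part of MPICH or Hydra) that tests whether every
node can open a non-interactive SSH session to every other node, not just
whether the master can reach the slaves. The function names
pairwise_ssh_checks and run_checks are my own, and the host names in the usage
note are hypothetical. It assumes the machine running the script can already
SSH to every node. BatchMode=yes is a standard OpenSSH option that makes ssh
fail instead of prompting for a password or for confirmation of an unknown
host key.

```python
import itertools
import subprocess


def pairwise_ssh_checks(hosts):
    """Return (src, dst, argv) for every ordered pair of distinct hosts.

    Each argv logs in to src from the local machine and, from there,
    tries to log in to dst and run `true`.
    """
    checks = []
    for src, dst in itertools.permutations(hosts, 2):
        argv = ["ssh", "-o", "BatchMode=yes", src,
                "ssh", "-o", "BatchMode=yes", dst, "true"]
        checks.append((src, dst, argv))
    return checks


def run_checks(hosts, timeout=30):
    """Run every check and return the (src, dst) pairs that failed."""
    failed = []
    for src, dst, argv in pairwise_ssh_checks(hosts):
        try:
            result = subprocess.run(argv, capture_output=True, timeout=timeout)
            ok = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            # ssh is missing locally, or the connection hung.
            ok = False
        if not ok:
            failed.append((src, dst))
    return failed
```

Calling run_checks(["master", "slave1", "slave2", "slave3"]) with your own
host names returns the list of (source, destination) pairs that could not
connect. In my case every pair starting from a slave would have shown up in
that list.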


On 2 June 2014 22:09, Lu, Huiwei <huiweilu at mcs.anl.gov> wrote:

> I am not sure what tree mode is or why you need to disable it. In general,
> users do not need to specify which mode Hydra should use to run MPI. You
> can simply run it using: mpiexec -f hosts -n 4 ./app, as explained in
> http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Quick_Start
>
> To understand your failed run, can you give us your host file and
> the command line you use to launch your application?
>
> Huiwei
>
> On May 29, 2014, at 9:16 AM, Scott O'Malley <scottyomalley at gmail.com> wrote:
>
> > I've looked through the Hydra docs but I haven't found a flag or
> > anything else that would let me disable spawning in tree mode, if MPICH
> > even does that.
> >
> > I was able to more or less work around the error using OpenMPI by
> > specifying the "-mca plm_rsh_no_tree_spawn" flag.
> >
> >
> > On 29 May 2014 14:54, Lu, Huiwei <huiweilu at mcs.anl.gov> wrote:
> > Hi Scott,
> >
> > Check
> > https://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager
> > to see if it is what you want.
> >
> > Best,
> > —
> > Huiwei
> >
> > On May 29, 2014, at 6:37 AM, Scott O'Malley <scottyomalley at gmail.com> wrote:
> >
> > > I have recently built a cluster of Raspberry Pi computers. When it
> > > was made up of a master node and 2 slave nodes, there was no issue
> > > calling mpiexec across the cluster. When I add an additional node,
> > > bringing the size to 1 master and 3 slave nodes, I get an error of
> > > "Host File Verification Failed". I've tested password-less SSH on all
> > > nodes and it's working correctly.
> > >
> > > I've googled the error and my setup, and the only mention of it I've
> > > found is at this link, which relates to OpenMPI -
> > > http://www.open-mpi.org/community/lists/users/2013/11/22940.php
> > >
> > > Is there a similar setup for MPICH, and if so, can someone link me to
> > > the correct documentation page?
> > >
> > > Cheers
> > >
> > > Scott
> > > _______________________________________________
> > > discuss mailing list     discuss at mpich.org
> > > To manage subscription options or unsubscribe:
> > > https://lists.mpich.org/mailman/listinfo/discuss
> >
> >
> > --
> > - Scott
>
>



-- 
- Scott