<div dir="ltr"><span style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">Hi,</span><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">I am doing a bit of experimentation with the goal of getting MPI to run on top of Apache YARN. I know that a few others have written here looking for help with mpich2-yarn and the strangely unreleased Hamster project on the Hadoop JIRA. I'm not interested in those things. </div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">I am writing this note to document my progress so far and to get some confirmation that what I am doing is considered a "supported" mode of operation.</div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">For context, within YARN, a Java-based process called a "YARN Application Master" submits requests for resources to the YARN ResourceManager and launches "YARN Containers" via its own process launcher. There can be many AppMasters, and each of them may do different things. </div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">As a proof of concept, I want to make a given Application Master request N containers and start the individual MPI processes within each of them. 
</div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">I'm used to, well, honestly, never dealing with any of this: I show up at some cluster where SLURM (or whatever) already exists, and all I need to do is write code, compile it, and 'qsub.' So all of this is a learning experience.</div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">Looking at Hydra, it seems that the intention is for Hydra to start processes, and oddly (and surprisingly to me) it is designed to _ask_ an RM for resources, with different logic for each RM. There is not a huge amount of documentation here, so I was largely flying blind. I was expecting that an RM just starts processes on machines, and that wire-up just happens via some set of environment variables and shell commands, and perhaps black magic.</div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">After some googling, I had a private discussion with Jeff Hammond, who pointed me to the -launcher manual flag for mpirun. 
<br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px">By issuing: </div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(33,33,33)"><div style="font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"></div><ol style="font-family:"liberation sans","lucida grande","luxi sans","bitstream vera sans",helvetica,verdana,arial,sans-serif;font-size:12px"><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">[rlewis@skynet03 build]$ mpirun -np 2 -launcher manual -hosts skynet01,skynet02 a.out</div></li></ol><div><span style="font-family:"helvetica neue",helvetica,arial,sans-serif">I was able to get these two </span><span style="font-family:"helvetica neue",helvetica,arial,sans-serif">hydra_pmi_proxy command lines. After I ran them on the two machines, my MPI program seemed to execute more or less normally. 
</span><font face="monospace"><span style="font-size:12px"><br></span></font></div><ol style="font-family:"liberation sans","lucida grande","luxi sans","bitstream vera sans",helvetica,verdana,arial,sans-serif;font-size:12px"><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">HYDRA_LAUNCH: /usr/lib64/<span class="inbox-inbox-lG" style="background-color:rgba(251,246,167,0.498039);outline:transparent dashed 1px">mpich</span>/bin/hydra_pmi_proxy --control-port skynet03:58584 --rmk user --launcher manual --demux poll --pgid 0 --retries 10 --usize -2 --proxy-id 0</div></li><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">HYDRA_LAUNCH: /usr/lib64/<span class="inbox-inbox-lG" style="background-color:rgba(251,246,167,0.498039);outline:transparent dashed 1px">mpich</span>/bin/hydra_pmi_proxy --control-port skynet03:58584 --rmk user --launcher manual --demux poll --pgid 0 --retries 10 --usize -2 --proxy-id 1</div></li><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">HYDRA_LAUNCH_END</div></li></ol><span style="font-family:"helvetica neue",helvetica,arial,sans-serif">However, using MPI_Send, I would then see this occur:</span><br></div><div style="color:rgb(33,33,33);font-family:"helvetica neue",helvetica,arial,sans-serif;font-size:13px"><ol style="font-family:"liberation sans","lucida grande","luxi sans","bitstream vera 
sans",helvetica,verdana,arial,sans-serif;font-size:12px"><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">Fatal error in MPI_Send: A process has failed, error stack:</div></li><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">MPI_Send(171)..............: MPI_Send(buf=0x7fff2e38b044, count=1, MPI_INT, dest=1, tag=0, MPI_COMM_WORLD) failed</div></li><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">MPID_nem_tcp_connpoll(1833): Communication error with rank 1: Connection refused</div></li><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top"> </div></li><li style="vertical-align:top"><div style="font-stretch:normal;font-size:1em;line-height:1.2em;font-family:monospace;margin:0px;padding:0px;background-image:none;background-position:initial;background-repeat:initial;background-color:initial;vertical-align:top">=====================================================================</div></li></ol><p>When I add the option `-disable-hostname-propagation` to the mpirun command, the program appears to work. 
I'm not exactly sure if this is an accident.</p><p>However, assuming that this is all I need, it seems that each YARN container essentially needs to execute one of these command lines, each with its own --proxy-id:</p><p> <span style="font-family:monospace;font-size:12px">/usr/lib64/<span class="inbox-inbox-lG" style="background-color:rgba(251,246,167,0.498039);outline:transparent dashed 1px">mpich</span>/bin/hydra_pmi_proxy --control-port skynet03:58584 --rmk user --launcher manual --demux poll --pgid 0 --retries 10 --usize -2 --proxy-id 0</span></p><p>The containers can get these command lines from the MPI control process (mpirun), started on the machine that runs the YARN Application Master. </p><p>And then they will all just work. Is this accurate? Is this a "supported" mode of operation? This certainly is an extremely easy way to get MPI to run on top of YARN, with zero code changes necessary to the<span class="inbox-inbox-Apple-converted-space"> </span><span class="inbox-inbox-lG" style="background-color:rgba(251,246,167,0.498039);outline:transparent dashed 1px">MPICH</span><span class="inbox-inbox-Apple-converted-space"> </span>codebase. I'm not sure how portable (across MPI implementations) this is, but for now I don't care.</p><p>Best,</p><p>-rhl</p></div></div>
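<p>P.S. To make the idea concrete, here is a rough sketch of the capture step: collect the proxy command lines that `mpirun -launcher manual` prints, so that each one can be handed to a container. It is only an illustration in Python (a real Application Master would be Java), and it assumes the `HYDRA_LAUNCH:` / `HYDRA_LAUNCH_END` output format shown above is what mpirun always prints, which I have not confirmed. The names `parse_hydra_launch` and `proxy_id` are made up for this example; they are not part of MPICH or YARN.</p>
<pre>
```python
import shlex

# Markers copied from the mpirun output quoted in this message.
LAUNCH_PREFIX = "HYDRA_LAUNCH: "
LAUNCH_END = "HYDRA_LAUNCH_END"


def parse_hydra_launch(lines):
    """Return one argv list per proxy command printed by mpirun.

    Hypothetical helper: reads lines until HYDRA_LAUNCH_END and keeps
    only the text that follows each HYDRA_LAUNCH: prefix.
    """
    commands = []
    for raw in lines:
        line = raw.strip()
        if line == LAUNCH_END:
            break
        if line.startswith(LAUNCH_PREFIX):
            # shlex.split turns the command line into an argv list.
            commands.append(shlex.split(line[len(LAUNCH_PREFIX):]))
    return commands


def proxy_id(argv):
    """Return the integer that follows --proxy-id in one proxy command."""
    return int(argv[argv.index("--proxy-id") + 1])


# The two command lines from the run above, used as sample input.
PROXY = ("/usr/lib64/mpich/bin/hydra_pmi_proxy --control-port skynet03:58584 "
         "--rmk user --launcher manual --demux poll --pgid 0 --retries 10 "
         "--usize -2 --proxy-id ")
sample_output = [
    LAUNCH_PREFIX + PROXY + "0",
    LAUNCH_PREFIX + PROXY + "1",
    LAUNCH_END,
]

commands = parse_hydra_launch(sample_output)
for argv in commands:
    # Assumption: container i would run the command with --proxy-id i.
    print("container", proxy_id(argv), "runs:", " ".join(argv))
```
</pre>
<p>In practice the lines would come from the output of the running mpirun process rather than from a hard-coded list, and mpirun would have to stay alive while the containers start, since the proxies connect back to its control port.</p>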