[mpich-discuss] MPICH interact with RTDA

Zhou, Hui zhouh at anl.gov
Mon Oct 29 13:36:18 CDT 2018


Hi Shuwei,

Once the proxy is started, it communicates with the control process (mpiexec), and the whole PMI interface works as usual, i.e. nothing special.

In manual mode, detecting launch failures is your responsibility (as the name implies). Your custom solution needs to detect launch failures and then either re-launch or abort. If you re-launch, hydra won’t care and will function the same, as long as the proxy eventually starts running. If you decide to abort, then I guess you can simply kill hydra.
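
As a rough illustration of the detect / re-launch / abort logic, here is a minimal Python sketch. Nothing in it is provided by hydra: the names launch_with_retry and abort_job, the retry count, and the grace period are all hypothetical, and the heuristic "the command could not be started, or it exited non-zero within a short grace period" is only an assumption about what a launch failure looks like in your environment.

```python
import subprocess


def launch_with_retry(cmd, max_attempts=3, grace_seconds=1.0):
    """Start cmd, retrying when the launch appears to have failed.

    Assumption (not hydra behaviour): a launch has failed if the command
    cannot be started at all, or if it exits with a non-zero status within
    grace_seconds. Returns the Popen object on success, or None if every
    attempt failed.
    """
    for _attempt in range(max_attempts):
        try:
            proc = subprocess.Popen(cmd)
        except OSError:
            # Executable missing or not runnable: treat as a launch failure.
            continue
        try:
            rc = proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # Still running after the grace period: assume the launch worked.
            return proc
        if rc == 0:
            # Finished cleanly within the grace period.
            return proc
        # Non-zero early exit: fall through and re-launch.
    return None


def abort_job(mpiexec_proc):
    """Give up on the job by terminating the mpiexec (hydra) process."""
    mpiexec_proc.terminate()
    mpiexec_proc.wait()
```

Here cmd would be whatever proxy command line your launcher is asked to run, and mpiexec_proc the Popen handle of the mpiexec you started yourself. If launch_with_retry returns None, calling abort_job(mpiexec_proc) corresponds to the "simply kill hydra" option.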

—
Hui Zhou

On Oct 29, 2018, at 12:09 PM, Shuwei Zhao <shuweizhao1991 at gmail.com> wrote:

Hi Hui,

Thanks for your response. I tried the manual mode on a simple hello-world MPI application, and it works as expected.
You pretty much read my mind. I have two follow-up questions. First, as you mentioned, how can we detect launch failures?
Second, apart from the launching part, is everything else (fault tolerance, recovery, etc.) the same under manual mode as under the other modes, with nothing special about manual mode?

Thanks
Shuwei


