[mpich-discuss] MPICH interact with RTDA

Shuwei Zhao shuweizhao1991 at gmail.com
Mon Oct 29 12:09:32 CDT 2018


Hi Hui,

Thanks for your response. I tried the manual mode on a simple hello-world
MPI application, and it works as expected.
You read my mind. I have two follow-up questions. First, as you said, how
can we detect launch failures?
Second, apart from the launching part, will everything else (fault
tolerance, recovery, etc.) behave the same under manual mode as under the
other modes? Is there nothing special about manual mode?
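
To make the launch-failure question concrete: one coarse option, which assumes nothing about hydra internals, is to run mpiexec under a watchdog deadline so that a proxy which never connects back cannot leave it waiting forever. Below is a minimal Python sketch. The helper name `run_with_deadline` and the deadline values are hypothetical, and the deadline covers the whole run, so it cannot tell a hung launch from a job that is simply slow.

```python
import subprocess
import sys


def run_with_deadline(cmd, deadline_s):
    """Run cmd; return its exit code, or None if it exceeded deadline_s seconds."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=deadline_s)
    except subprocess.TimeoutExpired:
        # Deadline passed: assume the launch (or the job) is stuck and stop it.
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return None


if __name__ == "__main__":
    # Stand-in for a launcher that never returns; in practice cmd would be
    # the mpiexec command line (hypothetical usage, not tested against hydra).
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_deadline(hung, 1.0))
```

A return value of None would then be treated as a suspected launch failure. Killing mpiexec this way presumably does not clean up proxies that were started by hand, so those would need their own handling.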

Thanks
Shuwei

On Fri, Oct 26, 2018 at 11:29 Zhou, Hui <zhouh at anl.gov> wrote:

> In fact there is a manual mode in hydra. Try this:
>
>     mpiexec -n 2 -launcher=manual -host=A,B date
>
> It will print out the launch command lines and wait (very patiently). Then
> if you paste the proxy command line on your local host or any host on your
> network, it will continue.
>
> I guess those are all the pieces you need to create your own wrapper
> scripts for a custom environment.
>
> Hui
>
> On Oct 26, 2018, at 8:42 AM, Zhou, Hui <zhouh at anl.gov> wrote:
>
> Having a manual mode of launching proxies sounds very interesting. I don’t
> think hydra currently supports it, though.
>
> The manually launched proxies need to establish communication back to
> mpiexec.hydra, and I suspect the mechanism already exists. I guess the next
> question is how to reliably detect launch failures other than having
> mpiexec.hydra hang forever, though maybe that is a valid option.
>
> Hui Zhou
>
> On Oct 23, 2018, at 11:44 PM, Shuwei Zhao <shuweizhao1991 at gmail.com>
> wrote:
>
> Hi,
>
> I'm trying to integrate MPICH with a network submission tool called
> RTDA (RunTime Design Automation - www.rtda.com). I'm using mpich-3.2.1a,
> but it looks like mpich doesn't detect the resource manager and launcher,
> so it cannot distribute jobs as expected.
>
> Does mpich support interacting with the RTDA resource manager and launcher?
>
> The hydra process manager has tight integration with SGE, LSF, SLURM,
> PBS, etc. For a platform it doesn't support, is there a way to submit
> jobs with loose integration?
> (Loose integration means that we do the job submission ourselves and run
> hydra_pmi_proxy on each allocated node manually, instead of mpich doing
> everything under the hood.)
>
> Thank you very much,
> Shuwei
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss