[mpich-discuss] how to dynamically schedule process in mpich

Shuwei Zhao shuweizhao1991 at gmail.com
Sun Nov 18 22:33:48 CST 2018


Hi, Hui,

Thanks a lot for your comments.
1. By "dynamic scheduling of cpu cores" I mean that, after I have already
submitted an MPI application to the farm, I want to add or remove CPUs
while the application is running. For example, I use qsub to submit a
4-process application to the farm on 4 different machines, and during the
run I want to use 4 more processes to make the application run faster. The
application can spawn 4 more processes as required, and the old and new
processes can be either in the same communicator or in parent-child
communicators. Conversely, if I want to reduce the number of processes from
4 to 2 during the run, the application will terminate 2 of the 4 processes
and keep the other 2 running.
2. About the MPI_Info object: I tried mpi_comm_spawn on a single machine
and it works, but I don't know how to spawn processes across machines. For
example, if I have 4 processes on machine 1 and I want to spawn 4 more
processes on other machines (resources managed by qsub), what do I need to
set in the MPI_Info object?

Thank you very much,
Shuwei

On Wed, Nov 14, 2018 at 1:42 PM Zhou, Hui <zhouh at anl.gov> wrote:

> Hi Shuwei,
>
> See my comments/questions between the quotes below:
>
>
> On Nov 13, 2018, at 12:33 AM, Shuwei Zhao <shuweizhao1991 at gmail.com>
> wrote:
>
> Hi, Hui,
>
> Thanks for your response. I have already read through the docs you
> mentioned; they are very useful.
> Although I did some research, I still have some questions.
> IMHO, there are 2 possible ways to do dynamic scheduling of cpu cores:
>
>
> Could you clarify what you mean by “dynamic scheduling of cpu cores”?
>
> 1. using mpi_comm_spawn to spawn a new hydra_pmi_proxy on a different port
> or using mpi_comm_spawn_multiple to spawn new processes on the same port
>
>
> The port (the one shown on the hydra_pmi_proxy command line) is an
> implementation detail of the communication between proxy and control; it
> has nothing to do with how the process is spawned or on which host.  For
> MPICH, each host (node) will have one hydra_pmi_proxy running (spawning one
> or more executables).
>
> If you only have a single program (executable) to run on the spawned
> nodes, then you only need to use `mpi_comm_spawn`.
> `mpi_comm_spawn_multiple` is used when you need to run multiple different
> executables.
>
> To control which host the process is spawned on, you pass that information
> through the MPI_Info object.
>
> But my question is how to start the new processes on a remote machine
> when using a farm resource manager (qsub, bsub, etc.). *Is there a reserved
> keyword in MPI_Info for this?*
>
>
> Use the MPI_Info argument, which is a set of key/value hints. In
> particular, you may set `host` for where to launch the new process, or
> `hostfile` for a file which provides a list of hosts. You may also set
> `wdir` to specify the working directory for the new process. Without the
> hints, the default is the information you passed on the command line or
> that was set by the resource manager.
>
> 2. using mpi_comm_connect to connect an already started mpi application
> to another mpi application, but I haven't found any useful test cases yet.
> Do you know if this approach is possible?
>
>
> Those are a set of routines to establish new communicators. You only need
> to use them if the existing communicators (MPI_COMM_WORLD and the comms
> returned from mpi_comm_spawn and mpi_comm_get_parent) are insufficient.
> They are independent of spawning processes.
>
> Thanks,
> Shuwei
>