[mpich-discuss] MPI_Comm_split and end with Finalize
K. N. Ramachandran
knram06 at gmail.com
Tue Mar 1 09:35:17 CST 2016
Hello Pavan,
I have yet to try the outline you suggested, but it could well be more
efficient. Since the startup process is currently working, I will keep this
in mind in case we need a better implementation.
Thank you for the suggestions.
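So that I have it written down, here is a rough, untested sketch of how I read
the suggested flow on the server side (the service name "meeting-point",
N_CLIENTS and the tag values are just placeholders, and error checking is
omitted):

#include <mpi.h>

#define N_CLIENTS 4  /* placeholder; in our setup N is known ahead of time */

int main(int argc, char **argv)
{
    char server_port[MPI_MAX_PORT_NAME];
    char client1_port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;

    MPI_Init(&argc, &argv);
    MPI_Open_port(MPI_INFO_NULL, server_port);
    MPI_Publish_name("meeting-point", MPI_INFO_NULL, server_port);

    /* Step 1: client 1 connects, opens a new port of its own, and reports it back. */
    MPI_Comm_accept(server_port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    MPI_Recv(client1_port, MPI_MAX_PORT_NAME, MPI_CHAR, 0, 0, inter, MPI_STATUS_IGNORE);
    MPI_Comm_disconnect(&inter);

    /* Step 2: hand client 1's port to each of clients 2..N, then disconnect. */
    for (int i = 1; i < N_CLIENTS; i++) {
        MPI_Comm_accept(server_port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Send(client1_port, MPI_MAX_PORT_NAME, MPI_CHAR, 0, 0, inter);
        MPI_Comm_disconnect(&inter);
    }

    /* Step 3: no client is connected any more, so the server just cleans up. */
    MPI_Unpublish_name("meeting-point", MPI_INFO_NULL, server_port);
    MPI_Close_port(server_port);
    MPI_Finalize();
    return 0;
}

The clients would then do the mirror image: client 1 opens its own port with
MPI_Open_port and sends the port name back over the intercomm, and clients
2..N receive client 1's port name from the server and MPI_Comm_connect to it
to build the communicator amongst themselves.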
On Tue, Mar 1, 2016 at 12:51 AM, Balaji, Pavan <balaji at anl.gov> wrote:
>
> I think you are overcomplicating the problem. If I understand this
> correctly, you only need the server in order to give a common "port" for
> all the clients to connect to. That's fine. But I don't understand why
> you are doing an intercomm merge after connecting. I'd recommend this (for
> N clients):
>
> 1. Client 1 connects to server. Server asks it to open a new port.
> Client 1 does so and tells the server. Server disconnects from client 1.
>
> 2. For each client from 2 to N, client connects to server. Server asks it to
> connect to client 1's new port. Server disconnects from client.
>
> 3. Once it has given this information to all clients, server finalizes.
>
> -- Pavan
>
> > On Feb 27, 2016, at 6:00 PM, K. N. Ramachandran <knram06 at gmail.com> wrote:
> >
> > Hi Pavan,
> >
> > Thank you for the reply. I have presented only a very simplified case of
> > one server and one client, which is why the problem looks strange.
> >
> > The general case is one server acting as a meeting point: N clients join
> > the server, and one intracomm is formed among them all. The server then
> > splits off and terminates, leaving the intracomm so that the clients can
> > work amongst themselves.
> >
> > I had also tried MPI_Comm_disconnect on the server, after calling
> > MPI_Comm_split, but even in that case the server busy-waits for the client
> > at Finalize. The single-server, single-client case was only meant to
> > demonstrate the problem I am facing.
> >
> > Please let me know if you need any more information. Thanks.
> >
> > On Sat, Feb 27, 2016 at 11:59 AM, Balaji, Pavan <balaji at anl.gov> wrote:
> >
> > It's unclear what exactly you are trying to do here. Why are the
> clients connecting to the server and then immediately "splitting off"?
> >
> > Your "split-off" functionality needs to be implemented using
> MPI_Comm_disconnect, not using MPI_Comm_split. Comm_split divides a
> communicator into smaller communicators, but all processes are still very
> much connected. So as long as the server process is connected to the
> client processes, it might still receive messages from the client process
> and thus cannot simply exit. Comm_disconnect, on the other hand,
> disconnects the client processes from the server processes.
> >
> > But then again, I have no idea why you are connecting to the server and
> disconnecting immediately.
> >
> > -- Pavan
> >
> > > On Feb 26, 2016, at 5:31 PM, K. N. Ramachandran <knram06 at gmail.com> wrote:
> > >
> > > Hello all,
> > >
> > > I have recently begun working on a project that uses MPICH-3.2, and I am
> > > trying to resolve an issue where a server process busy-waits at
> > > MPI_Finalize.
> > >
> > > We are trying to create a server process that accepts incoming
> > > connections from a known number of clients (say, N clients), forms a new
> > > communicator amongst everyone (server and clients), and then splits itself
> > > from the group and terminates, so that the clients then work only with
> > > each other.
> > >
> > > For very problem-specific reasons, we cannot do
> > > 'mpiexec -np N (other args)'
> > >
> > > So we have a server that publishes a service name to a nameserver, and
> > > the clients look up the name to join the server. The server and client
> > > processes are started with separate calls to mpiexec: one to start the
> > > server and N more to start the clients.
> > >
> > > The server process busy-waits at the MPI_Finalize call after it splits
> > > from the communicator, and only finishes when all the clients reach
> > > their MPI_Finalize too.
> > >
> > > Consider a simplified case of only one server and one client. The
> simplified pseudocode is:
> > >
> > > Server process:
> > > MPI_Init();
> > > MPI_Open_port(...);
> > > MPI_Publish_name(...); // publish service name to nameserver
> > >
> > > MPI_Comm_accept(...); // accept incoming connections and store into intercomm
> > > MPI_Intercomm_merge(...); // merge new client into intracomm
> > >
> > > // now split the server from the client
> > > MPI_Comm_rank(intracomm, &rank); // rank == 0
> > > MPI_Comm_split(intracomm, (rank == 0), rank, &lonecomm);
> > >
> > > MPI_Finalize(); // busy-waits here until the client's sleep finishes
> > >
> > > Client process: (simplified - assuming only one client is trying to connect)
> > > MPI_Init();
> > > MPI_Lookup_name(...);
> > > MPI_Comm_connect(...);
> > >
> > > // merge
> > > MPI_Intercomm_merge(...); // merge with server
> > >
> > > // get rank and split
> > > MPI_Comm_rank(intracomm, &rank); // rank == 1
> > > MPI_Comm_split(intracomm, (rank == 0), rank, &lonecomm);
> > >
> > > sleep(10); // sleep for 10 seconds - causes the server to busy-wait at MPI_Finalize for the sleep duration
> > >
> > > MPI_Finalize(); // server and client finish here
> > >
> > > So my questions are:
> > >
> > > 1) Is the busy-wait at MPI_Finalize the expected behaviour?
> > >
> > > 2) How can we truly "disconnect" the server, so that it can finish
> > > immediately at MPI_Finalize()? I tried MPI_Comm_disconnect (and also
> > > MPI_Comm_free) on both the server and the client, but that didn't help.
> > >
> > > 3) We don't want to see the server process consuming one core at 100%
> > > while it waits at MPI_Finalize. Are there alternatives other than having
> > > the server process sleep, wake up, and keep polling a client, before
> > > finally calling MPI_Finalize?
> > >
> > > Thank you for any inputs that you can give here.
> > >
> > >
> > > Regards,
> > > K.N.Ramachandran
> >
> > Regards,
> > K.N.Ramachandran
>
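For the record, this is roughly the shape of the disconnect-based teardown we
had tried on the server (a simplified sketch; names and details differ from
our actual code):

#include <mpi.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm intercomm, intracomm;

    MPI_Init(&argc, &argv);
    MPI_Open_port(MPI_INFO_NULL, port);
    MPI_Publish_name("meeting-point", MPI_INFO_NULL, port); /* placeholder service name */

    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
    MPI_Intercomm_merge(intercomm, 0, &intracomm); /* high = 0: server ordered before the client, which merges with high = 1 */

    /* ... exchange whatever setup information the client needs ... */

    MPI_Comm_free(&intracomm);       /* release the merged intracomm */
    MPI_Comm_disconnect(&intercomm); /* sever the connection to the client */

    MPI_Unpublish_name("meeting-point", MPI_INFO_NULL, port);
    MPI_Close_port(port);
    MPI_Finalize(); /* in our tests the server still busy-waited here until the client called MPI_Finalize */
    return 0;
}

Even with this, the server spun at 100% on one core inside MPI_Finalize until
the client finished its sleep, which is what prompted questions 1-3 above.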
Regards,
K.N.Ramachandran