[mpich-discuss] MPI_Comm_split and end with Finalize

Oden, Lena loden at anl.gov
Mon Feb 29 18:04:38 CST 2016


A few more points from my side on MPI_Comm_disconnect:

1.) MPI_Comm_disconnect is a collective operation, so it must be called by all processes in the communicator.
2.) You have to disconnect all communicators your server is connected to.

As far as I understand your problem: if you have connected multiple clients and the server into one intra-communicator,
you should first call MPI_Comm_split to get a communicator without the server.

However, you still have to disconnect the old communicator (on all processes), called "intra comm" in your example.

Besides, you should also disconnect all inter-communicators created with MPI_Comm_connect and MPI_Comm_accept.

Lena


On Feb 29, 2016, at 9:00 AM, K. N. Ramachandran <knram06 at gmail.com> wrote:

Hello all,

I tried just calling MPI_Comm_disconnect instead of MPI_Comm_split.

I tried this on the server side only, and also on both the server and the client, but I still see the busy-wait at MPI_Finalize on the server side. Can anyone offer further input on this?

It looks like the server process should be able to terminate early, but it is held up by the client, even though they should be disconnected from each other.



On Sat, Feb 27, 2016 at 7:00 PM, K. N. Ramachandran <knram06 at gmail.com> wrote:
Hi Pavan,

Thank you for the reply. I have presented only a very simplified case of one server and one client and that is why the problem looks strange.

In the general case, one server acts as a meeting point: N clients join the server and a single intra-communicator is formed among them all. The server then splits off and terminates, leaving the intra-communicator so that the clients can work among themselves.

I also tried MPI_Comm_disconnect on the server after calling MPI_Comm_split, but even in that case the server busy-waits for the client at MPI_Finalize. The single server and single client were only meant to demonstrate the problem I am facing.

Please let me know if you might need more information. Thanks.

On Sat, Feb 27, 2016 at 11:59 AM, Balaji, Pavan <balaji at anl.gov> wrote:

It's unclear what exactly you are trying to do here.  Why are the clients connecting to the server and then immediately "splitting off"?

Your "split-off" functionality needs to be implemented using MPI_Comm_disconnect, not using MPI_Comm_split.  Comm_split divides a communicator into smaller communicators, but all processes are still very much connected.  So as long as the server process is connected to the client processes, it might still receive messages from the client process and thus cannot simply exit.  Comm_disconnect, on the other hand, disconnects the client processes from the server processes.

But then again, I have no idea why you are connecting to the server and disconnecting immediately.

  -- Pavan

> On Feb 26, 2016, at 5:31 PM, K. N. Ramachandran <knram06 at gmail.com> wrote:
>
> Hello all,
>
> I have recently begun working on a project that uses MPICH-3.2 and I am trying to resolve an issue where a server process busy waits at MPI_Finalize.
>
> We are trying to create a server process that accepts incoming connections from a known number of clients (say, N clients), forms a new communicator amongst everyone (server and clients) and then splits itself from the group and terminates, so that the clients now only work with each other.
>
> For very problem-specific reasons, we cannot do
> 'mpiexec -np N (other args)'
>
> So we have a server that publishes a service name to a nameserver, and clients look up the name to join the server. The server and client processes are started with separate calls to mpiexec: one to start the server and the remaining N calls to start the clients.
>
> The server process busy-waits at the MPI_Finalize call, after it splits from the communicator and only finishes when all other clients reach their MPI_Finalize too.
>
> Consider a simplified case of only one server and one client. The simplified pseudocode is:
>
> Server process:
> MPI_Init();
> MPI_Open_port(...);
> MPI_Publish_name(...); //publish service name to nameserver
>
> MPI_Comm_accept(...); // accept incoming connections and store into intercomm
> MPI_Intercomm_merge(...);  // merge new client into intra-comm
>
> // now split the server from the client
> MPI_Comm_rank(intra comm, rank); // rank=0
> MPI_Comm_split(intra comm, (rank==0), rank, lone comm);
>
> MPI_Finalize(); // busy-waits here till client's sleep duration
>
> Client process: (simplified - assuming only one client is trying to connect)
> MPI_Init();
> MPI_Lookup_name(...);
> MPI_Comm_connect(...);
>
> // merge
> MPI_Intercomm_merge(...); // merge with server
>
> // get rank and split
> MPI_Comm_rank(intra comm, rank);  // rank=1
> MPI_Comm_split(intra comm, rank==0, rank, lone comm);
>
> sleep(10); // sleep for 10 seconds - causes server to busy wait at MPI_Finalize for sleep duration
>
> MPI_Finalize(); // server and client finish here
>
> So my questions are:
>
> 1) Is busy-wait at MPI_Finalize the expected behaviour?
>
> 2) How can we truly "disconnect" the server, so that it can end immediately at MPI_Finalize()? I had tried MPI_Comm_disconnect (also MPI_Comm_free) on both the server and client, but that didn't help.
>
> 3) We don't want to see the server process consuming one core at 100% while it waits at MPI_Finalize. Are there other alternatives, apart from making the server process sleep, wake up, and keep polling a client before finally calling MPI_Finalize?
>
> Thank you for any inputs that you can give here.
>
>
> Regards,
> K.N.Ramachandran
> _______________________________________________
> discuss mailing list     discuss at mpich.org<mailto:discuss at mpich.org>
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss

