[mpich-discuss] MPI_Comm_split and end with Finalize
K. N. Ramachandran
knram06 at gmail.com
Mon Feb 29 09:00:09 CST 2016
Hello all,
I tried calling MPI_Comm_disconnect instead of MPI_Comm_split.
I tried this on the server side alone, and also on both the server and the
client, but the server still busy-waits at MPI_Finalize. Can anyone offer
further input on this?
It looks like the server process should be able to terminate early, but is
held up by the client, even though they should be disconnected from each
other.
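As an illustration of the connectivity rule in Pavan's reply quoted below, here is a minimal sketch in plain C. It is a toy model, not MPI code: each live communicator is represented as a bitmask of hypothetical process ids (bit 0 for the server, bit 1 for the client), and the function `connected` and the constant `MAX_PROCS` are names invented for this example. It assumes the MPI standard's definition that two processes are connected while any chain of communicators links them, and that MPI_Finalize is collective over connected processes. Whether this accounts for the busy-wait seen with MPICH-3.2 is not established in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not MPI code: each live communicator is a bitmask of the
 * process ids it spans (bit i set = process i is a member). */
#define MAX_PROCS 32

/* Returns true if processes a and b are "connected": some chain of live
 * communicators links them, directly or indirectly. */
bool connected(const unsigned *comms, size_t ncomms, int a, int b)
{
    assert(a >= 0 && a < MAX_PROCS && b >= 0 && b < MAX_PROCS);
    unsigned reach = 1u << a;          /* processes reachable from a so far */
    bool grew = true;
    while (grew) {                     /* expand until a fixed point */
        grew = false;
        for (size_t i = 0; i < ncomms; i++) {
            /* A communicator that touches the reachable set pulls in
             * all of its members. */
            if ((comms[i] & reach) && (comms[i] & ~reach)) {
                reach |= comms[i];
                grew = true;
            }
        }
    }
    return (reach & (1u << b)) != 0;
}
```

With the communicators from the pseudocode, the live set after MPI_Comm_split is 0x3 (the intercomm from the accept call), 0x3 (the merged intracomm), 0x1 and 0x2 (the two lone comms), and connected(..., 0, 1) is true. Removing only one of the two 0x3 entries still leaves it true; it becomes false only when both are gone. In this model, the server would need to release both the intercomm and the merged intracomm before it stops being connected to the client.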
On Sat, Feb 27, 2016 at 7:00 PM, K. N. Ramachandran <knram06 at gmail.com>
wrote:
> Hi Pavan,
>
> Thank you for the reply. I have presented only a very simplified case of
> one server and one client and that is why the problem looks strange.
>
> The general case is one server acting as a meeting point: N clients
> join the server and one intracomm is formed among them all. The server
> then splits off from the intracomm and terminates, leaving the clients
> to work amongst themselves.
>
> I also tried MPI_Comm_disconnect on the server, after calling
> MPI_Comm_split, but even in that case the server busy-waits for the client
> at MPI_Finalize. The single-server, single-client case was only meant to
> demonstrate the problem I am facing.
>
> Please let me know if you might need more information. Thanks.
>
> On Sat, Feb 27, 2016 at 11:59 AM, Balaji, Pavan <balaji at anl.gov> wrote:
>
>>
>> It's unclear what exactly you are trying to do here. Why are the clients
>> connecting to the server and then immediately "splitting off"?
>>
>> Your "split-off" functionality needs to be implemented using
>> MPI_Comm_disconnect, not using MPI_Comm_split. Comm_split divides a
>> communicator into smaller communicators, but all processes are still very
>> much connected. So as long as the server process is connected to the
>> client processes, it might still receive messages from the client process
>> and thus cannot simply exit. Comm_disconnect, on the other hand,
>> disconnects the client processes from the server processes.
>>
>> But then again, I have no idea why you are connecting to the server and
>> disconnecting immediately.
>>
>> -- Pavan
>>
>> > On Feb 26, 2016, at 5:31 PM, K. N. Ramachandran <knram06 at gmail.com>
>> wrote:
>> >
>> > Hello all,
>> >
>> > I have recently begun working on a project that uses MPICH-3.2 and I am
>> trying to resolve an issue where a server process busy waits at
>> MPI_Finalize.
>> >
>> > We are trying to create a server process that accepts incoming
>> connections from a known number of clients (say, N clients), forms a new
>> communicator amongst everyone (server and clients) and then splits itself
>> from the group and terminates, so that the clients now only work with each
>> other.
>> >
>> > For very problem-specific reasons, we cannot do
>> > 'mpiexec -np N (other args)'
>> >
>> > So we have a server that publishes a service name to a nameserver, and
>> clients look up the name to join the server. The server and client processes
>> are started with separate calls to mpiexec: one call to start the server and
>> the remaining N calls to start the clients.
>> >
>> > The server process busy-waits at the MPI_Finalize call, after it splits
>> from the communicator and only finishes when all other clients reach their
>> MPI_Finalize too.
>> >
>> > Consider a simplified case of only one server and one client. The
>> simplified pseudocode is:
>> >
>> > Server process:
>> > MPI_Init();
>> > MPI_Open_port(...);
>> > MPI_Publish_name(...); //publish service name to nameserver
>> >
>> > MPI_Comm_accept(...); // accept incoming connections and store into intercomm
>> > MPI_Intercomm_merge(...); // merge new client into intra-comm
>> >
>> > // now split the server from the client
>> > MPI_Comm_rank(intra comm, rank); // rank=0
>> > MPI_Comm_split(intra comm, (rank==0), rank, lone comm);
>> >
>> > MPI_Finalize(); // busy-waits here till client's sleep duration
>> >
>> > Client process: (simplified - assuming only one client is trying to
>> connect)
>> > MPI_Init();
>> > MPI_Lookup_name(..);
>> > MPI_Comm_connect(...);
>> >
>> > // merge
>> > MPI_Intercomm_merge(...); // merge with server
>> >
>> > // get rank and split
>> > MPI_Comm_rank(intra comm, rank); // rank=1
>> > MPI_Comm_split(intra comm, rank==0, rank, lone comm);
>> >
>> > sleep(10); // sleep for 10 seconds; the server busy-waits at MPI_Finalize
>> > // for this duration
>> >
>> > MPI_Finalize(); // server and client finish here
>> >
>> > So my questions are:
>> >
>> > 1) Is busy-wait at MPI_Finalize the expected behaviour?
>> >
>> > 2) How can we truly "disconnect" the server, so that it can end immediately
>> at MPI_Finalize()? I tried MPI_Comm_disconnect (and also MPI_Comm_free) on
>> both the server and client, but that didn't help.
>> >
>> > 3) We don't want the server process to consume one core at 100%
>> while it waits at MPI_Finalize. Are there alternatives other than making
>> the server process sleep, wake up, and keep polling a client before it
>> finally calls MPI_Finalize?
>> >
>> > Thank you for any inputs that you can give here.
>> >
>> >
>> > Regards,
>> > K.N.Ramachandran
>> > _______________________________________________
>> > discuss mailing list discuss at mpich.org
>> > To manage subscription options or unsubscribe:
>> > https://lists.mpich.org/mailman/listinfo/discuss
>>
>>
>
>
>
> Regards,
> K.N.Ramachandran
>
Regards,
K.N.Ramachandran