[mpich-discuss] How to terminate MPI_Comm_accept

Roy, Hirak Hirak_Roy at mentor.com
Wed Oct 8 11:42:45 CDT 2014


Thanks, Huiwei, for trying this out.
It looks like this is not allowed; see Pavan's reply.

-Hirak




Hi Hirak,

I can reproduce your error with the attached program using one process:

        mpicc -g -o mpi_comm_accept mpi_comm_accept.c -pthread
        mpiexec -n 1 ./mpi_comm_accept

I found that MPI_Send and MPI_Recv were not using the same communicator, which is why MPI_Recv never receives the message. So either the communicator creation is wrong or the application is wrong.

The standard says MPI_Comm_accept and MPI_Comm_connect are used for "establishing contact between two groups of processes that do not share an existing communicator". In this case, however, threads 1 and 2 belong to the same process, so they already share a communicator and are trying to create a new one on top of it. I don't know whether this is allowed. If it is, MPI_Comm_accept and MPI_Comm_connect should be fixed to support the multithreaded case; if it is not, we may need to change the application to terminate MPI_Comm_accept some other way.

Thanks,


-
Huiwei

On Oct 8, 2014, at 12:14 AM, Roy, Hirak <Hirak_Roy at mentor.com> wrote:

> Hi Pavan,
>
> Here is my code for thread2 :
>
> do {
>     MPI_Comm newComm ;
>     MPI_Comm_accept (m_serverPort.c_str(), MPI_INFO_NULL, 0, MPI_COMM_SELF, &newComm);
>     Log ("Accepted a connection");
>     int buf = 0 ;
>     MPI_Status status ;
>     MPI_Recv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, newComm, &status);
>
>     if (status.MPI_TAG == MPI_MSG_TAG_NEW_CONN) {
>         m_clientComs[m_clientCount] = newComm ;
>         m_clientCount++;
>     } else if (status.MPI_TAG == MPI_MSG_TAG_SHUTDOWN) {
>         Log ("Shutdown");
>         //MPI_Comm_disconnect (&newComm);
>         Log ("Disconnect");
>         break;
>     } else {
>         Log ("Unmatched Receive");
>     }
> } while(1) ;
>
>
> Here is my code for thread1 to terminate thread2 :
>
>   MPI_Comm newComm ;
>   MPI_Comm_connect (m_serverPort.c_str(), MPI_INFO_NULL, 0, MPI_COMM_SELF, &newComm);
>   Log ("Connect to Self");
>   int val = 0 ;
>   MPI_Request req ;
>   MPI_Send(&val, 1, MPI_INT, 0, MPI_MSG_TAG_SHUTDOWN, newComm);
>   Log ("Successful");
>   //MPI_Status stat ;
>   //MPI_Wait(&req, &stat);
>   Log ("Complete");
>
>   //MPI_Comm_disconnect(&newComm);
>
>
>
>
> The MPI_Send/MPI_Recv calls block indefinitely.
> I am using the sock channel.
> With nemesis, I get the following crash:
> Assertion failed in file ./src/mpid/ch3/channels/nemesis/include/mpid_nem_inline.h at line 58: vc_ch->is_local
> internal ABORT - process 0
>
> I tried non-blocking send and receive followed by wait. However, that also does not solve the problem.
>
> Thanks,
> Hirak
>
>
>
> -----
>
> Hirak,
>
> Your approach should work fine.  I'm not sure what issue you are facing.  I assume thread 1 is doing this:
>
> while (1) {
>         MPI_Comm_accept(..);
>         MPI_Recv(.., tag, ..);
>         if (tag == REGULAR_CONNECTION)
>                continue;
>         else if (tag == TERMINATION) {
>                MPI_Send(..);
>                break;
>         }
> }
>
> In this case, all clients do an MPI_Comm_connect and then send a message with tag = REGULAR_CONNECTION.  When thread 2 is done with its work, it'll do an MPI_Comm_connect and then send a message with tag = TERMINATION, wait for a response from thread 1, and call finalize.
>
>   - Pavan
>
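
The tag-dispatch protocol described in the thread (accept a connection, read one message, then keep the connection, stop, or ignore it depending on the tag) can be sketched without any MPI calls. This is only an illustration under stated assumptions: the tag values and the names classify_first_message and run_accept_loop are hypothetical, and the array of tags stands in for the MPI_Comm_accept + MPI_Recv pair. It says nothing about whether a thread may connect to a port opened by its own process, which is the open question above.

```c
/* Hypothetical tag values; the thread does not show the real definitions. */
enum { MSG_TAG_NEW_CONN = 1, MSG_TAG_SHUTDOWN = 2 };

/* What the accept loop does after the first message on a new connection. */
typedef enum { KEEP_CONNECTION, STOP_ACCEPTING, DROP_CONNECTION } accept_action;

/* Map the tag of the first received message to an action. */
accept_action classify_first_message(int tag)
{
    switch (tag) {
    case MSG_TAG_NEW_CONN: return KEEP_CONNECTION;  /* regular client */
    case MSG_TAG_SHUTDOWN: return STOP_ACCEPTING;   /* termination request */
    default:               return DROP_CONNECTION;  /* unmatched receive */
    }
}

/* Drive the loop over a prerecorded sequence of tags. Each element
 * stands in for one accepted connection and its first message.
 * Returns the number of client connections that were kept. */
int run_accept_loop(const int *tags, int n)
{
    int clients = 0;
    for (int i = 0; i < n; i++) {
        accept_action action = classify_first_message(tags[i]);
        if (action == STOP_ACCEPTING)
            break;                /* leave the loop, as the server thread would */
        if (action == KEEP_CONNECTION)
            clients++;            /* the real code stores the communicator here */
    }
    return clients;
}
```

In the real server, the two branches that do not keep the connection would presumably also release the accepted communicator; the code posted above has that call commented out.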
