[mpich-discuss] How to make a non-MPI process to become a MPI process of MPI_COMM_WORLD?
haozi
yidanyiji at 163.com
Sun Jan 25 05:04:14 CST 2015
Hello, Balaji.
The example from OpenMPI is attached.
Thank you very much.
At 2015-01-25 03:16:35, "Balaji, Pavan" <balaji at anl.gov> wrote:
>
>You are missing some files needed to run your program. Also, can you please attach files instead of copy-pasting code into the email?
>
> -- Pavan
>
>> On Jan 24, 2015, at 7:57 AM, haozi <yidanyiji at 163.com> wrote:
>>
>> Hi.
>> The following server/client example is taken from OpenMPI.
>> When I compile and run the example with OpenMPI, everything works.
>> BUT when I compile and run it with MPICH (3.1.3), it BLOCKS, just like my own example!
>>
>> Is this a bug?
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <errno.h>
>> #include <unistd.h>
>> #include <mpi.h>
>>
>> /*
>>
>> LOGIC:
>>
>> - the 'server' opens a port and writes the info to a file
>> - the 'clients' open the file and connect to the port
>> - after each accept, the server and client do a merge to
>> convert the intercomm to an intracomm
>>
>> DETAIL STEPS:
>>
>> - server opens port
>> - server does accept
>> - client #1 does connect
>> - server and client #1 do merge
>> - server does accept
>> - client #2 does connect
>> - server, client #1 and client #2 do merge
>> - server does accept
>> - client #3 does connect
>> - server, client #1, client #2 and client #3 do merge
>>
>> */
>>
>> #define TAG 0
>>
>> #define CHK(code) do \
>> { \
>> int retval = code ; \
>> if (retval != MPI_SUCCESS) \
>> { \
>> fprintf(stderr, "Error: " #code "\n") ; \
>> exit(1) ; \
>> } \
>> } while(0)
>>
>> int main(int argc, char *argv[])
>> {
>> char hostname[255] ;
>> char buff[255] ;
>>
>> int role ;
>> int num_clients ;
>> int size, rank ;
>>
>> FILE *fp ;
>> char server_port_name[MPI_MAX_PORT_NAME] ;
>>
>> MPI_Comm intercomm, intracomm ;
>> MPI_Status status ;
>> int msg_count ;
>> int i ;
>>
>> /* sanity check the args */
>> if(argc != 3)
>> {
>> fprintf(stderr, "usage %s <num clients> <1:server | 0:client>\n", argv[0]) ;
>> exit(1) ;
>> }
>>
>> num_clients = atoi(argv[1]) ;
>> role = atoi(argv[2]) ;
>>
>> if (num_clients <= 0 || (role != 0 && role != 1))
>> {
>> fprintf(stderr, "usage %s <num clients> <1:server | 0:client>\n", argv[0]) ;
>> exit(1) ;
>> }
>>
>> /* initialize MPI */
>> CHK(MPI_Init(&argc, &argv)) ;
>>
>> /* get the node name */
>> {
>> int retval = gethostname(hostname, 255) ;
>> if(retval == -1)
>> {
>> fprintf(stderr, "gethostname failed: %s\n", strerror(errno)) ;
>> exit(1) ;
>> }
>> }
>>
>> /* server */
>> if(role == 1)
>> {
>> printf("SERVER: on node '%s'\n", hostname) ;
>>
>> /* open port to establish connections */
>> CHK(MPI_Open_port(MPI_INFO_NULL, server_port_name)) ;
>>
>> printf("SERVER: opened port=%s\n", server_port_name) ;
>>
>> /* store the port name */
>> fp = fopen("server_port_name.txt", "w") ;
>> if(fp == NULL)
>> {
>> fprintf(stderr, "fopen failed: %s\n", strerror(errno)) ;
>> exit(1) ;
>> }
>> fprintf(fp, "%s", server_port_name) ;
>> fclose(fp) ;
>>
>> /* the server accepts connections from all the clients */
>> for(i = 0 ; i < num_clients ; i++ )
>> {
>> /* accept connections at this port */
>> CHK(MPI_Comm_accept(server_port_name, MPI_INFO_NULL, 0,
>> i == 0 ? MPI_COMM_WORLD : intracomm,
>> &intercomm)) ;
>>
>> printf("SERVER: accepted connection from client %d\n", i+1) ;
>>
>> /* merge, to form one intra communicator */
>> CHK(MPI_Intercomm_merge(intercomm, 0, &intracomm)) ;
>>
>> printf("SERVER: merged with client %d\n", i+1) ;
>>
>> CHK(MPI_Comm_size(intracomm, &size)) ;
>> CHK(MPI_Comm_rank(intracomm, &rank)) ;
>>
>> printf("SERVER: after merging with client %d: size=%d rank=%d\n", i+1, size, rank) ;
>> }
>> } /* end server */
>>
>> /* client */
>> if(role == 0)
>> {
>> printf("CLIENT: on node '%s'\n", hostname) ;
>>
>> fp = fopen("server_port_name.txt", "r") ;
>> if(fp == NULL)
>> {
>> fprintf(stderr, "fopen failed: %s\n", strerror(errno)) ;
>> exit(1) ;
>> }
>> fscanf(fp, "%s", server_port_name) ;
>> fclose(fp) ;
>>
>> printf("CLIENT: attempting to connect to server on port=%s\n", server_port_name) ;
>>
>> /* connect to the server */
>> CHK(MPI_Comm_connect (server_port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &intercomm)) ;
>>
>> printf("CLIENT: connected to server on port\n") ;
>>
>> /* merge the server and client to one intra communicator */
>> CHK(MPI_Intercomm_merge(intercomm, 1, &intracomm)) ;
>>
>> printf("CLIENT: merged with existing intracomm\n") ;
>>
>> CHK(MPI_Comm_size(intracomm, &size)) ;
>> CHK(MPI_Comm_rank(intracomm, &rank)) ;
>>
>> printf("CLIENT: after merging, new comm: size=%d rank=%d\n", size, rank) ;
>>
>> for (i = rank ; i < num_clients ; i++)
>> {
>> /* client performs a collective accept */
>> CHK(MPI_Comm_accept(server_port_name, MPI_INFO_NULL, 0, intracomm, &intercomm)) ;
>>
>> printf("CLIENT: accepted connection from a new client\n") ;
>>
>> /* merge the two intra comms back to one communicator */
>> CHK(MPI_Intercomm_merge(intercomm, 0, &intracomm)) ;
>>
>> printf("CLIENT: merged with existing members\n") ;
>>
>> CHK(MPI_Comm_size(intracomm, &size)) ;
>> CHK(MPI_Comm_rank(intracomm, &rank)) ;
>>
>> printf("CLIENT: new size after merging with existing members: size=%d rank=%d\n", size, rank) ;
>> }
>>
>> } /* end client */
>>
>> CHK(MPI_Comm_size(intracomm, &size)) ;
>> CHK(MPI_Comm_rank(intracomm, &rank)) ;
>>
>> printf("After fusion: size=%d rank=%d\n", size, rank) ;
>>
>> if(rank == 0)
>> {
>> msg_count = num_clients ;
>>
>> while(msg_count)
>> {
>> CHK(MPI_Recv(buff, 255, MPI_CHAR, MPI_ANY_SOURCE,
>> MPI_ANY_TAG, intracomm, &status)) ;
>>
>> printf("Received hello msg from '%s'\n", buff) ;
>> msg_count-- ;
>> }
>> }
>> else
>> {
>> /* all ranks > 0 */
>>
>> CHK(MPI_Send(hostname, strlen(hostname) + 1, MPI_CHAR, 0, TAG, intracomm)) ;
>> }
>>
>> CHK(MPI_Finalize()) ;
>>
>> fprintf(stderr, "Rank %d is exiting\n", rank);
>> return 0 ;
>> }
>>
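For readers following the DETAIL STEPS comment in the example above, here is a small plain-C model of the accept/merge schedule. It makes no MPI calls; `accepts_for` and `merged_size` are hypothetical helper names introduced only for illustration. The model assumes a single-process server (started with `mpiexec -n 1`) and that the existing members always merge with high=0, so that client #k ends up with rank k.

```c
/* Plain-C model of the accept schedule in the example; no MPI calls.
 * Assumption: the server is one process, so client #k gets rank k
 * in the merged intracommunicator. */

/* Number of MPI_Comm_accept calls a process makes in the example.
 * The server accepts once per client. A client whose rank after its own
 * merge is r runs the loop "for (i = r; i < num_clients; i++)". */
int accepts_for(int is_server, int rank_after_merge, int num_clients)
{
    if (is_server)
        return num_clients;
    int remaining = num_clients - rank_after_merge;
    return remaining > 0 ? remaining : 0;
}

/* Size of the merged intracommunicator once clients_joined clients
 * have connected to the single-process server. */
int merged_size(int clients_joined)
{
    return 1 + clients_joined;
}
```

With three clients, the server takes part in three accepts, client #1 in two more, and client #3 in none. Every existing member has to enter each accept, because MPI_Comm_accept is collective over the communicator passed to it.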
>>
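The example hands the port name from the server to the clients through server_port_name.txt and reads it back with an unbounded fscanf("%s"). Below is a minimal sketch of a bounded alternative. The helper names `write_port_name` and `read_port_name` are hypothetical, and `PORT_NAME_MAX` is a stand-in for MPI_MAX_PORT_NAME so that the sketch compiles without an MPI installation.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for MPI_MAX_PORT_NAME (assumption, so no mpi.h is needed). */
#define PORT_NAME_MAX 256

/* Write the port name on a single line. Returns 0 on success, -1 on error. */
int write_port_name(const char *path, const char *port_name)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int ok = fprintf(fp, "%s\n", port_name) >= 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read at most buflen-1 characters of the first line into buf.
 * Returns 0 on success, -1 if the file is missing or empty. */
int read_port_name(const char *path, char *buf, size_t buflen)
{
    if (buf == NULL || buflen == 0)
        return -1;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    char *line = fgets(buf, (int)buflen, fp);
    fclose(fp);
    if (line == NULL)
        return -1;
    buf[strcspn(buf, "\r\n")] = '\0';   /* strip the trailing newline */
    return buf[0] != '\0' ? 0 : -1;
}
```

In the server, write_port_name would follow MPI_Open_port; in a client, read_port_name would precede MPI_Comm_connect. The sketch does not address the race in which a client starts before the server has written the file.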
>> At 2015-01-24 10:43:07, "haozi" <yidanyiji at 163.com> wrote:
>> Thanks, Bland and Lu.
>>
>> You are right.
>> These functions (such as MPI_Comm_accept, MPI_Comm_connect, MPI_Intercomm_merge) can help me to get a new intra-communicator which contains ALL MPI processes.
>>
>> Now, I have a more complicated example:
>> I have a server and a client.
>> After they merge into an intra-communicator by using the connect/accept/merge functions, another client wants to join them, too.
>> I thought the code would be similar, but it does NOT work: the second client CAN'T connect, and the new communicator CAN'T accept. They all BLOCK.
>>
>> As you see, the second client BLOCKs at MPI_Comm_connect, and the processes of newcomm BLOCK at MPI_Comm_accept.
>> What's wrong with my code?
>>
>> //server
>> #include "mpi.h"
>> int main(int argc, char *argv[])
>> {
>> MPI_Comm client, client2, newcomm, newcomm2;
>> MPI_Status status;
>> char port_name[MPI_MAX_PORT_NAME];
>> char port_name2[MPI_MAX_PORT_NAME];
>> int size, again, rank, myrank;
>>
>> MPI_Init(&argc, &argv);
>> MPI_Open_port(MPI_INFO_NULL, port_name);//OK
>>
>> MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD,&client);//OK
>>
>> MPI_Intercomm_merge(client,11,&newcomm);//OK
>>
>> MPI_Barrier(newcomm);//OK
>>
>> MPI_Open_port(MPI_INFO_NULL, port_name2);// OK
>>
>> MPI_Comm_accept(port_name2, MPI_INFO_NULL, 0, newcomm,&client2);// BLOCK here, What's wrong?
>>
>> MPI_Intercomm_merge(client2,12,&newcomm2);
>>
>> MPI_Barrier(newcomm2);
>>
>> MPI_Close_port(port_name);
>> MPI_Comm_disconnect(&client);
>> MPI_Close_port(port_name2);
>> MPI_Comm_disconnect(&client2);
>>
>> MPI_Finalize();
>> return 0;
>> }
>>
>> //first client
>> #include <string.h> /* for strcpy */
>> #include "mpi.h"
>> int main( int argc, char **argv )
>> {
>> MPI_Comm server,newcomm,newcomm2,client2;
>> char port_name[MPI_MAX_PORT_NAME];
>> char port_name2[MPI_MAX_PORT_NAME];
>> int size,rank;
>>
>> MPI_Init( &argc, &argv );
>>
>> strcpy( port_name, argv[1] );
>> MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF,&server );//OK
>>
>> MPI_Intercomm_merge(server,11,&newcomm);//OK
>>
>> MPI_Barrier(newcomm);//OK
>>
>> MPI_Open_port(MPI_INFO_NULL, port_name2);//OK
>>
>> MPI_Comm_accept(port_name2, MPI_INFO_NULL, 0, newcomm,&client2);// BLOCK here, What's wrong?
>>
>> MPI_Intercomm_merge(client2,12,&newcomm2);
>>
>> MPI_Barrier(newcomm2);
>>
>> MPI_Close_port(port_name2);
>> MPI_Comm_disconnect(&client2);
>> MPI_Comm_disconnect( &server );
>> MPI_Finalize();
>> return 0;
>> }
>>
>> //second client
>> #include <string.h> /* for strcpy */
>> #include "mpi.h"
>> int main( int argc, char **argv )
>> {
>> MPI_Comm server,newcomm;
>> char port_name[MPI_MAX_PORT_NAME];
>> int size,rank;
>>
>> MPI_Init( &argc, &argv );
>>
>> strcpy( port_name, argv[1] );//OK
>>
>> MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF,&server );// BLOCK here, What's wrong?
>>
>> MPI_Comm_size(MPI_COMM_WORLD, &size);
>>
>> MPI_Intercomm_merge(server,11,&newcomm);
>>
>> MPI_Barrier(newcomm);
>>
>> MPI_Comm_disconnect( &server );
>> MPI_Finalize();
>> return 0;
>> }
>>
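One thing that may be worth checking in the three programs above: the server and the first client both pass the same nonzero value (11) as the `high` argument of MPI_Intercomm_merge. The MPI standard leaves the order of the union arbitrary when both groups pass the same `high` value, so either process may become rank 0 of newcomm. MPI_Comm_accept uses only the port name supplied at its root (rank 0 in these calls), and each process opens its own port_name2, so the second client may be connecting to a port on which nobody is accepting. This is a possible explanation, not a confirmed diagnosis. The plain-C model below makes no MPI calls; `merged_rank` is a hypothetical function, and it assumes ranks keep their relative order inside each group.

```c
#include <stdbool.h>

/* Model of the rank a process gets from MPI_Intercomm_merge.
 *   my_high    : the high flag passed by this process's group
 *   other_high : the high flag passed by the remote group
 *   my_rank    : rank inside the local group of the intercommunicator
 *   other_size : number of processes in the remote group
 * Returns -1 when both groups pass the same flag, because the standard
 * leaves the ordering of the two groups unspecified in that case. */
int merged_rank(bool my_high, bool other_high, int my_rank, int other_size)
{
    if (my_high == other_high)
        return -1;                     /* order is up to the implementation */
    if (!my_high)
        return my_rank;                /* low group comes first */
    return other_size + my_rank;       /* high group follows the low group */
}
```

With high = 0 on the server and high = 1 on the connecting client, the server is guaranteed to be rank 0 of the merged communicator, so only the server would need to open the second port and hand its name to the second client.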
>>
>>
>>
>> At 2015-01-23 23:14:06, "Wesley Bland" <wbland at anl.gov> wrote:
>> The size of MPI_COMM_WORLD will never change. That communicator is set at initialization time and is not ever modified. However, when you finish connect/accept, you get a new communicator which you can merge into an intra-communicator which functions exactly like MPI_COMM_WORLD. Don’t be attached to the variable name MPI_COMM_WORLD. It’s just a variable and you can use a different one with very little extra work.
>>
>> Thanks,
>> Wesley
>>
>>
>> On Thu, Jan 22, 2015 at 6:14 PM, haozi <yidanyiji at 163.com> wrote:
>> Thanks, Lu.
>> My simple code is as follows.
>> //server
>> #include <stdio.h> /* for printf */
>> #include "mpi.h"
>>
>> int main(int argc, char *argv[])
>> {
>> MPI_Comm client;
>> MPI_Status status;
>> char port_name[MPI_MAX_PORT_NAME];
>> int size, again;
>>
>> MPI_Init(&argc, &argv);
>> MPI_Comm_size(MPI_COMM_WORLD, &size);
>> MPI_Open_port(MPI_INFO_NULL, port_name);
>> printf("server port_name is %s\n\n", port_name);
>> MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD,&client);
>> MPI_Comm_size(MPI_COMM_WORLD, &size);
>> printf("At server, comm_size=%d @ MPI_COMM_WORLD=%x, Client_World=%x\n",size,MPI_COMM_WORLD, client);
>> MPI_Comm_size(client, &size);
>> printf("At server, client_size=%d @ MPI_COMM_WORLD=%x, Client_World=%x\n",size,MPI_COMM_WORLD, client);
>>
>> MPI_Comm_disconnect(&client);
>> MPI_Finalize();
>> return 0;
>> }
>>
>> //client
>> #include <stdio.h> /* for printf */
>> #include <string.h> /* for strcpy */
>> #include "mpi.h"
>>
>> int main( int argc, char **argv )
>> {
>> MPI_Comm server;
>> char port_name[MPI_MAX_PORT_NAME];
>> int size;
>>
>> MPI_Init( &argc, &argv );
>> strcpy( port_name, argv[1] );
>> printf("server port name:%s\n",port_name);
>> MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD,&server );
>> MPI_Comm_size(MPI_COMM_WORLD, &size);
>> printf("At client, comm_size=%d @ MPI_COMM_WORLD=%x, Server_World=%x\n",size,MPI_COMM_WORLD,server);
>> MPI_Comm_size(server, &size);
>> printf("At client, server_size=%d @ MPI_COMM_WORLD=%x, Server_World=%x\n",size,MPI_COMM_WORLD,server);
>>
>> MPI_Comm_disconnect( &server );
>> MPI_Finalize();
>> return 0;
>> }
>>
>> The run commands are as follows.
>> mpiexec -n 1 ./server
>> mpiexec -n 1 ./client
>>
>> BUT, the SIZE is 1, NOT 2.
>>
>> My question is the same as before: how can the size of MPI_COMM_WORLD change to 2?
>>
>>
>> At 2015-01-23 00:30:43, "Huiwei Lu" <huiweilu at mcs.anl.gov> wrote:
>> You may take a look at MPI_Comm_accept and MPI_Comm_connect, which will connect a new client process to a server process. See Chap. 10 of the MPI 3.0 standard (www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf) for a detailed example.
>>
>> --
>> Huiwei Lu
>> Postdoc Appointee
>> Mathematics and Computer Science Division
>> Argonne National Laboratory
>> http://www.mcs.anl.gov/~huiweilu/
>>
>> On Thu, Jan 22, 2015 at 9:41 AM, haozi <yidanyiji at 163.com> wrote:
>> Hi, guys.
>>
>> This web page (http://wiki.mpich.org/mpich/index.php/PMI_v2_Design_Thoughts) says:
>> Singleton init. This is the process by which a program that was not started with mpiexec can become an MPI process and make use of all MPI features, including MPI_Comm_spawn, needs to be designed and documented, with particular attention to the disposition of standard I/O. Not all process managers will want to or even be able to create a new mpiexec process, so this needs to be negotiated. Similarly, the disposition of stdio needs to be negotiated between the singleton process and the process manager. To address these issues, a new singleton init protocol has been implemented and tested with the gforker process manager.
>>
>> I am very interested in this feature.
>> Can it solve the following problem?
>> At the beginning, the MPI job uses the mpiexec command to start three MPI processes. That is to say, there are three MPI processes in MPI_COMM_WORLD. At some point, the job finds that it needs another MPI process to cooperate with the three MPI processes. So the question is: could PMI help a non-MPI process become an MPI process of the current MPI_COMM_WORLD? That is to say, could the non-MPI process use PMI functions to become a member of the current MPI job, which would then have FOUR MPI processes in MPI_COMM_WORLD?
>>
>> Is there some method to solve this problem?
>> Does anybody have an example?
>>
>> Thanks!!!
>>
>>
>>
>> _______________________________________________
>> discuss mailing list discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>
>--
>Pavan Balaji ✉️
>http://www.mcs.anl.gov/~balaji
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: singleton_client_server.c
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20150125/d646be19/attachment.c>