[mpich-discuss] MPI_Intercomm_create() for merging two spawned groups
wbland at anl.gov
Fri Sep 26 12:55:12 CDT 2014
You’re right, that does look the same. I’ll mark the new one as a duplicate then.
> On Sep 26, 2014, at 12:45 PM, Dave Goodell (dgoodell) <dgoodell at cisco.com> wrote:
> I haven't read the test case code, but based on the description in this thread I think this is probably a duplicate of ticket #1502: http://trac.mpich.org/projects/mpich/ticket/1502
> FWIW, I think that issue has since been fixed in Open MPI, though I haven't tested it myself.
> On Sep 26, 2014, at 11:29 AM, Wesley Bland <wbland at anl.gov> wrote:
>> I believe that your code is correct. I’ve gone through it to simplify things a bit (attached) and see the same errors as you. That’s probably a bug in MPICH that needs to be fixed unless someone else comes along and says that for MPI_INTERCOMM_CREATE to work, all processes must be in the same peer_comm, which doesn’t seem to be what the standard says to me. I’ll create a ticket and add you as a CC so you can keep track of things.
>> In the meantime, you can avoid this problem by using one of the other ways of setting up communication between two groups of processes. You can use the connect/accept functions as per this ticket: http://trac.mpich.org/projects/mpich/ticket/495 or you can change the way you spawn processes so that all processes in MPI_COMM_WORLD are involved in spawning the new processes. I don’t know if that will actually work for your application, but it’s a stopgap measure while we fix this bug.
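One way the second workaround could look, as a minimal sketch rather than code from this thread: every process in MPI_COMM_WORLD takes part in a single MPI_Comm_spawn() call, so one inter-communicator links all parents to all children, and a single MPI_Intercomm_merge() yields the combined intra-communicator. Re-launching argv[0] as the child binary and spawning one child per parent are assumptions made for illustration.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Comm parent, inter, all_intra;
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        /* Parent side: spawn collectively over MPI_COMM_WORLD.
         * Assumption: one child per parent, running this same binary. */
        int nparents;
        MPI_Comm_size(MPI_COMM_WORLD, &nparents);
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, nparents, MPI_INFO_NULL,
                       0 /* root */, MPI_COMM_WORLD, &inter,
                       MPI_ERRCODES_IGNORE);
        /* high = 0: parents are ordered before the children. */
        MPI_Intercomm_merge(inter, 0, &all_intra);
        MPI_Comm_free(&inter);
    } else {
        /* Child side: the parent inter-communicator already spans
         * all parents, so one merge is enough. */
        MPI_Intercomm_merge(parent, 1, &all_intra);
    }

    MPI_Comm_rank(all_intra, &rank);
    MPI_Comm_size(all_intra, &size);
    printf("rank %d of %d in the merged communicator\n", rank, size);

    MPI_Comm_free(&all_intra);
    MPI_Finalize();
    return 0;
}
```

Because no MPI_Intercomm_create() call is involved, this variant does not use a peer communicator at all. Whether a collective spawn fits a given application is a separate question.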
>>> On Sep 26, 2014, at 10:50 AM, Carsten Clauss <c.clauss at fz-juelich.de> wrote:
>>> Dear all,
>>> I have a code where two processes (forming the original MPI_COMM_WORLD) each spawn one additional child process (using MPI_COMM_SELF as spawning group).
>>> Now I want to create an intra-comm that covers all of these four processes.
>>> To do so, I first merge the two inter-comms resulting from the spawn calls into two new intra-comms (by using MPI_Intercomm_merge()).
>>> Then I create via MPI_Intercomm_create() a new inter-comm that connects these two by using the original world communicator as peer-comm.
>>> Finally, I merge the resulting inter-comm into the desired intra-comm.
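The sequence described above can be sketched roughly as follows. This is a hedged illustration, not the attached test case: re-launching argv[0] as the child, using rank 0 of each merged pair as local leader, and passing high = 0 everywhere in the final merge are assumptions. Only the tag value 123 is taken from the error output in this message.

```c
#include <mpi.h>
#include <stdio.h>

#define TAG 123 /* same tag as in the reported error stack */

int main(int argc, char *argv[])
{
    MPI_Comm parent, spawn_inter, pair_intra, pairs_inter, all_intra;
    int world_rank, world_size, all_size;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        /* Parent side: this sketch assumes exactly two parents. */
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);
        if (world_size != 2)
            MPI_Abort(MPI_COMM_WORLD, 1);

        /* Step 1: each parent spawns one child on its own. */
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                       MPI_COMM_SELF, &spawn_inter, MPI_ERRCODES_IGNORE);

        /* Step 2: merge parent and child; high = 0 makes the
         * parent rank 0 of the two-process intra-comm. */
        MPI_Intercomm_merge(spawn_inter, 0, &pair_intra);
        MPI_Comm_free(&spawn_inter);

        /* Step 3: connect the two pairs. The parents are the local
         * leaders and find each other through MPI_COMM_WORLD. */
        MPI_Intercomm_create(pair_intra, 0, MPI_COMM_WORLD,
                             1 - world_rank, TAG, &pairs_inter);
    } else {
        /* Child side: mirror steps 2 and 3. peer_comm and
         * remote_leader are significant only at the local leader. */
        MPI_Intercomm_merge(parent, 1, &pair_intra);
        MPI_Intercomm_create(pair_intra, 0, MPI_COMM_NULL, 0, TAG,
                             &pairs_inter);
    }

    /* Step 4: merge the two pairs into one intra-comm. With the same
     * high value on both sides, the rank order is implementation-chosen. */
    MPI_Intercomm_merge(pairs_inter, 0, &all_intra);
    MPI_Comm_size(all_intra, &all_size);
    printf("merged communicator has %d processes\n", all_size);

    MPI_Comm_free(&all_intra);
    MPI_Comm_free(&pairs_inter);
    MPI_Comm_free(&pair_intra);
    MPI_Finalize();
    return 0;
}
```

Started with two parent processes, all four processes should report a size of 4 where the pattern works. Step 3 is the call that fails in the MPICH error output quoted in this message.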
>>> When using Open MPI, my code (it's derived from the MPICH test spaiccreate2.c, see attachment) works fine on my local machine.
>>> However, when running it with MPICH-3.1.2, I get the following error message:
>>> PMPI_Intercomm_create(601).....: MPI_Intercomm_create(comm=0x84000006, local_leader=1, MPI_COMM_WORLD, remote_leader=1, tag=123, newintercomm=0x7fff8323ee3c) failed
>>> MPID_GPID_ToLpidArray(461).....: Internal MPI error: Unknown gpid (1809769587)0
>>> Fatal error in PMPI_Intercomm_create: Internal MPI error!, error stack:
>>> PMPI_Intercomm_create(601).....: MPI_Intercomm_create(comm=0x84000004, local_leader=1, MPI_COMM_WORLD, remote_leader=0, tag=123, newintercomm=0x7fff6e0b4c7c) failed
>>> MPID_GPID_ToLpidArray(461).....: Internal MPI error: Unknown gpid (1607388239)0
>>> Here are my questions:
>>> 1) Is the above-mentioned approach the right way to reach my goal?
>>> 2) Are the semantics of the attached code MPI-compliant?
>>> 3) What is the reason for the error message when using MPICH?
>>> Thanks in advance and with kind regards,
>>> Carsten Clauss
>>> ParTec Cluster Competence Center GmbH
>>> Possartstrasse 20
>>> D-81679 Muenchen
>>> Managing Director: RA Dipl.-Ing. Bernhard Frohwitter. Registered at the
>>> Munich District Court (Amtsgericht München), HRB 151545. Tax no. 08/32305, VAT ID DE235527064
>>> discuss mailing list discuss at mpich.org
>>> To manage subscription options or unsubscribe: