<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr">Thanks Sangmin, sure, I will check it out.</div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jan 5, 2016 at 10:47 AM, Seo, Sangmin <span dir="ltr"><<a href="mailto:sseo@anl.gov" target="_blank">sseo@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
Hi Mohammad,
<div><br>
</div>
<div>I was wrong in my answer. The same issue was discussed last year, and the problem was fixed after mpich-3.1.3. Please refer to <a href="http://lists.mpich.org/pipermail/discuss/2015-January/003660.html" target="_blank">http://lists.mpich.org/pipermail/discuss/2015-January/003660.html</a></div>
<div><br>
</div>
<div>Can you try your code with a recent version of mpich?</div>
<div><br>
</div>
<div>Regards,</div>
<div>Sangmin</div><div><div class="h5">
<div><br>
</div>
<div><br>
<div>
<blockquote type="cite">
<div>On Jan 4, 2016, at 8:09 PM, Seo, Sangmin <<a href="mailto:sseo@anl.gov" target="_blank">sseo@anl.gov</a>> wrote:</div>
<br>
<div>
<div style="word-wrap:break-word">
<div>Hi Mohammad,</div>
<div><br>
</div>
<div>It seems the same port name can be used only once, since the MPI 3.1 standard (p. 419, line 31) says “A port name may be reused after it is freed with MPI_CLOSE_PORT and released by the system.” Can you try closing the port and opening it again to establish a new connection? If that doesn’t work, could you send us your actual code (if possible, a simplified version)?</div>
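<div><br>
</div>
<div>A hypothetical C sketch of this close-and-reopen pattern (the service name, the single accept per port lifetime, and the omission of error checks are illustrative assumptions, not tested code):</div>

```c
/* Sketch: close the port after each connection and open a fresh one.
   "my_service" and the overall structure are assumptions for
   illustration; error handling is omitted. */
#include <mpi.h>

int main(int argc, char **argv)
{
    char port_name[MPI_MAX_PORT_NAME];
    MPI_Comm client;

    MPI_Init(&argc, &argv);

    MPI_Open_port(MPI_INFO_NULL, port_name);
    MPI_Publish_name("my_service", MPI_INFO_NULL, port_name);
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);

    /* Release the port before accepting the next connection ... */
    MPI_Unpublish_name("my_service", MPI_INFO_NULL, port_name);
    MPI_Close_port(port_name);

    /* ... then open and re-publish a fresh port. The system may hand
       back a different port string, so clients must look it up again. */
    MPI_Open_port(MPI_INFO_NULL, port_name);
    MPI_Publish_name("my_service", MPI_INFO_NULL, port_name);

    MPI_Finalize();
    return 0;
}
```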
<div><br>
</div>
<div>Regards,</div>
<div>Sangmin</div>
<div><br>
</div>
<br>
<div>
<blockquote type="cite">
<div>On Dec 29, 2015, at 11:44 AM, Mohammad Javad Rashti <<a href="mailto:mjrashti@gmail.com" target="_blank">mjrashti@gmail.com</a>> wrote:</div>
<br>
<div>
<div dir="ltr">Hi,
<div>Using mpich-3.1.2, we are trying to create a multi-process, multi-node MPI job with the client-server model, but we are having issues creating the global communicator we need.</div>
<div><br>
</div>
<div>We cannot use mpiexec to launch the MPI processes; they are launched by a different daemon and we want them to join a group and use MPI after they are launched.<br>
</div>
<div>
<div>We chose to use a server to publish a name/port and wait for a known number of clients to connect. The goal is to create an intracommunicator among all the clients and the server, and then start normal MPI communication (we are not sure whether there is a better way to accomplish this goal).</div>
<div><b><br>
</b></div>
<div><b>The problem </b>is that the first client connects fine, but the subsequent clients block. </div>
</div>
<div><br>
</div>
<div>The <b>simplified method </b>that we are using is as follows:</div>
<div><br>
</div>
<div><b> ------------------- Server -----------------</b></div>
<div><br>
</div>
<div>- Call <i>MPI_Open_port(MPI_INFO_NULL, port_name)</i></div>
<div><i><br>
</i></div>
<div>- Call <i>MPI_Publish_name(service_name, MPI_INFO_NULL, port_name)</i><br>
</div>
<div><br>
</div>
<div>- clients = 0</div>
<div><br>
</div>
<div>Loop until clients = MAX_CLIENTS:</div>
<div><br>
</div>
<div> if ( !clients )</div>
<div> - Call <i>MPI_Comm_accept(port_name,MPI_INFO_NULL,0,MPI_COMM_SELF,&new_ircomm)</i><br>
</div>
<div><br>
</div>
<div> else</div>
<div> - Call <i>MPI_Comm_accept(port_name,MPI_INFO_NULL,0,previous_iacomm,&new_ircomm)</i></div>
<div> </div>
<div> - Call <i>MPI_Intercomm_merge(new_ircomm, 0, &new_iacomm)</i></div>
<div><i><br>
</i></div>
<div> - previous_iacomm = new_iacomm</div>
<div><br>
</div>
<div> - clients ++</div>
<div><br>
</div>
<div>end Loop</div>
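<div><br>
</div>
<div>A hedged C sketch of the server steps above (the service name, MAX_CLIENTS, and the absence of error handling are assumptions, not the actual code):</div>

```c
/* Server: publish a port, then repeatedly accept and merge, growing
   the intracommunicator one client at a time. */
#include <mpi.h>

#define MAX_CLIENTS 4   /* assumption: known number of clients */

int main(int argc, char **argv)
{
    char port_name[MPI_MAX_PORT_NAME];
    MPI_Comm prev_iacomm = MPI_COMM_SELF;   /* grows as clients join */
    MPI_Comm new_ircomm, new_iacomm;

    MPI_Init(&argc, &argv);
    MPI_Open_port(MPI_INFO_NULL, port_name);
    MPI_Publish_name("my_service", MPI_INFO_NULL, port_name);

    for (int clients = 0; clients < MAX_CLIENTS; clients++) {
        /* First iteration accepts over MPI_COMM_SELF; afterwards the
           accept is collective over the communicator merged so far. */
        MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, prev_iacomm,
                        &new_ircomm);
        MPI_Intercomm_merge(new_ircomm, 0 /* server in low group */,
                            &new_iacomm);
        prev_iacomm = new_iacomm;
    }

    MPI_Unpublish_name("my_service", MPI_INFO_NULL, port_name);
    MPI_Close_port(port_name);
    MPI_Finalize();
    return 0;
}
```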
<div><br>
</div>
<div><b>---------------- Client ---------------</b></div>
<div><br>
</div>
<div>- Call <i>MPI_Lookup_name(service_name, MPI_INFO_NULL, port_name)</i></div>
<div><br>
</div>
<div>- Call <i>MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &new_ircomm)</i></div>
<div><br>
</div>
<div>- Call <i>MPI_Intercomm_merge(new_ircomm, 1 , &new_iacomm)</i><br>
</div>
<div><br>
</div>
<div>- previous_iacomm = new_iacomm</div>
<div><br>
</div>
<div>Loop for all clients connecting after me:</div>
<div><br>
</div>
<div> - Call <i>MPI_Comm_accept(port_name,MPI_INFO_NULL,0,previous_iacomm,&new_ircomm)</i></div>
<div><br>
</div>
<div> - Call <i>MPI_Intercomm_merge(new_ircomm, 0, &new_iacomm)</i></div>
<div> </div>
<div> - previous_iacomm = new_iacomm</div>
<div><br>
</div>
<div>end Loop</div>
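<div><br>
</div>
<div>A hedged C sketch of the client steps above (my_order, the client's connection order, and MAX_CLIENTS are assumptions; in a real run they would come from the launching daemon):</div>

```c
/* Client: connect once, then participate in every later accept so
   the growing intracommunicator stays collective. */
#include <mpi.h>
#include <stdlib.h>

#define MAX_CLIENTS 4   /* assumption: known number of clients */

int main(int argc, char **argv)
{
    char port_name[MPI_MAX_PORT_NAME];
    MPI_Comm new_ircomm, new_iacomm, prev_iacomm;
    int my_order = (argc > 1) ? atoi(argv[1]) : 0;  /* assumed known */

    MPI_Init(&argc, &argv);
    MPI_Lookup_name("my_service", MPI_INFO_NULL, port_name);

    MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF,
                     &new_ircomm);
    MPI_Intercomm_merge(new_ircomm, 1 /* high group */, &new_iacomm);
    prev_iacomm = new_iacomm;

    /* Each already-connected client must take part in every later
       accept, since MPI_Comm_accept is collective over prev_iacomm. */
    for (int later = my_order + 1; later < MAX_CLIENTS; later++) {
        MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, prev_iacomm,
                        &new_ircomm);
        MPI_Intercomm_merge(new_ircomm, 0, &new_iacomm);
        prev_iacomm = new_iacomm;
    }

    MPI_Finalize();
    return 0;
}
```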
<div><br>
</div>
<div>----------------------------------</div>
<div><br>
</div>
<div><b>Note </b>that the MPI standard states that MPI_Comm_accept is collective over the calling communicator; that is why it is called by the server and by all previously connected clients.</div>
<div><br>
</div>
<div><b>The problem we are having </b>is that the first client connects fine, but subsequent clients block on MPI_Comm_connect. Likewise, the server and previously connected clients block on MPI_Comm_accept.<br>
</div>
<div>
<div> </div>
<div>(The server does not block only if we use MPI_COMM_SELF for all accept calls, but that does not help us create the global intracommunicator we want.)</div>
</div>
<div><br>
</div>
<div>I suspect that we are missing something in our usage of MPI_Comm_accept. Any insight is helpful and appreciated. I can send the actual C code if needed.</div>
<div><br>
</div>
<div>Thanks</div>
<div>Mohammad</div>
</div>
_______________________________________________<br>
discuss mailing list <a href="mailto:discuss@mpich.org" target="_blank">discuss@mpich.org</a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a></div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div></div></div>
<br></blockquote></div><br></div>