[mpich-discuss] spurious lock ups on collective merge intercom

Dmitriy Lyubimov dlieu.7 at gmail.com
Tue Feb 7 16:56:01 CST 2017


On Tue, Feb 7, 2017 at 2:08 PM, Dmitriy Lyubimov <dlieu.7 at gmail.com> wrote:

>
>
> On Tue, Feb 7, 2017 at 11:15 AM, Balaji, Pavan <balaji at anl.gov> wrote:
>
>> Hi Dmitriy,
>>
>> You should use publish/lookup name for this, instead of relying on manual
>> printing.  I've attached updated server and client codes that do that.
>> Please see attached.
>>
>> You'll need to start the Hydra nameserver for this using:
>>
>> % hydra_nameserver
>>
>> Once the nameserver has started, you can connect all mpiexecs to it using
>> something like:
>>
>> % mpiexec -nameserver localhost ./server 2
>>
>
> Thanks, but we have our own name resolution architecture (which is not
> manual printing/pasting). This is a simple example that was requested to
> isolate the problem, not the actual code.
>
> The name is guaranteed to be delivered to client verbatim down to the bit.
> The use is fully consistent with the MPI 3 spec.
>

Sorry for reiterating this point. I just checked again; here is the quote
from the MPI 3.1 standard, MPI_COMM_CONNECT:

"port_name is the address of the server. It must be the same as the name
returned by MPI_OPEN_PORT on the server."

So, the standard is very clear about it.

As long as the port name is faithfully communicated from the process that
opened it to every other process, it does not have to be obtained via any
strictly prescribed means of redistribution.
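
To make "faithfully communicated" concrete: whatever transport carries the
name, the received string can be compared byte for byte with what
MPI_Open_port returned. Below is a minimal sketch in plain C, assuming a
hypothetical file-based handoff. It is not our actual code; the function
names save_port_name, load_port_name and same_port_name are made up for
illustration, and PORT_NAME_MAX is only a stand-in for MPI_MAX_PORT_NAME so
the sketch compiles without mpi.h.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for MPI_MAX_PORT_NAME (assumption, so mpi.h is not needed). */
#define PORT_NAME_MAX 1024

/* Hypothetical helper: write the port name, including its terminating
 * NUL byte, to a file. Returns 0 on success, -1 on failure. */
int save_port_name(const char *path, const char *port_name)
{
    size_t len = strlen(port_name) + 1;
    FILE *f;

    if (len > PORT_NAME_MAX)
        return -1;
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    if (fwrite(port_name, 1, len, f) != len) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: read the port name back. buf must hold at least
 * PORT_NAME_MAX bytes. Returns 0 on success, -1 on failure. */
int load_port_name(const char *path, char *buf)
{
    FILE *f = fopen(path, "rb");
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(buf, 1, PORT_NAME_MAX, f);
    fclose(f);
    /* Reject empty or unterminated content. */
    if (n == 0 || buf[n - 1] != '\0')
        return -1;
    return 0;
}

/* Returns 1 if the two names are identical byte for byte, 0 otherwise. */
int same_port_name(const char *sent, const char *received)
{
    size_t len = strlen(sent);
    return strlen(received) == len && memcmp(sent, received, len) == 0;
}
```

On the server side one would call save_port_name() with the buffer filled by
MPI_Open_port; on the client side, load_port_name() and then pass the buffer
unchanged to MPI_Comm_connect. same_port_name() is only a debugging check
that the transport did not alter the string.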

Any error in the port name should just raise MPI_ERR_PORT, which is not
what we observe. So I am afraid that using or not using MPI_Lookup_name is
neither the cause of this issue nor a solution to it.