[mpich-discuss] [MPICH users] Problems using MPICH instead of OpenMPI

Kenneth Raffenetti raffenet at mcs.anl.gov
Fri Feb 3 10:32:52 CST 2017


Hi Lisa,

This is indeed a bug in MPICH. Hostnames in the hostfile should work. I 
have an open issue to examine the hostname/ip resolution code in MPICH 
to make sure it works as expected. Glad you were able to get it working in 
the meantime.
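
In case it helps while the issue is open, here is a minimal Python sketch
(not MPICH's actual resolution code) that reports what each hostfile entry
resolves to on a given machine. The helper names (parse_hostfile, resolve,
check_hostfile) are made up for illustration, and the parsing assumes the
simple "host" or "host:nprocs" line format with "#" comments and IPv4 only.
A hostname that resolves to a loopback address, or to different addresses on
different machines, would be one plausible thing to look at, though that is a
guess and not something confirmed in this thread.

```python
import ipaddress
import socket
import sys


def parse_hostfile(text):
    """Return host entries from hostfile text, skipping blanks and '#' comments."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Assumed format: "host" or "host:nprocs"; keep only the host part.
        hosts.append(line.split()[0].split(":", 1)[0])
    return hosts


def resolve(host):
    """Return the sorted IPv4 addresses a name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, family=socket.AF_INET)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def is_loopback(addr):
    """True for addresses in 127.0.0.0/8, such as 127.0.0.1 or 127.0.1.1."""
    return ipaddress.ip_address(addr).is_loopback


def check_hostfile(text):
    """Map each hostfile entry to its resolved addresses and a loopback flag."""
    report = {}
    for host in parse_hostfile(text):
        addrs = resolve(host)
        report[host] = {
            "addresses": addrs,
            "loopback": any(is_loopback(a) for a in addrs),
        }
    return report


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        for host, info in check_hostfile(f.read()).items():
            print(host, info["addresses"], "LOOPBACK" if info["loopback"] else "")
```

Running it with the hostfile path as the only argument on each machine and
comparing the output should show whether the names resolve consistently.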

Ken

On 01/31/2017 02:16 AM, Fischer wrote:
> Hi Ken,
>
> thanks a lot for your help! I used the correct mpiexec, so that wasn't
> the problem.
> I do not understand why, but I just changed my hostfile using the IP
> addresses of the machines directly instead of the actual hostname and
> that worked.
> Do you know if this is the general case for MPICH or can this be caused
> by some network settings or installation settings?
> I am confused since using the hostname for OpenMPI worked just fine.
>
> Thanks for your fast reply and help!
> Lisa
>
> On 30.01.2017 at 16:27, Kenneth Raffenetti wrote:
>> One thing to be careful of is that you are not mixing an mpiexec from
>> openmpi with an application linked with mpich. This can cause issues
>> with processes being able to connect to each other, which is likely
>> what is happening in your runs.
>>
>> Please double check that you are using the mpiexec from mpich.
>>
>> Ken
>>
>> On 01/28/2017 04:37 AM, lisa.fischer at zib.de wrote:
>>> Dear all,
>>>
>>> I want to use MPICH instead of OpenMPI and have some difficulties.
>>> Using OpenMPI worked fine.
>>>
>>> I run my program on two different machines and set the MPICH_PORT_RANGE
>>> to that of OpenMPI, but it seems like my program is running into a
>>> deadlock.
>>> Sometimes I do not get any error message and sometimes I get the
>>> following:
>>>
>>> Assertion failed in file src/mpid/ch3/src/ch3u_handle_connection.c at line 325: vc->state == MPIDI_VC_STATE_ACTIVE
>>> internal ABORT - process 0
>>>
>>> Do I need to change the port range of MPICH? Is there a function like
>>> ompi_info to figure out the port range for MPICH as well?
>>> Or do you think there is another problem? Do I need to set other
>>> environment variables as well?
>>>
>>> The ssh connection on both machines works. I also used a test program
>>> using only send and receive calls which led to the same problem.
>>>
>>> I would be very happy for any hints.
>>>
>>> Thank you and have a good weekend!
>>> Lisa
>>>
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>
>