[mpich-discuss] Error in MPI_Finalize on a simple ring test over TCP

Thomas Ropars thomas.ropars at epfl.ch
Wed Jul 10 09:50:36 CDT 2013


Yes, you are right. Sorry for the noise.

On 07/10/2013 03:39 PM, Wesley Bland wrote:
> The value of previous for rank 0 in your code is -1. MPICH is complaining because all of the requests to receive a message from -1 are still pending when you try to finalize. You need to make sure that you are receiving from valid ranks.
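
A minimal sketch of the fix described above, assuming the ring is meant to wrap around so that rank 0 receives from the last rank. This is not the code from ring_clean.c (which is not reproduced in this thread): the helpers ring_next and ring_prev and the RING_DEMO_MPI macro are hypothetical names introduced only for illustration. The point is that in C, (rank - 1) % size is -1 for rank 0, so size is added before taking the modulo.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers (not from ring_clean.c): ring neighbours with wraparound. */
static int ring_next(int rank, int size)
{
    assert(size > 0 && rank >= 0 && rank < size);
    return (rank + 1) % size;
}

/* Adding size before the modulo keeps the result non-negative for rank 0.
   In C, (0 - 1) % size evaluates to -1, not size - 1. */
static int ring_prev(int rank, int size)
{
    assert(size > 0 && rank >= 0 && rank < size);
    return (rank - 1 + size) % size;
}

#ifdef RING_DEMO_MPI  /* hypothetical macro: build with mpicc -DRING_DEMO_MPI */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, token, received = -1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    token = rank;
    /* Combined send/receive, so the ring does not depend on blocking sends
       being buffered. Both neighbours are always valid ranks. */
    MPI_Sendrecv(&token, 1, MPI_INT, ring_next(rank, size), 0,
                 &received, 1, MPI_INT, ring_prev(rank, size), 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d received %d from rank %d\n",
           rank, received, ring_prev(rank, size));

    MPI_Finalize();
    return 0;
}
#endif
```

If the chain is not supposed to wrap around, an alternative is to use MPI_PROC_NULL as the source on rank 0 and as the destination on the last rank. Communication with MPI_PROC_NULL completes immediately, so no receive request is left pending at MPI_Finalize.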
>
> On Jul 10, 2013, at 7:50 AM, Thomas Ropars <thomas.ropars at epfl.ch> wrote:
>
>> Yes sure. Here it is.
>>
>> Thomas
>>
>> On 07/10/2013 02:23 PM, Wesley Bland wrote:
>>> Can you send us the smallest chunk of code that still exhibits this error?
>>>
>>> Wesley
>>>
>>> On Jul 10, 2013, at 6:54 AM, Thomas Ropars <thomas.ropars at epfl.ch> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I get the following error when I try to run a simple application implementing a ring (each process sends to rank+1 and receives from rank-1). More precisely, the error occurs during the call to MPI_Finalize():
>>>>
>>>> Assertion failed in file src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 363: sc->pg_is_set
>>>> internal ABORT - process 0
>>>>
>>>> Has anybody else noticed the same error?
>>>>
>>>> Here are all the details about my test:
>>>> - The error is generated with mpich-3.0.2 (but I noticed the exact same error with mpich-3.0.4)
>>>> - I am using IPoIB for communication between nodes (The same thing happens over Ethernet)
>>>> - The problem comes from TCP links. When all processes are on the same node, there is no error. As soon as one process is on a remote node, the failure occurs.
>>>> - Note also that the failure does not occur if I run a more complex code (e.g., a NAS benchmark).
>>>>
>>>> Thomas Ropars
>>>> _______________________________________________
>>>> discuss mailing list     discuss at mpich.org
>>>> To manage subscription options or unsubscribe:
>>>> https://lists.mpich.org/mailman/listinfo/discuss
>> <ring_clean.c>