[mpich-discuss] Internal Error: invalid error code 409e10 (Ring ids do not match)

Gus Correa gus at ldeo.columbia.edu
Mon Jun 2 17:25:37 CDT 2014


MPICH2 1.4.1p1 is an old version of MPICH.
Is it perhaps still using the mpd ring?
[If so, you need to start the mpd ring, if it is not already running,
before you launch the job. But that method has been phased out.]
It may be worth updating to the latest stable MPICH
and using the current mpiexec (Hydra) to launch the job.

http://www.mpich.org/downloads/
http://www.mpich.org/documentation/guides/
http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager


On 06/02/2014 06:03 PM, Lu, Huiwei wrote:
> Hi Kuba,
>
> Since it works with both Open MPI and BGP, it is most likely a problem with your MPICH installation or your platform.
>
> We stopped supporting the Windows platform a while ago due to a lack of developer resources. Please refer to our FAQ for more information:
> http://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_can.27t_I_build_MPICH_on_Windows_anymore.3F
>
> If you are on Windows, we recommend that you use Microsoft MPI, which can be found here: http://msdn.microsoft.com/en-us/library/bb524831(v=vs.85).aspx
>
> We also encourage you to use the latest MPICH on Linux or OS X, which can be downloaded here: http://www.mpich.org/downloads/
>
> Huiwei
>
> On Jun 2, 2014, at 4:49 PM, Jakub Łuczyński <doubleloop at o2.pl> wrote:
>
>> I wrote my assignment using MPI and tested it both locally with Open MPI (1.6.5) and on IBM Blue Gene/P (with the MPI implementation provided by IBM). Everything worked fine. It turns out that our solutions are also tested in our labs, where MPICH is installed:
>>
>> $ mpich2version
>> MPICH2 Version:        1.4.1p1
>>
>> And when I run my solution there I get this strange error:
>> $ mpirun -n 2 msp-par.exe 10 10 1
>> Internal Error: invalid error code 409e10 (Ring ids do not match) in MPIR_Reduce_impl:1087
>> Fatal error in PMPI_Reduce: Other MPI error, error stack:
>> PMPI_Reduce(1270).....: MPI_Reduce(sbuf=0x7fff693a92e8, rbuf=0x7fff693a9300, count=1, dtype=USER<struct>, op=0x98000000, root=0, MPI_COMM_WORLD) failed
>> MPIR_Reduce_impl(1087):
>>
>> I am literally out of ideas about what is wrong!
>>
>> Below are source code fragments (C++):
>>
>> struct msp_solution
>> {
>>     int x1, y1, x2, y2;
>>     m_entry_t max_sum;
>>     msp_solution();
>>     msp_solution(const pair<int, int> &c1, const pair<int, int> &c2, int max_sum);
>>     friend bool operator<(const msp_solution &s1, const msp_solution &s2);
>> };
>>
>> void max_msp_solution(msp_solution *in, msp_solution *inout, int, MPI_Datatype*)
>> {
>>      *inout = max(*in, *inout);
>> }
>>
>> // somewhere in code
>> {
>>      MPI_Datatype MPI_msp_solution_t;
>>      MPI_Op max_msp_solution_op;
>>
>>      // create MPI struct from msp_solution
>>      MPI_Datatype types[] = { MPI_INT, MPI_LONG_LONG_INT };
>>      int block_lengths[] = { 4, 2 };
>>      MPI_Aint base_addr, x1_addr, max_sum_addr;
>>      MPI_Get_address(&collected_solution, &base_addr);
>>      MPI_Get_address(&collected_solution.x1, &x1_addr);
>>      MPI_Get_address(&collected_solution.max_sum, &max_sum_addr);
>>
>>      MPI_Aint displacements[] =
>>      {
>>          x1_addr - base_addr,
>>          max_sum_addr - base_addr
>>      };
>>
>>      MPI_Type_create_struct(2, block_lengths, displacements, types, &MPI_msp_solution_t);
>>      MPI_Type_commit(&MPI_msp_solution_t);
>>
>>      // max reduction function
>>      MPI_Op_create((MPI_User_function *) max_msp_solution, 1, &max_msp_solution_op);
>>
>>     ...
>>
>>      msp_solution solution, received_solution;
>>      MPI_Comm comm,
>>      ...
>>      // comm is created using MPI_Comm_split
>>      // solution is initialized
>>      MPI_Reduce(&solution, &received_solution, 1, MPI_msp_solution_t, max_msp_solution_op , 0, MPI_COMM_WORLD);
>>      // ERROR above!!!
>> }
>>
>>
>> Is there some error in this? How can I make it run?
>> P.S. MPI_Send and MPI_Recv on my struct MPI_msp_solution_t seem to work fine.
>>
>> Thanks in advance!
>> Best regards,
>> Kuba
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>
>
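
Two details in the quoted fragment above may be worth checking independently of the mpd/Hydra question; whether either one explains the "Ring ids do not match" message is not established here. First, the struct holds a single max_sum, but block_lengths gives 2 for the MPI_LONG_LONG_INT block, so the datatype describes more bytes than the struct contains. Second, the reduction callback takes a plain int where MPI_User_function expects an int* length, and it combines only one element. The sketch below illustrates both points without needing mpi.h. It assumes m_entry_t is long long and that solutions are ordered by max_sum (neither is shown in the post); described_bytes, layout_fits, and check_sketch are hypothetical helper names, and the last callback parameter is void* as a stand-in for MPI_Datatype*.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumption: m_entry_t is long long, because the fragment pairs max_sum
// with MPI_LONG_LONG_INT. The real typedef is not shown in the post.
typedef long long m_entry_t;

// Reduced stand-in for the poster's struct (constructors omitted).
struct msp_solution {
    int x1, y1, x2, y2;
    m_entry_t max_sum;
};

// Assumption: solutions compare by max_sum; the post does not show operator<.
inline bool operator<(const msp_solution &a, const msp_solution &b) {
    return a.max_sum < b.max_sum;
}

// One past the last byte covered by a two-block layout:
// n_int ints starting at int_disp and n_ll long longs starting at ll_disp.
std::size_t described_bytes(std::size_t int_disp, int n_int,
                            std::size_t ll_disp, int n_ll) {
    std::size_t int_end = int_disp + static_cast<std::size_t>(n_int) * sizeof(int);
    std::size_t ll_end = ll_disp + static_cast<std::size_t>(n_ll) * sizeof(long long);
    return std::max(int_end, ll_end);
}

// True when a layout with the given block lengths stays inside one msp_solution.
bool layout_fits(int n_int, int n_ll) {
    return described_bytes(offsetof(msp_solution, x1), n_int,
                           offsetof(msp_solution, max_sum), n_ll)
           <= sizeof(msp_solution);
}

// Callback with the parameter shapes of MPI_User_function:
// (void *invec, void *inoutvec, int *len, MPI_Datatype *datatype).
// The last parameter is void* here only so the sketch builds without mpi.h.
void max_msp_solution(void *invec, void *inoutvec, int *len, void * /*datatype*/) {
    const msp_solution *in = static_cast<const msp_solution *>(invec);
    msp_solution *inout = static_cast<msp_solution *>(inoutvec);
    // Combine all *len elements, not just the first one.
    for (int i = 0; i < *len; ++i) {
        inout[i] = std::max(in[i], inout[i]);
    }
}

// Block lengths {4, 1} fit the struct; {4, 2} run past its end.
void check_sketch() {
    assert(layout_fits(4, 1));
    assert(!layout_fits(4, 2));
}
```

In a real MPI build, the same ideas would mean using block lengths { 4, 1 } in the call to MPI_Type_create_struct and declaring the callback with int* and MPI_Datatype* parameters, so that it can be passed to MPI_Op_create without a cast. On a typical 64-bit Linux system, the { 4, 2 } layout covers 32 bytes while sizeof(msp_solution) is 24, so a reduce into a stack variable could write past the end of the object.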



