[mpich-discuss] Error building mpich-3.1.2 on Solaris 10 with gcc-4.9.0

Balaji, Pavan balaji at anl.gov
Sat Sep 6 17:43:17 CDT 2014


On Sep 6, 2014, at 5:35 PM, Siegmar Gross <Siegmar.Gross at informatik.hs-fulda.de> wrote:

>> Thanks, Siegmar.  Are you seeing some errors with "make
>> testing" or are all tests failing?  On some platforms,
>> some tests take longer and hence show up as "timeouts", but
>> they are not really hanging.  But the number of such tests
>> should be small.
> 
> I see some bus errors on Solaris 10 Sparc. I attached the log
> files for both Solaris 10 Sparc and x86_64, so that you can
> see yourself the results.

Ah, yes.  The bus errors on SPARC are a known issue:

https://trac.mpich.org/projects/mpich/ticket/1159

The reason is that SPARC is very strict about datatype alignment.  The compiler is supposed to align integers at 4-byte boundaries, doubles at 8-byte boundaries, etc.  But in MPICH we sometimes use a char array and later typecast it to a structure that might contain ints, pointers, etc.  A char array only requires 1-byte alignment, so the compiler has no reason to align the buffer for the structure, and we get bus errors when we try to access the data through the cast pointer.
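
To make the problem concrete, here is a minimal sketch in C.  It is not MPICH code: header_t, store_header, load_header, aligned_buf_t and roundtrip_unaligned are hypothetical names made up for illustration.  The risky cast appears only in a comment; the two functions and the union show common ways to avoid it, assuming a plain struct with an int and a double.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message header; NOT an actual MPICH type. */
typedef struct {
    int    tag;
    double value;
} header_t;

/*
 * Risky pattern (left as a comment on purpose):
 *
 *     char buf[64];
 *     header_t *h = (header_t *)(buf + 1);   // may be misaligned
 *     h->value = 1.0;                        // can raise SIGBUS on SPARC
 *
 * x86 usually tolerates the misaligned access, which is why such code
 * can appear to work there and still fail on SPARC.
 */

/* Option 1: move the bytes with memcpy, which is valid at any alignment. */
static void store_header(char *dst, const header_t *src)
{
    memcpy(dst, src, sizeof *src);
}

static header_t load_header(const char *src)
{
    header_t h;
    memcpy(&h, src, sizeof h);
    return h;
}

/* Option 2: a union makes the byte buffer at least as aligned as header_t. */
typedef union {
    header_t hdr;
    char     bytes[64];
} aligned_buf_t;

/* Round-trip a header through a deliberately odd offset.
 * Returns 1 if the values survive, 0 otherwise. */
static int roundtrip_unaligned(void)
{
    char raw[sizeof(header_t) + 1];
    header_t in = { 42, 3.5 };

    store_header(raw + 1, &in);            /* raw + 1 is likely misaligned */
    header_t out = load_header(raw + 1);

    return out.tag == 42 && out.value == 3.5;
}
```

The memcpy variant works at any offset but adds a copy on every access; the union variant avoids the copy but only helps when the buffer itself is declared that way, not when a pointer into the middle of some larger byte stream is cast.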

We should certainly fix this, but it’s not going to be easy (at least not without adding performance overhead).

  — Pavan

--
Pavan Balaji  ✉️
http://www.mcs.anl.gov/~balaji


