[mpich-discuss] MPI_Barrier segmentation fault

Eric A. Borisch eborisch at gmail.com
Fri May 27 00:32:55 CDT 2016


Well, this wasn't as trivial as I had hoped. With the latest version
of Xcode -- 7.3.1 (7D1014) -- it appears there is some bad code being
generated; specifically a movaps instruction with a pointer that isn't
16-byte aligned. This is different from what the buildbot [1] -- with
Xcode 7.2 (7C68) -- generated.

_MPID_Request_create:
[....]
000000000012bcaf movq $0x0, 0x1c8(%rax)
000000000012bcba movq $0x0, 0x1c0(%rax)
000000000012bcc5 movq $0x0, 0x1b8(%rax)
000000000012bcd0 xorps %xmm0, %xmm0
000000000012bcd3 movaps %xmm0, 0x230(%rax)   <<< CRASH
000000000012bcda movq $0x0, 0x240(%rax)
000000000012bce5 movl $0x2c000000, 0x210(%rax) ## imm = 0x2C000000
000000000012bcef popq %rbp

At crash, %rax == 0x000000010af39048, so the movaps target is
%rax + 0x230 = 0x10AF39278 (not 16-byte aligned). Fall down go boom.
%rax is a pointer to a new handle object returned from
MPIU_Handle_obj_alloc() if I'm reading things correctly.
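
As a quick sanity check of that alignment arithmetic, here is a minimal
sketch in Python. It only redoes the address math using the %rax value and
the 0x230 displacement quoted above; the helper name misalignment() is
made up for illustration and is not part of MPICH or any toolchain.

```python
# movaps requires its memory operand to be 16-byte aligned.
MOVAPS_ALIGNMENT = 16


def misalignment(address: int, alignment: int = MOVAPS_ALIGNMENT) -> int:
    """Return address modulo alignment; 0 means the address is aligned."""
    return address % alignment


rax = 0x000000010AF39048   # %rax as reported at the crash
displacement = 0x230       # from: movaps %xmm0, 0x230(%rax)
effective = rax + displacement

print(hex(effective))               # effective address of the store
print(misalignment(effective))      # nonzero => movaps faults
print(misalignment(displacement))   # 0: the displacement is a multiple of 16
print(misalignment(rax))            # the base pointer itself is off by this much
```

Since the displacement is itself a multiple of 16, the store is aligned
only if %rax is, and here %rax is 8 bytes off a 16-byte boundary.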

vs. the buildbot package (Xcode 7.2):

_MPID_Request_create:
[...]
000000000011d3db movq $0x0, 0x1c0(%rax)
000000000011d3e6 movq $0x0, 0x1b8(%rax)
000000000011d3f1 movq $0x0, 0x240(%rax)
000000000011d3fc movq $0x0, 0x238(%rax)
000000000011d407 movq $0x0, 0x230(%rax)
000000000011d412 movl $0x2c000000, 0x210(%rax) ## imm = 0x2C000000
000000000011d41c popq %rbp

I'm not ready to dig into this further tonight, but wanted to post it
up here for Ben and awareness. I don't know if the actual error is a
"compiler error" in the emitted movaps instruction, or if a pointer
that was advertised (in the code) to be 16-byte aligned wasn't for
some reason...

(If I use the buildbot package, the test example works; if I use the
local build, I get a crash just like Ben.)

If someone on the MPICH team has the latest Xcode and could verify or
refute, that would be helpful. We're at n=2/2 so far...

 - Eric

[1] http://packages.macports.org/mpich-default/mpich-default-3.2_0+gcc5.darwin_15.x86_64.tbz2

On Thu, May 26, 2016 at 8:53 AM, Eric A. Borisch <eborisch at gmail.com> wrote:
> On Thu, May 26, 2016 at 7:21 AM, Ben Whale <ben at benwhale.com> wrote:
>> Since I really expected this code to work I suspect that I’ve installed
>> mpich incorrectly.
>>
>> I’m on OSX 10.11.4 and installed mpich using the command sudo port install
>> mpich
>
> Ben,
>
> Output of "which mpicc", "which mpiexec", and "otool -L
> minimal_working_example"? I'm hopeful there will be something
> interesting from one of those; we can likely take this off-list unless
> others would like to chime in.
>
>  - Eric (I maintain the MacPorts mpich ports.)


