[MLton] latest MLton segfault in gmp

Wesley W. Terpstra wesley at terpstra.ca
Sat Oct 10 14:10:22 PDT 2009


On Sat, Oct 10, 2009 at 10:27 PM, Wesley W. Terpstra <wesley at terpstra.ca> wrote:

> I've tried compiling with -align 8 and then it works... I'm not sure this
> is a solution, though; it may have just masked the problem.
>

Found the smoking gun! Debian builds gmp with -O3 whereas I used -O2 for
MinGW32. If you look at the assembler output of mpz/mul_exp.c with the two
options you will notice a difference... the introduction of a 'movdqa'
instruction, which is an SSE2 instruction that expects 16-byte alignment.

From what I've read, an array of 64-bit words should be 64-bit aligned.
MLton IntInfs are such arrays and must thus be 8-byte aligned. They aren't.
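
To make the alignment terms concrete, here is a minimal sketch in C of what
"N-byte aligned" means for an address. The helper name is_aligned is
hypothetical and is not taken from MLton or gmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from MLton or gmp): true if addr is a
 * multiple of align. align must be a nonzero power of two. */
static bool is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (uintptr_t)(align - 1)) == 0;
}
```

An address such as 0x1004 passes is_aligned(addr, 4) but fails
is_aligned(addr, 8), which is the situation described for IntInfs above.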

Here's the problematic vectorized assembler from gcc with -O3 (I've marked
the problem code):

.LVL16:
        andl    $15, %eax
        shrq    $3, %rax
^^^^^^^^^^^ This discards the low three address bits, so it ignores the fact
that the array is only 4-byte aligned. It assumes 8-byte alignment and then
moves on to doing 16-byte aligned moves.
        cmpq    %r12, %rax
        cmova   %r12, %rax
        testq   %rax, %rax
        je      .L10
.LBB2:
        cmpq    %rax, %r12
        movq    $0, (%r14)
        leaq    8(%r14), %rdi
        leaq    -1(%r12), %rsi
        je      .L8
.L10:
        movq    %r12, %rbx
        subq    %rax, %rbx
        movq    %rbx, %rcx
        shrq    %rcx
        movq    %rcx, %r9
        addq    %r9, %r9
        je      .L16
        pxor    %xmm0, %xmm0
        leaq    (%r14,%rax,8), %r8
        xorl    %edx, %edx
        .p2align 4,,10
        .p2align 3
.L12:
        .loc 1 64 0
        movq    %rdx, %rax
        addq    $1, %rdx
        salq    $4, %rax
        cmpq    %rcx, %rdx
        movdqa  %xmm0, (%r8,%rax)
^^^^^^^^^^^^^^^^^^^^^^^^^ At this point the memory MUST be 16-byte aligned,
but it isn't if the input is only 4-byte aligned: 4 + 8 = 12 != 0 (mod 16).
This causes our segfault.
        jb      .L12
        subq    %r9, %rsi
        cmpq    %r9, %rbx
        leaq    (%rdi,%r9,8), %rdi
        je      .L8
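
The address arithmetic in the marked code can be modelled in C. This is only
a sketch: it assumes %rax held the destination address before the andl, and
it leaves out the cmova clamp against the limb count. The function names are
hypothetical.

```c
#include <stdint.h>

/* Limbs handled by scalar stores before the vector loop, modelled on
 *   andl $15, %eax ; shrq $3, %rax
 * Assumption: %rax held the destination address on entry. */
static uint64_t peeled_limbs(uintptr_t dst)
{
    return (dst & 15) >> 3;   /* low three bits are thrown away */
}

/* Address of the first movdqa store, modelled on
 *   leaq (%r14,%rax,8), %r8 */
static uintptr_t first_vector_store(uintptr_t dst)
{
    return dst + 8 * peeled_limbs(dst);
}

/* Nonzero if movdqa would see an address that is not 16-byte aligned. */
static int movdqa_would_fault(uintptr_t dst)
{
    return (first_vector_store(dst) & 15) != 0;
}
```

Under these assumptions an 8-byte aligned destination always reaches the
loop on a 16-byte boundary, while a destination that is only 4-byte aligned
stays 4 bytes off no matter how many 8-byte limbs are peeled first.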

What's the plan going forward? align(AMD64) == 8?