[MLton-commit] r6868

Wesley Terpstra wesley at mlton.org
Wed Sep 17 21:28:27 PDT 2008


Matthew's patch to fix jump tables amd64 PIC output. His comments follow:

In amd64-generate-transfers.fun:1923, we instruct the register allocator to
save the state of registers. At 1760 and 1770, we instruct the register
allocator to load the saved state of the registers, allowing us to
compensate for the state of the registers at the point of the jump and the
state of the registers at the point of the real target. But, note that we
saved the state of the registers before the jmp instruction; that is, before
the register allocator gets a chance to do anything with the jump
instruction.

I think the order of the saving the register-allocator state was assuming
that the jcc and the jmp wouldn't demand any additional registers.

I believe the bug was the fact that we introduced two compensation blocks
for the 'default' label of a switch (implemented as a jump table), but
didn't increase the number of incoming edges count for the 'default' label
block.  This meant that one compensation block fell thru to the 'default'
label block without establishing a common register allocation (because it
believed that it was the one and only transfer to the 'default' label
block); then the other compensation block, seeing that the 'default' label
block had already been emitted, did establish a common register allocation. 
I'm testing a fix now.

Here's how I diagnosted it.  It was clear that the jump tables were causing
some memory corruption/segfault.  So, I introduced a
'-native-max-jump-tables <n>' command line option, to limit the number of
jump tables in the output.  We knew that 0 jump tables would not segfault,
while 967 jump tables would segfault.  (A quick grep of the output assembly
with unlimited jump tables showed that there were 967.) A simple 11 step
binary search showed that 581 jump tables would not segfault, but 582 jump
tables would segfault.  The 582nd jump table occurs in mlton.1028.s (with
-native-split 0), although the segfault occurs in mlton.915.s.  It was a
complete coincidence that the segfault occurred just after a jump table
transfer in mlton.915.s.  Looking at the assembler showed the following:

L_171066:
       movq 0x0(%r13),%r11
       movq 0xFFFFFFFFFFFFFFF8(%r11),%rax
       shrq $0x1,%rax
       subq $0x2566,%rax
       cmpq $0x68,%rax
       ja L_455658
       leaq jumpTable_581(%rip),%rbx
       jmp *(%rbx,%rax,8)
...
.p2align 0x4
L_455657:
L_171067:
       movq 0x10(%r11),%r13
       movq 0x8(%r11),%r10
       movq 0x0(%r11),%r9
       movq $0x4FBB,0x0(%r12)
       movq %r12,%r11
       addq $0x8,%r11
       movq %r9,0x0(%r11)
       movq %r10,0x8(%r11)
       movq %r13,0x10(%r11)
       addq $0x20,%r12
       xchgq %r11,%r15
       jmp L_171061
.p2align 0x4
L_455658:
       xchgq %r11,%r15
       jmp L_171067

There are two oddities.

First, there were no other jumps to L_455657 or L_171067.  The oddity here
is the fact that L_455658 should fall thru to L_171067 if it is the only
transfer to that block.

Second, the occurence of xchgq in L_455658.  The oddity here is the fact
that %r11 appears to be holding the pointer to the object being case
discriminated and it is used heavily in L_171067.

What sems to be happening is that L_455657 is the compensation block for the
'default' label that is used to fill in gaps in the jump table. However, in
this case, there were no gaps (thus, no occurrence of L_455657 in the jump
table), but it still got pushed onto the queue of blocks to be emitted. 
(This wasn't a problem in the old code, because the same compensation block
for the 'default' label was always used in both the range check and jump
table gaps (if any).) In fact, the L_455657 compensation block got processed
first, and thought it could fall thru to L_171067, without matching the
established register usage for L_171067 (believing that there were no other
block transfers to L_171067, it wouldn't matter whether or not it matched
the established register usage, since no one else would expect the
established register usage to be in effect).  Immediately afterwards, the
L_455658 compensation block got processed, and needed to jmp to L_171067,
and so shuffled registers to match the established register usage.

In the end, the fix seems to be as simple as lazily creating the
compensation block for the 'default' label that is used to fill in gaps in
the jump table, and to increase the jumpInfo of the default block in the
event that such a compensation block is created.



----------------------------------------------------------------------

U   mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun

----------------------------------------------------------------------

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun	2008-09-18 04:21:01 UTC (rev 6867)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun	2008-09-18 04:28:15 UTC (rev 6868)
@@ -1758,10 +1758,21 @@
                           val jump_table_label
                             = Label.newString "jumpTable"
 
+                          val idD = Directive.Id.new ()
+                          val defaultD = pushCompensationBlock
+                                         {label = default,
+                                          id = idD}
                           val idT = Directive.Id.new ()
-                          val defaultT = pushCompensationBlock
-                                         {label = default,
-                                          id = idT}
+                          val defaultT = 
+                             Promise.delay
+                             (fn () =>
+                              let
+                                 val _ = incNear(jumpInfo, default)
+                              in 
+                                 pushCompensationBlock
+                                 {label = default,
+                                  id = idT}
+                              end)
 
                           val rec filler 
                             = fn ([],_) => []
@@ -1776,7 +1787,8 @@
                                            (Immediate.label target')::
                                            (filler(cases', incFn j))
                                          end 
-                                    else (Immediate.label defaultT)::
+                                    else (Immediate.label 
+                                          (Promise.force defaultT))::
                                          (filler(cases, incFn j))
 
                           val jump_table = filler (cases, minK)
@@ -1881,6 +1893,9 @@
                                        remove_classes = ClassSet.empty,
                                        dead_memlocs = MemLocSet.singleton checkTemp',
                                        dead_classes = ClassSet.empty},
+                                      Assembly.instruction_jcc
+                                      {condition = Instruction.NZ,
+                                       target = Operand.label defaultC},
                                       Assembly.directive_saveregalloc
                                       {id = idC,
                                        live = MemLocSet.add
@@ -1888,9 +1903,6 @@
                                                (LiveSet.toMemLocSet default_live,
                                                 stackTop ()),
                                                frontier ())},
-                                      Assembly.instruction_jcc
-                                      {condition = Instruction.NZ,
-                                       target = Operand.label defaultC},
                                       Assembly.instruction_sral
                                       {oper = Instruction.SAR,
                                        count = Operand.immediate_int shift,
@@ -1922,19 +1934,26 @@
                            {src1 = indexTemp,
                             src2 = Operand.immediate_word rangeK,
                             size = Size.QUAD},
+                           Assembly.instruction_jcc
+                           {condition = Instruction.A,
+                            target = Operand.label defaultD},
                            Assembly.directive_saveregalloc
-                           {id = idT,
+                           {id = idD,
                             live = MemLocSet.add
                                    (MemLocSet.add
                                     (LiveSet.toMemLocSet live,
                                      stackTop ()),
                                     frontier ())},
-                           Assembly.instruction_jcc
-                           {condition = Instruction.A,
-                            target = Operand.label defaultT},
                            Assembly.instruction_jmp
                            {target = address,
                             absolute = true},
+                           Assembly.directive_saveregalloc
+                           {id = idT,
+                            live = MemLocSet.add
+                                   (MemLocSet.add
+                                    (LiveSet.toMemLocSet live,
+                                     stackTop ()),
+                                    frontier ())},
                            Assembly.directive_force
                            {commit_memlocs = MemLocSet.empty,
                             commit_classes = ClassSet.empty,




More information about the MLton-commit mailing list