[MLton-commit] r6677

Wesley Terpstra wesley at mlton.org
Wed Jul 30 01:18:17 PDT 2008


Output PIC assembly code. This requires accessing global symbols using 
RIP-relative addressing. RIP-relative addressing also results in about
2% smaller executables, so this code is used for non-libraries as well.

The patch locates all instances of:
	PDC Address: movq $name, %rax
	PDC Data Access: movq name, %rax
and changes them to:
	PIC Internal Address: leaq name(%rip),%rax
	PIC Internal Data Access: movq name(%rip),%rax

To effect this change, all cases where a mov was explicitly used with an
immediate label were changed to a lea of the label's memloc. Additionally,
the register-allocation was changed to load symbols relative to RIP. 

Unfortunately, it is not possible to write: movq foo(%rip,%rdi,8),%rax.
This means that the register allocation has to watch for the case where 
there is an index register and if so, break the mov down into:
	leaq foo(%rip),%rax
	movq (%rax,%rdi,8),%rax
Since this is slightly larger, it is only used for library output where 
PIC code is required. For the normal executable code-path, we output
	movq foo(,%rdi,8),%rax
as before.

Since the generated PIC code uses RIP-relative addressing, it is only able 
to access symbols compiled into the same relocatable text segment. Fortunately,
the amd64 codegen makes all system calls via the basis which is compiled in.
The basis in turn makes calls to other DSO text segments. Since the basis
uses -fPIC, this means calls will succeed.

There remains the case where the FFI is used to _import symbols from external 
libraries directly. This will be fixed in a future patch.


----------------------------------------------------------------------

U   mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun
U   mlton/trunk/mlton/codegen/amd64-codegen/amd64-codegen.fun
U   mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun
U   mlton/trunk/mlton/codegen/amd64-codegen/amd64-mlton.fun
U   mlton/trunk/mlton/codegen/amd64-codegen/amd64-pseudo.sig
U   mlton/trunk/mlton/codegen/amd64-codegen/amd64.fun
U   mlton/trunk/mlton/codegen/amd64-codegen/amd64.sig

----------------------------------------------------------------------

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun	2008-07-30 08:03:07 UTC (rev 6676)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64-allocate-registers.fun	2008-07-30 08:18:01 UTC (rev 6677)
@@ -3751,12 +3751,23 @@
            (let
               val MemLoc.U {immBase, memBase, immIndex, memIndex, scale, ...}
                 = MemLoc.destruct memloc
-
+                
+              (* Whenever possible, find labels with RIP-relative addressing.
+               * It's smaller code and faster even for position dependent code.
+               * However, RIP-relative addressing cannot be used with an index
+               * register. For PIC code we will thus break the access down
+               * into a leal for the symbol and a toRegister for the memIndex.
+               *)
+              
+              (* Combine all immediate offsets into one *)
               val disp 
                 = case (immBase, immIndex) of
                      (NONE, NONE) => Immediate.zero
                    | (SOME immBase, NONE) => immBase
-                   | (NONE, SOME immIndex) => immIndex
+                   | (NONE, SOME immIndex) 
+                   => (case Immediate.destruct immIndex of
+                          Immediate.Word _ => immIndex
+                        | _ => Error.bug "amd64AllocateRegisters.RegisterAllocation.toAddressMemLoc:indexLabel")
                    | (SOME immBase, SOME immIndex) 
                    => (case (Immediate.destruct immBase, Immediate.destruct immIndex) of
                           (Immediate.Label l1, Immediate.Word w2) => 
@@ -3764,36 +3775,75 @@
                         | (Immediate.LabelPlusWord (l1, w1), Immediate.Word w2) => 
                              Immediate.labelPlusWord (l1, WordX.add (w1, w2))
                         | _ => Error.bug "amd64AllocateRegisters.RegisterAllocation.toAddressMemLoc:disp")
-
-              val {register = register_base,
+              
+              (* The base register gets supplied by three distinct cases:
+               * 1 - memBase (which means that there is no label)
+               * 2 - RIP     (which means there is no index)
+               * 3 - lea     (which means this is a library)
+               * else nothing
+               *)
+              val {disp,
+                   register = register_base,
                    assembly = assembly_base,
                    registerAllocation}
-                = case memBase
-                    of NONE => {register = NONE,
-                                assembly = AppendList.empty,
-                                registerAllocation = registerAllocation}
-                     | SOME memBase
-                     => let
-                          val {register, assembly, registerAllocation}
-                            = toRegisterMemLoc 
-                              {memloc = memBase,
-                               info = info,
-                               size = MemLoc.size memBase,
-                               move = true,
-                               supports 
-                               = case memIndex
-                                   of NONE => supports
-                                    | SOME memIndex
-                                    => (Operand.memloc memIndex)::
-                                       supports,
-                               saves = saves,
-                               force = Register.baseRegisters,
+               = case (Immediate.destruct disp, memBase, memIndex) of
+                    (Immediate.Word _, NONE, _)
+                  =>  {disp = SOME disp,
+                       register = NONE,
+                       assembly = AppendList.empty,
+                       registerAllocation = registerAllocation}
+                  | (Immediate.Word _, SOME memBase, _) (* no label, no rip *)
+                  => let
+                        val {register, assembly, registerAllocation}
+                          = toRegisterMemLoc 
+                            {memloc = memBase,
+                             info = info,
+                             size = MemLoc.size memBase,
+                             move = true,
+                             supports 
+                             = case memIndex
+                                 of NONE => supports
+                                  | SOME memIndex
+                                 => (Operand.memloc memIndex)::
+                                     supports,
+                             saves = saves,
+                             force = Register.baseRegisters,
+                             registerAllocation = registerAllocation}
+                     in
+                       {disp = SOME disp,
+                        register = SOME register,
+                        assembly = assembly,
+                        registerAllocation = registerAllocation}
+                     end
+                  | (_, SOME _, _) (* label & memBase? bad input *)
+                  => Error.bug "amd64AllocateRegisters.RegisterAllocation.toAddressMemLoc:base*2"
+                  | (_, NONE, NONE) (* no index => safe to use RIP-relative *)
+                  => {disp = SOME disp,
+                      register = SOME Register.rip,
+                      assembly = AppendList.empty,
+                      registerAllocation = registerAllocation}
+                  | (_, NONE, SOME memIndex) (* label + index => use lea if library *)
+                  => if !Control.format <> Control.Library
+                        then {disp = SOME disp,
+                              register = NONE,
+                              assembly = AppendList.empty,
+                              registerAllocation = registerAllocation}
+                     else let
+                             val {register, assembly, registerAllocation}
+                               = toRegisterImmediate
+                                 {immediate = disp,
+                                  info = info,
+                                  size = MemLoc.size memIndex,
+                                  supports = Operand.memloc memIndex :: supports,
+                                  saves = saves,
+                                  force = Register.baseRegisters,
+                                  registerAllocation = registerAllocation}
+                          in
+                             { disp = NONE,
+                               register = SOME register,
+                               assembly = assembly,
                                registerAllocation = registerAllocation}
-                        in
-                          {register = SOME register,
-                           assembly = assembly,
-                           registerAllocation = registerAllocation}
-                        end
+                          end
 
               val {register = register_index,
                    assembly = assembly_index,
@@ -3814,6 +3864,11 @@
                                saves 
                                = case (memBase, register_base)
                                    of (NONE, NONE) => saves
+                                    | (NONE, SOME register_base)
+                                    => if register_base = Register.rip
+                                          then saves
+                                       else Operand.register register_base ::
+                                            saves
                                     | (SOME memBase, SOME register_base)
                                     => (Operand.memloc memBase)::
                                        (Operand.register register_base)::
@@ -3827,7 +3882,7 @@
                            registerAllocation = registerAllocation}
                         end
             in
-              {address = Address.T {disp = SOME disp,
+              {address = Address.T {disp = disp,
                                     base = register_base,
                                     index = register_index,
                                     scale = case memIndex
@@ -4011,15 +4066,26 @@
                               force = force,
                               registerAllocation = registerAllocation}
             val _ = Int.dec depth
+            val instruction
+              = case Immediate.destruct immediate of
+                   Immediate.Word _ =>
+                      Assembly.instruction_mov 
+                      {dst = Operand.Register final_register,
+                       src = Operand.Immediate immediate,
+                       size = size}
+                 | _ =>
+                      Assembly.instruction_lea
+                      {dst = Operand.Register final_register,
+                       src = Operand.Address
+                              (Address.T { disp = SOME immediate,
+                                           base = SOME Register.rip,
+                                           index = NONE, scale = NONE }),
+                       size = size}
           in
             {register = final_register,
              assembly = AppendList.appends
                         [assembly,
-                         AppendList.single
-                         (Assembly.instruction_mov
-                          {dst = Operand.Register final_register,
-                           src = Operand.Immediate immediate,
-                           size = size})],
+                         AppendList.single instruction],
              registerAllocation = registerAllocation}
           end
           handle Spill 

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64-codegen.fun
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64-codegen.fun	2008-07-30 08:03:07 UTC (rev 6676)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64-codegen.fun	2008-07-30 08:18:01 UTC (rev 6677)
@@ -238,7 +238,7 @@
                   amd64.Assembly.instruction_mov
                   {src = (amd64.Operand.address o amd64.Address.T)
                          {disp = SOME (amd64.Immediate.label amd64MLton.c_stackP),
-                          base = NONE, index = NONE, scale = NONE},
+                          base = SOME amd64.Register.rip, index = NONE, scale = NONE},
                    dst = amd64.Operand.register amd64.Register.rbx,
                    size = amd64.Size.QUAD},
                   amd64.Assembly.instruction_mov
@@ -252,7 +252,7 @@
                   {src = amd64.Operand.register amd64.Register.rsp,
                    dst = (amd64.Operand.address o amd64.Address.T)
                          {disp = SOME (amd64.Immediate.label amd64MLton.c_stackP),
-                          base = NONE, index = NONE, scale = NONE},
+                          base = SOME amd64.Register.rip, index = NONE, scale = NONE},
                    size = amd64.Size.QUAD},
                   amd64.Assembly.instruction_mov
                   {src = (amd64.Operand.address o amd64.Address.T)
@@ -261,7 +261,7 @@
                                   Bytes.toInt 
                                   (Machine.Runtime.GCField.offset
                                    Machine.Runtime.GCField.StackTop)),
-                          base = NONE, index = NONE, scale = NONE},
+                          base = SOME amd64.Register.rip, index = NONE, scale = NONE},
                    dst = amd64.Operand.register stackTopReg,
                    size = amd64.Size.QUAD},
                   amd64.Assembly.instruction_mov
@@ -271,7 +271,7 @@
                                   Bytes.toInt 
                                   (Machine.Runtime.GCField.offset
                                    Machine.Runtime.GCField.Frontier)),
-                          base = NONE, index = NONE, scale = NONE},
+                          base = SOME amd64.Register.rip, index = NONE, scale = NONE},
                    dst = amd64.Operand.register frontierReg,
                    size = amd64.Size.QUAD},
                   amd64.Assembly.instruction_jmp
@@ -284,7 +284,7 @@
                   amd64.Assembly.instruction_mov
                   {src = (amd64.Operand.address o amd64.Address.T)
                          {disp = SOME (amd64.Immediate.label amd64MLton.c_stackP),
-                          base = NONE, index = NONE, scale = NONE},
+                          base = SOME amd64.Register.rip, index = NONE, scale = NONE},
                    dst = amd64.Operand.register amd64.Register.rsp,
                    size = amd64.Size.QUAD},
                   amd64.Assembly.instruction_mov
@@ -298,7 +298,7 @@
                   {src = amd64.Operand.register amd64.Register.rbx,
                    dst = (amd64.Operand.address o amd64.Address.T)
                          {disp = SOME (amd64.Immediate.label amd64MLton.c_stackP),
-                          base = NONE, index = NONE, scale = NONE},
+                          base = SOME amd64.Register.rip, index = NONE, scale = NONE},
                    size = amd64.Size.QUAD},
                   amd64.Assembly.instruction_mov
                   {src = (amd64.Operand.address o amd64.Address.T)

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun	2008-07-30 08:03:07 UTC (rev 6676)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64-generate-transfers.fun	2008-07-30 08:18:01 UTC (rev 6677)
@@ -1075,9 +1075,9 @@
                                  src = bytes,
                                  size = pointerSize},
                                 (* *(stackTopTemp - WORD_SIZE) = return *)
-                                amd64.Assembly.instruction_mov
+                                amd64.Assembly.instruction_lea
                                 {dst = stackTopTempMinusWordDeref,
-                                 src = Operand.immediate_label return,
+                                 src = Operand.memloc_label return,
                                  size = pointerSize},
                                 amd64.Assembly.directive_force
                                 {commit_memlocs = MemLocSet.singleton stackTopTempMinusWordDeref',
@@ -1100,9 +1100,9 @@
                                  src = bytes, 
                                  size = pointerSize},
                                 (* *(stackTop - WORD_SIZE) = return *)
-                                amd64.Assembly.instruction_mov
+                                amd64.Assembly.instruction_lea
                                 {dst = stackTopMinusWordDeref,
-                                 src = Operand.immediate_label return,
+                                 src = Operand.memloc_label return,
                                  size = pointerSize},
                                 amd64.Assembly.directive_force
                                 {commit_memlocs = MemLocSet.singleton stackTopMinusWordDeref',
@@ -1423,9 +1423,9 @@
                                               src = bytes,
                                               size = pointerSize},
                                              (* *(stackTopTemp - WORD_SIZE) = return *)
-                                             amd64.Assembly.instruction_mov
+                                             amd64.Assembly.instruction_lea
                                              {dst = stackTopTempMinusWordDeref,
-                                              src = Operand.immediate_label return,
+                                              src = Operand.memloc_label return,
                                               size = pointerSize},
                                              amd64.Assembly.directive_force
                                              {commit_memlocs = MemLocSet.singleton stackTopTempMinusWordDeref',
@@ -1448,9 +1448,9 @@
                                               src = bytes, 
                                               size = pointerSize},
                                              (* *(stackTop - WORD_SIZE) = return *)
-                                             amd64.Assembly.instruction_mov
+                                             amd64.Assembly.instruction_lea
                                              {dst = stackTopMinusWordDeref,
-                                              src = Operand.immediate_label return,
+                                              src = Operand.memloc_label return,
                                               size = pointerSize},
                                              amd64.Assembly.directive_force
                                              {commit_memlocs = MemLocSet.singleton stackTopMinusWordDeref',

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64-mlton.fun
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64-mlton.fun	2008-07-30 08:03:07 UTC (rev 6676)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64-mlton.fun	2008-07-30 08:18:01 UTC (rev 6677)
@@ -668,9 +668,9 @@
                    [Block.mkBlock'
                     {entry = NONE,
                      statements =
-                     [Assembly.instruction_mov
+                     [Assembly.instruction_lea
                       {dst = dst,
-                       src = Operand.immediate_label (Label.fromString name),
+                       src = Operand.memloc_label (Label.fromString name),
                        size = dstsize}],
                      transfer = NONE}]
                 end

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64-pseudo.sig
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64-pseudo.sig	2008-07-30 08:03:07 UTC (rev 6676)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64-pseudo.sig	2008-07-30 08:18:01 UTC (rev 6677)
@@ -136,6 +136,7 @@
         val label : Label.t -> t
         val deLabel : t -> Label.t option
         val memloc : MemLoc.t -> t
+        val memloc_label: Label.t -> t
         val deMemloc : t -> MemLoc.t option
 
         val size : t -> Size.t option

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64.fun
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64.fun	2008-07-30 08:03:07 UTC (rev 6676)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64.fun	2008-07-30 08:18:01 UTC (rev 6677)
@@ -127,7 +127,7 @@
 
       datatype reg
         = RAX | RBX | RCX | RDX | RDI | RSI | RBP | RSP
-        | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15
+        | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15 | RIP
       val allReg = [RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP,
                     R8, R9, R10, R11, R12, R13, R14, R15]
 
@@ -197,6 +197,7 @@
                | R13 => doit3 "r13"
                | R14 => doit3 "r14"
                | R15 => doit3 "r15"
+               | RIP => doit3 "rip"
           end
       val toString = Layout.toString o layout
 
@@ -235,6 +236,7 @@
       val r14w = T {reg = R14, part = X}
       val r15 = T {reg = R15, part = R}
       val r15w = T {reg = R15, part = X}
+      val rip = T {reg = RIP, part = R}
 
       local
          fun make part =
@@ -1352,6 +1354,10 @@
            | _ => NONE
       val address = Address
       val memloc = MemLoc
+      fun memloc_label l = 
+         memloc (MemLoc.makeContents { base = Immediate.label l,
+                                       size = Size.QUAD,
+                                       class = MemLoc.Class.Code })
       val deMemloc 
         = fn MemLoc x => SOME x
            | _ => NONE

Modified: mlton/trunk/mlton/codegen/amd64-codegen/amd64.sig
===================================================================
--- mlton/trunk/mlton/codegen/amd64-codegen/amd64.sig	2008-07-30 08:03:07 UTC (rev 6676)
+++ mlton/trunk/mlton/codegen/amd64-codegen/amd64.sig	2008-07-30 08:18:01 UTC (rev 6677)
@@ -62,7 +62,7 @@
       sig
         datatype reg
           = RAX | RBX | RCX | RDX | RDI | RSI | RBP | RSP
-          | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15
+          | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15 | RIP
         val allReg : reg list
 
         datatype part
@@ -113,6 +113,7 @@
         val r14w : t
         val r15 : t
         val r15w : t
+        val rip : t
 
         val registers : Size.t -> t list
         val baseRegisters : t list
@@ -349,6 +350,7 @@
         val deLabel : t -> Label.t option
         val address : Address.t -> t
         val memloc : MemLoc.t -> t
+        val memloc_label : Label.t -> t
         val deMemloc : t -> MemLoc.t option
 
         val size : t -> Size.t option




More information about the MLton-commit mailing list