x86 backend design

Tue, 20 Jun 2000 18:59:02 -0400 (EDT)

I'm starting to hash out a little more regarding the x86 backend.  I've
looked into the "C" calling convention, it it looks dirt simple with gcc.
Probably the biggest issues will be "flushing" all of the registers with
unsaved data and restoring the stack and frame pointers (since it would
seem best to grab at least the stack pointer between "C" calls).  There
will also be some issues with changing the segment registers (since the
MLton stack lives in the data segment, and %esp defaults to addressing off
of the stack segment pointer), but nothing too difficult.

Also, I've been trying to outline a descent abstract view of x86 assembly.
So far, here's a bit of the design.  Comments and criticisms welcome,
particularly regarding the last bits.  There's currently a thread in
comp.compilers that's asking about x86 code generation and a good IL for
it, and I'm keeping some of the responses in mind.

datatype register = Real of ?? | Temp of ??
datatype scale = Two | Four | Eight
datatype label = ? probably just a string, taken from MachineOutput ?

datatype Operand = Register of register
                 | Const of int
	         | Label of label
                 | Address of {base : register option,
                               offset : register option,
                               scale : scale option,
                               disp : int}

transOperand : MachineOutput.operand * bool -> (Assem list * Operand)
==> transOperand translates a MachineOperand.operand into an Operand
    the flag (forceReg) determines if the Operand variant will be
     forced to a Register (via a move into a temporary register)
    in either case, the Assem list is a list of assembly that must be
     emitted before using the Operand to ensure that it is in the
     right place

Similarly, a function 
forceReg : Operand -> (Assem list * Operand)
forces the Operand into a register via the move instruction in the Assem
list.

Now translation of Contents, ArrayOffset, and Offset simply forceReg
when evaluation their sub-Operands and concat the assembly lists.

Similarly, at "final" code-generation output time, we can deal with
any leftover address to address operations via a forceReg and the
appropriate move after the op.

Downside: Memory address equality does not equal Operand equality.
          This means we can't easily look at addresses and decide
            that they are equal and choose to share a register.
Solution: Should MachineOutput.Operand.Register variant be initially
            translated into a Temp register (as opposed to an address,
            which was the initial idea)?
          This puts a heftier requirement on the register allocator,
            hence on analysis of liveness, etc.
          Another possibility is to have "smart" temps -- i.e.,
            everything in MachineOutput.Operand is either a constant,
            a label, or something that is associated with a memory
            location (stack offset, array offset, offset, global,
            register).  Certainly with register (and possibly with
            stack offset, array offset, and offset) operands, they
            can initially be translated as temps, with the spill
            location already set.  The complication that arises here
            is that all of these temporaries need to be flushed at
            limit checks (or at least when it is determined that control
            is entering the runtime).  But that's not so bad, every
            temp now knows exactly where to flush itself.

I think the basic trade-off is the following: If you leave Operands as
Address variants, you lose equality but you relieve some register
pressure.  If you move Addresses to temporaries, you gain equality but
increase register pressure.  Something like "smart" temps might let you
ride the middle ground an fall either way; in particular, it seems that
you would have better results if each of these smart temps had very short
lifetimes (length of a basic block?), so that those that are only used
once or twice would tend to fall towards becoming addresses and those that
are used more often tend to fall towards becoming registers.

I'm still thinking about the best way to actually represent abstract
x86 assembly.  Its unclear exactly how much of x86 idiosyncracies
should be/need to be present at this level, eg. floating point
registers and operations, accumulation operations, register
constraints/trashing on multiply, divide, shift.