"root" of ChunkPerFunc chunk

Thu, 2 Nov 2000 13:36:33 -0500 (EST)

> > My impression was that he was worrying about the destination (not the source)
> > of the add still being alive in the handler.  I.e., if I do
> > 	x := y + z
> > then the spec says that the raise happens in the + (before the assignment) so
> > if anything in a handler is closed over x, it must see the old value.
> > If the input to the x86-codegen re-uses registers (so you really have to
> > consider assignments as assignments rather than binding forms) then you really
> > do have to take this into account, but otherwise you don't.
> > Am I missing something?
> 
> No.  Although the source fragment you show is not exactly the problem, since
> right now assignments to references always happen to memory locations.  But
> essentially the same problem can happen with variable bindings that get turned
> into assignments in the machine IL.

Here's the example that I was working on:

fun f(0) = 0
  | f(1) = 1
  | f(x) = f(x + 15 handle Overflow => if x mod 2 = 0 then 0 else 1)

In the absence of a check and the handler, we would want this to be turned
into a tail-recursive function and the machine IL probably looks something
like this (SP(0) is return address, SI(4) is argument):

f: (limit check)
   (returning branches on arg)
   RI(0) = Int_add(SI(4), 15)
   SI(4) = RI(0)
   GOTO f

and the x86-simplifier turns this into

f: (limit check)
   (returning branches on arg)
   addl 15, SI(4)
   jmp f

With the check and the handler, we get

f: (limit check)
   (returning branches on arg)
   RI(0) = Int_addCheck(SI(4), 15)
   SI(4) = RI(0)
   GOTO f

Overflow: branch on value of SI(4)
evenBranch: SI(4) = 0
            GOTO f
oddBranch: SI(4) = 1
           GOTO f

And everything works out without having to worry about the destination of
the add check.  But nothing in the machine IL prevents it from emitting
the equivalent and simpler version:

f: (limit check)
   (returning branches on arg)
   SI(4) = Int_addCheck(SI(4), 15)
   GOTO f

Overflow: as above

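given our intuitive understanding of the semantics of the addCheck.  But
here we do need to worry about the destination of the add: if the add
writes SI(4) before the overflow is detected, the handler sees the wrapped
result, and since adding 15 flips parity even when the result wraps, the
handler takes the wrong branch.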
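
To make the difference concrete, here is a rough Python model of the two
sequences.  It is only a sketch: it assumes 32-bit two's-complement ints
and assumes that the in-place add writes its destination before the
overflow is noticed.  The function names are made up for the illustration
and are not part of the code generator.

```python
# Hypothetical model of the two machine-IL sequences; all names are illustrative.
INT_MIN, INT_MAX = -2**31, 2**31 - 1   # assumes 32-bit two's-complement ints


def wrap32(n):
    """Reduce n to the 32-bit two's-complement range, as a hardware add would."""
    return (n - INT_MIN) % 2**32 + INT_MIN


def handler(si4):
    """The Overflow block: branch on the parity of whatever is in SI(4)."""
    return 0 if si4 % 2 == 0 else 1


def step_via_temp(si4):
    """RI(0) = Int_addCheck(SI(4), 15); SI(4) = RI(0).

    Returns the value held in SI(4) when control reaches GOTO f.
    """
    ri0 = wrap32(si4 + 15)        # the add writes only the temporary
    if ri0 != si4 + 15:           # overflow detected: SI(4) is still the old x
        return handler(si4)
    return ri0


def step_in_place(si4):
    """SI(4) = Int_addCheck(SI(4), 15), assuming the write precedes the check.

    Returns the value held in SI(4) when control reaches GOTO f.
    """
    old = si4
    si4 = wrap32(si4 + 15)        # destination is clobbered first
    if si4 != old + 15:           # overflow detected too late
        return handler(si4)       # handler sees the wrapped value, not x
    return si4
```

With x = 2147483640 (even), step_via_temp gives 0 as the source program
requires, while step_in_place gives 1, because the handler inspects the
wrapped (odd) value.  Away from overflow the two agree.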

This also relates back to what optimizations can occur with checks.  For
example, the optimization done at the pseudo-assembly level which turns

RI(0) = Int_add(SI(4), 15)
SI(4) = RI(0)

=>

movl SI(4), RI(0)
addl 15, RI(0)
movl RI(0), SI(4)

=>

addl 15, SI(4)

is a fairly important one.  So, I was trying to decide if there are cases
where I could pull it off even with the check.  There sort of are: for
example, replace SI(4) with RI(4) above and there's no issue with the
check because RI(4) isn't live in the handler.  But, I've put enough
effort into the register allocator that it's fairly good about just doing
register renaming for moves with dead pseudo-regs as the source.  So,
looking at the assembly, you shouldn't be able to distinguish between the
optimization happening in the simplifier and in the allocator.  (That's
why all of the temps in the overflow checks go away.)  With SI(4), we have
all the issues discussed above.
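
One way to phrase the condition is as a guard on the peephole rewrite.  The
following Python sketch is not the simplifier's actual code; the tuple
encoding of instructions and the live_in_handler set are assumptions made
for the example, and it takes for granted that the temporary is dead after
the move.

```python
def fold_add_move(instrs, live_in_handler):
    """Fold `t = op(d, k); d = t` into `d = op(d, k)` where it looks safe.

    instrs: list of tuples, either ("add" | "addCheck", dst, src, const)
            or ("move", dst, src).  This encoding is hypothetical.
    live_in_handler: set of operand names the overflow handler reads.
    Assumes the temporary `t` is dead after the move (not checked here).
    """
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (
            nxt is not None
            and cur[0] in ("add", "addCheck")
            and nxt[0] == "move"
            and nxt[2] == cur[1]      # the move reads the add's temporary
            and nxt[1] == cur[2]      # and writes back to the add's source
        ):
            dst = nxt[1]
            checked = cur[0] == "addCheck"
            # A checked add may only update dst in place if the handler
            # never looks at dst; an unchecked add has no handler to worry about.
            if not (checked and dst in live_in_handler):
                out.append((cur[0], dst, dst, cur[3]))
                i += 2
                continue
        out.append(cur)
        i += 1
    return out
```

Under this guard the unchecked add on SI(4) and the checked add on RI(4)
both collapse to a single in-place instruction, while the checked add on
SI(4) keeps its temporary because the handler reads SI(4).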