limit check insertion

Matthew Fluet Matthew Fluet <fluet@CS.Cornell.EDU>
Mon, 22 Oct 2001 19:44:31 -0400 (EDT)


> I assume that the last defining case of V should have been
>     V (l :: p, n) = V (p, n - m) if at(l) = Alloc(m)
>                                  if at(l) = LimitCheck(m)
> I.e.,  you  have to walk through the path, not just take the first element of
> the path, and what you called `li' was just `l'.  (Is the  first  part  wrong
> because all allocations and checks are at the start of a basic block?)

How about:

     V ([], n) = n
     V (l :: p, n) = V (p, n - m)     if at(l) = Alloc(m)
                     V (p, max(n, m)) if at(l) = LimitCheck(m)

Somewhere you would like to have an optimality/minimality condition, which
would somehow say that if you can have at(l) = LimitCheck(0) and still
have a sound function, then the limit check at l is extraneous.

> Isn't  it  incorrect  (needlessly  resrictive)  to  allow paths to start from
> arbitrary positions?  I.e., if I have some code that does a limit  check  and
> then  goes to another block which does some allocation, then I am safe, but I
> would fail your requirements since the path which starts  at  the  allocation
> would have a V which was negative.

Steve's definition of sound is:

Definition [sound]:
A function (ls, at, next) is sound if for all paths p = [ls, ..., ln] 
V (p, 0) >= 0.  

so a function's soundness is defined in terms of paths from the start
label.  (Better notation: for all paths p = [l0, ..., ln] where l0 = ls.)



> Theorem: If A is correct for f, then f is limit check correct.

This is where I lose you.  What's the definition of limit check correct?

> The job of limit check insertion is to take a function that doesn't have any
> limit checks, produce an equivalent sound function that does have limit checks
> Hopefully it does this by producing a safe annotation that can be checked by  
> another pass.

So, the annotations are just a witness to the fact that the inserted
limit-checks produce a sound function.


Overall comments: The one aspect I don't like is the fact that going from
a function without limit checks to one with limit checks seems to require
inserting new labels and changing the next function.  You need some
additional requirement that the control-flow isn't significantly altered.

You might also have simply have alloc : Label -> Nat and
limitCheck: Label -> Nat and define
     V ([], n) = n
     V (l :: p, n) = V (p, max(n, limitCheck(l)) - alloc(l))
That is, every block does a limit check (possibly zero) and then some
allocation.  

To get a limit check from ls, alloc, next, you need to produce a
limitCheck that makes the function f = (ls, limitCheck, alloc, next)
sound.

To get an optimal limit check, you need to produce a limitCheck that makes
the function f = (...) sound, while minimizing the pair (nzLC, totLC)
lexicographically where
nzLC = sum_{l in Label} 1 if limitCheck(l) > 0, 0 if limitCheck(l) = 0
totLC = sum_{l in label} limitCheck(l)
That is, we want the minimum number of non-zero limit checks (one big
limit check is cheaper than lots of small limit checks) and don't make any
limit check too big.

Of course, this doesn't take care of hoisting limit checks into loops.  Or
the "thread sound" criteria that requires that every infinite path has an
infinite number of non-zero (or simply forced) limit checks.

Furthermore, it seems like we could always hoist limit checks to the point
where we violate some space safety.  Do we need some additional criteria?: 

Definition [sound]:
A function (ls, at, next) is sound if for all paths p = [ls, ..., ln] 
V (p, 0) >= 0 and if next(ln) = {}, then V (p, 0) <= C for some fixed
constant C.

That is, when we're about to leave a function (i.e., do some transfer to a
place where we know nothing about their allocation), then we had better
not asked for too much extra memory.  The case I'm thinking about is a
conditional branch, down one path the program terminates without any
additional allocation, down the other the program requires a 1GB
allocation.  We don't want an out-of-memory error if the program was going
to terminate normally.

I also think that the next function needs to be carefully constructed.
Note, I don't think that it is the "normal" control-flow function.  If a
function does a limitCheck(m), then does a non-tail call, should the
continuation be allowed to think that m bytes are available?  I wouldn't
think so.  So, either there are a set of start labels (corresponding to
the "real" start label and then all continuation and handler labels) or
the start label is a dummy label with next(ls) being the real start label
and all continuation and handler labels.