limit check insertion

Mon, 10 Dec 2001 15:44:26 -0500 (EST)

A random thought that occured to me when I added in the Runtime transfer
was that we might be able to use that for limit check insertion and
optimization.  What I had in mind was adding a new Prim.t for
GC_limitCheck; after deciding on representations, we could insert blocks
with Runtime {prim = GC_limitCheck, ...} transfers before every block;
now, optimize to your heart's content, check the correctness of the final
program, and translate into machine.

Disadvantages: initially, lots of little blocks with nothing but
                limitChecks
               it seems somewhat counter-intuitive to think about
                doing a limit check at the end of a block, rather
                than at the beginning

Advantages: these blocks provide a natural place for computing array
             bytes needed.  This would have the effect of splitting 
             a block that had both tuple allocation and array allocations
             into multiple blocks
            doing limit check insertion and optimization all in SSA
             feels like it could be a win; we know how to write good
             optimizations there and lots of infrastructure

The other difficulty might be in efficiently handling array and intInf's.
In particular, we need to reason about expressions, rather than just
integers.  For example, if we start with

L_S (n:int)
 a1 = Array_array(int) n
 a2 = Array_array(int) n
 t = (a1,a2)
 t

(I know, unrealistic, because all of the array initialization stuff would
probably happen between there, but for the sake of argument.)

Initially, we translate it to:

L_S (n:int)
 a1_size = 8 + 4 * n
 L_S1 (GC_limitCheck (a1_size))
L_S1 ()
 a1 = Array_array(int) n
 a2_size = 8 + 4 * n
 L_S2 (GC_limitCheck (a2_size))
L_S2 ()
 a2 = Array_array(int) n
 L_S3 (GC_limitCheck (12))
L_S3
 t = (a1,a2)
 t

It would be nice to reduce this to

L_S (n:int)
 size = 2 * (8 + 4 * n) + 12
 L_S1 (GC_limitCheck(size))
L_S1 ()
 a1 = Array_array(int) n
 a2 = Array_array(int) n
 t = (a1,a2)
 t

What makes this complicated is that size needs to be computed in SSA
(i.e., lots of little sub-computations), and then verifying that the
computed size is indeed sufficient for the allocations in L_S1.
Also, as Steve pointed out earlier, making sure that size doesn't overflow
on large arrays.