cps & contification

Matthew Fluet Matthew Fluet <fluet@CS.Cornell.EDU>
Mon, 15 Jan 2001 17:47:59 -0500 (EST)


> Matthew, I finally sat down and carefully read your analysis.  Here's a summary
> of your previous emails, as I understand them.
> 
>   Label = Func U Jump
>   RCont = Label U {bot}
>   REnv = Label -> RCont
>   
>   For every l in Label, there is a partial order <=l on RCont defined by
>   	for all l' in Label, bot <=l l' 
>   	for all l' in Label, if l <> l', then l' <=l l
>   Under <=l, RCont is a flat lattice (with the obvious join).

	for all l' in Label, then l' <=l l  (partial order is reflexive)

Also, I wouldn't call RCont under <=l a flat lattice, because it does have
a top element: l.  But that may just be differences in background
terminology. 

>   Program analysis:
>   Anal[<fun fi (<xij>*) = ei>*] = fix (fn Gamma => V_i R[ei] Gamma'(fi))
>    where Gamma' = Gamma V {ft |--> ft}
>      and ft is the main function
>   (fixed point necessary because top level CPS functions are mutually recursive)
>   (initializing ft to "return to ft" marks the main function and ensures
>   that bot really corresponds to no calls)

    Anal[<fun fi (<xij>*) = ei>*] 
      = fix (fn Gamma => {ft |--> ft} V V_i R[ei] Gamma(fi))

This is closer to what is currently being done in the code, and should be
equivalent.

> Here's what I don't understand.  It seems to me that the function you're
> attempting to take a fixed-point of is not monotone.  Lets call that function F.
> 	F = fn Gamma => V_i R[ei] Gamma'(fi)
...
> That is, F is not monotone.

And you are correct.  I wish I had recognized this earlier, because it
explains a why I kept having problems getting the fixed point calculation
to work and ended up with the complicated version that you noted.

It still seems to me that a solution to
  Gamma = {ft |--> ft} V V_i R[ei] Gamma(fi)
would give the right contification information, but if F is not monotone,
then I don't know of any correct and efficient method of calculating it.
Note: a solution to this does exist; Gamma = {l |--> l} will always be a
solution.  Since there is an ordering on Gamma, then a least fixed point
does exist as well, although not necessarily unique.

What is currently in place will compute a fixed-point, but I guess not
necessarily the least fixed-point; although the evidence seems to imply
that it "small" enough to capture all the cases caught by the other
analyses.  But I've spent enought time tweaking with it to know that there
are probably some cases where it might miss something that the call-based
analysis would catch.

> The essential problem is that there are different orders on different components 
> of REnv but that the tail call rule propagates from one component to another.
> Hence, moving up in one component causes movement "sideways" in another.

True.  I've been puzzling why the existing continuation based analysis
wasn't suffering from the same problem, because the fixed-point function
there is very similar:

Label = Func
RCont = Jump U {bot} U {top} ; squashed powerset domain with obvious join
REnv = Label -> RCont        ; continuous function domain with pointwise join

Anal[<fun fi (<xij>*) = ei>*]
  = fix (fn Gamma => {ft |--> top} V V_i R[ei] Gamma(fi)) 

R[k (f <xi>*)]p = {f |--> k}
R[f <xi>*]p     = {f |--> p}

It finally hit me: in the new analysis, REnv is not a continuous function
domain, so Gamma(fi) is not a continuous operation; that clearly keeps F
from being a continuous function.  But in the old analysis, REnv was a
continuous function domain, so Gamma(fi) is a continuous operation. 

> I am not saying I think the analysis is wrong at all.  I am just saying I don't
> yet understand it.  I could probably get an operational understanding from
> reading the code carefully and then reconstruct a more POPLesque description,
> but I am hoping you can provide the latter.

I'll think about it some more; on the one hand, for any moderate sized
set of functions, it's obvious what answer you want for Gamma.  So it
doesn't seem like it should be hard to describe the algorithm for finding
it.  

In any event, it might not matter at all. Here are the results of the
benchmarks, which are somewhat disappointing:

                        none    both    new     new/both
barnes-hut              9.98    7.56    8.55    1.13
checksum                15.33   7.57    7.56    1.00
count-graphs            14.45   10.58   10.68   1.01
fft                     34.12   28.76   28.68   1.00
fib                     6.24    6.63    6.64    1.00
knuth-bendix            11.45   12.33   12.33   1.00
lexgen                  28.09   21.59   21.62   1.00
life                    46.20   36.31   35.95   0.99
logic                   35.83   35.31   35.31   1.00
mandelbrot              57.95   13.30   13.24   1.00
matrix-multiply         14.95   7.89    7.87    1.00
merge                   81.85   75.82   75.84   1.00
mlyacc                  20.25   13.42   13.25   0.99
mpuz                    64.94   26.36   26.35   1.00
nucleic                 12.92   11.24   11.26   1.00
ratio-regions           21.32   13.38   13.53   1.01
simple                  12.29   10.54   10.54   1.00
smith-normal-form       1.64    1.61    1.59    0.99
tak                     16.73   16.13   16.08   1.00
tensor                  53.49   10.77   11.66   1.08
tsp                     24.03   18.29   18.29   1.00
vliw                    16.19   10.94   11.02   1.01
wc                      5.48    3.85    3.87    1.01
zern                    30.81   12.38   10.38   0.84

I don't have any explaination for zern; there is only one new function
contifiable, and it's the same function that is contifiable in concat_1 in
practically every program.