cps & contification

Wed, 17 Jan 2001 12:55:50 -0500 (EST)

> An analysis A is safe for program p = (fm, T, N) if 
> 	*   A(fm) = Unknown
> 	**  For all nontail calls (f, g, j) in N,
> 		A(f) = Uncalled 
> 		or A(g) in {j, Unknown}
> 	*** For all tail calls (f, g) in T,
> 		A(f) = Uncalled
> 		or A(g) in {f, A(f), Unknown}
> 
> Define the function C: Analysis -> Analysis by
> 	C(A)(g) =
> 	     if g = fm
> 	       then fm
> 	     else if exists j such that forall (f, g, j') in N. j = j'
> 		                        and forall (f, g) in T. A(f) = j
> 	            then j
> 	          else g
> 
> Fact: C is monotone.
> 
> Let A' = lfp(C).  A' is the continuation based analysis.
> 
> Fact: A' is safe.

I don't think so.  Consider:

fun fm () = ... K1(f()) ... K2(f()) ...
fun f  () = ... g () ...
fun g  () = ... g () ...

Then T = {(f,g), (g, g)}, N = {(fm,f,K1), (fm,f,K2)}

   fm        f         g
Uncalled  Uncalled  Uncalled
   fm        f      Uncalled
   fm        f         f

But this violates * (because A'(fm) = fm <> Unknown) 
and ** (because A'(fm) <> Uncalled and A'(f) = f notin {K1, Unknown}
intersect {K2, Unknown}).

Also, the continuation based analysis should fail to determine any
functions are contifiable in the above set.

Changing C to the following restores safety:

 	C(A)(g) =
 	     if g = fm
 	       then Unknown
 	     else if exists j such that forall (f, g, j') in N. j = j'
 		                        and forall (f, g) in T. A(f) = j
 	            then j
 	          else Unknown

Let A' = lfp(C).  

Claim: A' is safe.
Proof:
*  A'(fm) = (lfp(C))(fm) 
          = if fm = fm     
              then Unknown
            else ...
          = Unknown

** Let (F, G, J) be an arbitrary element of N
   Suppose A'(F) = Uncalled.
    Then done.
   Suppose A'(F) <> Uncalled.
    Then A'(G) = (lfp(C))(G)
               = if G = fm
                   then Unknown
                 else if exists j such that forall (f, g, j') in N. j = j'
                                            and forall (f, g) in T. A'(f) = j
                        then j
                      else Unknown
               in {J, Unknown} 
               subset {J, Unknown}

*** Let (F, G) be an arbitrary element of T
    Suppose A'(F) = Uncalled.  
     Then done.
    Suppose A'(F) <> Uncalled.
     Then A'(G) = (lfp(C))(G)
                = if G = fm
                    then Unknown
                  else if exists j such that forall (f, g, j') in N. j = j'
                                             and forall (f, g) in T. A'(f) = j
                         then j
                       else Unknown
               in {A'(F), Unknown} 
               subset {F, A'(F), Unknown}

Now for the set of functions above, we have

   fm        f         g
Uncalled  Uncalled  Uncalled
Unknown   Unknown   Uncalled
Unknown   Unknown   Unknown

which corresponds to what we would expect for the continuation based
analysis.

> Now define a second analysis, A, based on A'.
> 
> 1. if A'(f) <> Unknown
>    then let A(f) = A'(f)
> 2. if A'(f) = Unknown and there exists a (g, f, j) in N
>    then let A(f) = Unknown
> 3. The only functions for which we have not defined A are only called in tail
>    position.  Define a directed graph G = (Node, Edge) where
> 	Node = Fun U {Root}
> 	Edge = {(f, g) | (f, g) in T and A'(g) = Unknown}
> 		U {(Root, f) | A'(f) <> Unknown}
> 		U {(Root, fm)}
>    Consider the dominator tree of G rooted at Root.
>    For any f in Fun for which we have not yet defined A(f), define A(f) to be
>    Unknown if the immediate dominator of f is Root.  Otherwise, define A(f) to be
>    the ancestor of f whose immediate dominator is Root.

Is this always well-defined?  

For the original A' and the set of functions above,
we assign A(f) = f, A(g) = f, A(fm) = fm (by rule 1) and all
elements of Fun have been defined.  But, A violates safety, for the
same reasons as above.

With the revised A' and the set of functions above,
we make no assignments by rule 1.  We assign A(f) = Unknown by rule 2.  We
construct the graph G with 
Node = {fm, f, g} and 
Edge = {(f, g), (g, g)} 
       U {}
       U {(Root, fm)}
which isn't a connected graph, so some nodes don't have an ancestor whose
immediate dominator is Root.

The problem seems to be that we really wanted an edge (Root, f).  I don't
quite understand the whole definition of Edge
Edge = {(f,g) | (f,g) in T and A'(g) = Unknown} 
       // this builds the tail call graph,
       //  but leaves out nodes we already figured out by Rule 1
       U {(Root, f) | A'(f) <> Unknown} 
       // this is the part I don't understand
       U {(Root, fm)}
       // ensure that Root always dominates fm

It seems to me that we want edges (Root, f) for everything that we know
must be Unknown.  Would the following definition make sense:

Edge = {(f,g) | (f,g) in T and A'(g) = Unknown}
       U {(Root, f) | A'(f) = Unknown and exists (g, f, j) in N}
       U {(Root, fm)}

i.e., roll rule 2 into the definition of G.