cps & contification

Stephen Weeks MLton@sourcelight.com
Wed, 17 Jan 2001 11:08:36 -0800 (PST)


> Changing C to the following restores safety:
> 
>  	C(A)(g) =
>  	     if g = fm
>  	       then Unknown
>  	     else if exists j such that forall (f, g, j') in N. j = j'
>  		                        and forall (f, g) in T. A(f) = j
>  	            then j
>  	          else Unknown

Agreed.  Thanks for the bug fixes, and the proof, which (correct me if I'm
wrong) showed that any fixed point of C is safe (i.e. it did not depend on least 
fixed point).

> > Now define a second analysis, A, based on A'.
...
> The problem seems to be that we really wanted an edge (Root, f).  I don't
> quite understand the whole definition of Edge
...
> It seems to me that we want edges (Root, f) for everything that we know
> must be Unknown.  Would the following definition make sense:
> 
> Edge = {(f,g) | (f,g) in T and A'(g) = Unknown}
>        U {(Root, f) | A'(f) = Unknown and exists (g, f, j) in N}
>        U {(Root, fm)}
> 
> i.e., roll rule 2 into the definition of G.

I agree with your points.  The intention was certainly that the graph be
connected with all nodes reachable from root.  The edges you add make sense.
The reason for the edges "{(Root, f) | A'(f) <> Unknown}" is to make the graph
connected when A' has already figured out stuff.  For example, consider the
program
	(	fm, 
		{(fm, f, K1), (fm, g, K2)},
		{(f, h), (g, h)})
For this program, 
	A'(fm) = Unknown
	A'(f) = K1
	A'(g) = K2
	A'(h) = Unknown

So we want the set of edges to be
	{(f, h), (g, h)}
	U {(Root, f), (Root, g)}
	U {}
	U {(Root, fm)}
So the immediate dominator of h is root, and we conclude A(h) = Unknown.

Here is my new proposal for the analysis, including your bugfixes and proof.  I
also cleaned up the definition of A and proof of its safety.

--------------------------------------------------------------------------------

Abstractly, a program consists of a main function and a set of tail and nontail
calls.  A tail call consists of a caller and a callee.  A nontail call consists
of a caller, callee, and jump.

	j in	Jump
	f in	Fun
		TailCall = Fun x Fun
		NonTailCall = Fun x Fun x Jump
	p in	Program = Fun x P(TailCall) x P (NonTailCall)

An analysis is a map from functions to the continuation that they return to.

		RCont = {Uncalled, Unknown} U Jump U Fun
	A in	Analysis = Fun -> RCont

Here is the intended meaning of A.

	A(f)		meaning
	--------	------------------------------------------------
	Uncalled	f is never called during evaluation of the program
	j in Jump	f always returns to continuation j
	g in Fun	f always returns to function g
	Unknown		f returns to many places

An analysis A is safe for program p = (fm, T, N) if 
	*   A(fm) = Unknown
	**  For all nontail calls (f, g, j) in N,
		A(f) = Uncalled 
		or A(g) in {j, Unknown}
	*** For all tail calls (f, g) in T,
		A(f) = Uncalled
		or A(g) in {f, A(f), Unknown}

Define the function C: Analysis -> Analysis by
	C(A)(g) =
	     if g = fm
	       then Unknown
	     else if exists j such that forall (f, g, j') in N. j = j'
		                        and forall (f, g) in T. A(f) = j
	            then j
	          else Unknown

Claim: if A = C(A) then A is safe.
Proof:
*  A(fm) = C(A)(fm) 
          = if fm = fm     
              then Unknown
            else ...
          = Unknown

** Let (F, G, J) be an arbitrary element of N
   Suppose A(F) = Uncalled.
    Then done.
   Suppose A(F) <> Uncalled.
    Then A(G) = C(A)(G)
               = if G = fm
                   then Unknown
                 else if exists j such that forall (f, g, j') in N. j = j'
                                            and forall (f, g) in T. A(f) = j
                        then j
                      else Unknown
               in {J, Unknown} 
               subset {J, Unknown}

*** Let (F, G) be an arbitrary element of T
    Suppose A(F) = Uncalled.  
     Then done.
    Suppose A(F) <> Uncalled.
     Then A(G) = C(A)(G)
                = if G = fm
                    then Unknown
                  else if exists j such that forall (f, g, j') in N. j = j'
                                             and forall (f, g) in T. A(f) = j
                         then j
                       else Unknown
               in {A(F), Unknown} 
               subset {F, A(F), Unknown}

Turn RCont into a lattice with the following order
	Uncalled <= x
	x <= x
	x <= Unknown
where x is an arbitrary element of Fun U Jump.

Extend <= pointwise to make A into a lattice.

Fact: C is monotone.

Let A' = lfp(C).  

Now define a directed graph G = (Node, Edge) where 
	Node = Fun U {Root}
	Edge = {(f, g) | (f, g) in T and A'(g) = Unknown}
		U {(Root, f) | A'(f) <> Unknown}
                U {(Root, f) | A'(f) = Unknown and exists (g, f, j) in N}
		U {(Root, fm)}

Claim: For all f in Fun, there is a path in G from Root to f.

Let D be the dominator tree of G rooted at Root.
For f in Fun, let parent(f) be the parent of f in D.

Define an analysis, A, based on A' and D as follows:
A(f) = 
	if A'(f) <> Unknown
		then A'(f)
		else if parent(f) = Root
			then Unknown
			else the ancestor g of f in D such that parent(g) = Root

Claim: A is safe.
Proof:
First note that since A' is a fixed-point of C, A' is safe.  Now, take each
clause of safety in turn.
* A(fm) = if A'(fm) <> Unknown
		then A'(fm)
		else if parent(fm) = Root
			then Unknown
			else the ancestor g of f in D such that parent(g) = Root
	= if parent(fm) = Root
		then Unknown
		else the ancestor g of f in D such that parent(g) = Root
	= Unknown
** Let (f, g, j) be an arbitrary element of N
	if A'(g) = Unknown
		then A(g) = Unknown because (Root, g) in Edge
			hence A(g) in {j, Unknown}
		else A(g) = A'(g)
			if A'(f) = Uncalled
				then A(f) = Uncalled
				else A'(g) = j  (because A' is safe)
					hence A(g) = A'(g)
					hence A(g) in {j, Unknown}
*** Let (f, g) be an arbitrary element of T
	if A'(g) = Unknown
		then (f, g) in Edge
			if parent(g) = Root
				then A(g) = Unknown in {f, A(f), Unknown}
			else A(g) = h, the ancestor of g in D with 
				parent(h) = Root.  Since (f, g) in Edge,
				h dominates f.
				if h = f
					then A(g) = h = f in {f, A(f), Unknown}
					else A'(f) = Unknown  (since h dom f)
						hence, A(f) = h
						hence A(g) = h = A(f) 
							in {f, A(f), Unknown}
	else A'(g) <> Unknown
		hence A'(f) <> Unknown
			and A'(f) = A'(g)  (because A' is safe)
		hence A(g) = A'(g) = A'(f) = A(f)
		hence A(g) in {f, A(f), Unknown}