cps & contification

Stephen Weeks MLton@sourcelight.com
Tue, 16 Jan 2001 17:12:03 -0800 (PST)


> I've almost got a dominator approach worked out.  I should be able to send mail
> with it later today.

Here's my next attempt.  It includes a summary of the previous mail, a
definition of the continuation based analysis, and a definition of a dominator
analysis that uses the results of the continuation based analysis.  I believe
that it labels as many functions as possible as contifiable.


Abstractly, a program consists of a main function and a set of tail and nontail
calls.  A tail call consists of a caller and a callee.  A nontail call consists
of a caller, callee, and jump.

	j in	Jump
	f in	Fun
		TailCall = Fun x Fun
		NonTailCall = Fun x Fun x Jump
	p in	Program = Fun x P(TailCall) x P (NonTailCall)

An analysis is a map from functions to the continuation that they return to.

		RCont = {Uncalled, Unknown} U Jump U Fun
	A in	Analysis = Fun -> RCont

Here is the intended meaning of A.

	A(f)		meaning
	--------	------------------------------------------------
	Uncalled	f is never called during evaluation of the program
	j in Jump	f always returns to continuation j
	g in Fun	f always returns to function g
	Unknown		f returns to many places

An analysis A is safe for program p = (fm, T, N) if 
	*   A(fm) = Unknown
	**  For all nontail calls (f, g, j) in N,
		A(f) = Uncalled 
		or A(g) in {j, Unknown}
	*** For all tail calls (f, g) in T,
		A(f) = Uncalled
		or A(g) in {f, A(f), Unknown}

Turn RCont into a lattice with the following order
	Uncalled <= x
	x <= x
	x <= Unknown
where x is an arbitrary element of Fun U Jump.

Extend <= pointwise to make A into a lattice.

Define the function C: Analysis -> Analysis by
	C(A)(g) =
	     if g = fm
	       then fm
	     else if exists j such that forall (f, g, j') in N. j = j'
		                        and forall (f, g) in T. A(f) = j
	            then j
	          else g

Fact: C is monotone.

Let A' = lfp(C).  A' is the continuation based analysis.

Fact: A' is safe.

Now define a second analysis, A, based on A'.

1. if A'(f) <> Unknown
   then let A(f) = A'(f)
2. if A'(f) = Unknown and there exists a (g, f, j) in N
   then let A(f) = Unknown
3. The only functions for which we have not defined A are only called in tail
   position.  Define a directed graph G = (Node, Edge) where
	Node = Fun U {Root}
	Edge = {(f, g) | (f, g) in T and A'(g) = Unknown}
		U {(Root, f) | A'(f) <> Unknown}
		U {(Root, fm)}
   Consider the dominator tree of G rooted at Root.
   For any f in Fun for which we have not yet defined A(f), define A(f) to be
   Unknown if the immediate dominator of f is Root.  Otherwise, define A(f) to be
   the ancestor of f whose immediate dominator is Root.

Claim: A is safe.
Proof:
* A'(fm) = Unknown and fm's immediate dominator is Root,
	hence A(fm) = Unknown.
** Let (f, g, j) be an arbitrary element of N
	if A'(g) = Unknown
		then A(g) = Unknown (by rule 2 in the def of A)
			hence A(g) in {j, Unknown}
		else A(g) = A'(g)
			and if A'(f) = Uncalled
				then A(f) = Uncalled
				else A'(g) = j  (because A' is safe)
					hence A(g) = A'(g)
					hence A(g) in {j, Unknown}
*** Let (f, g) be an arbitrary element of T
	if A'(g) = Unknown
		then (f, g) in Edge
			if A(g) = Unknown
				then A(g) in {f, A(f), Unknown}
			else A(g) = h, the ancestor of g whose immediate
				dominator is Root.  Since (f, g) in Edge, either
				h = f or (h <> f and h dominates f).
				if h = f
					then A(g) in {f, A(f), Unknown}
					else h <> f and h dominates f
						hence A'(f) = Unknown
						hence A(f) = h
						hence A(g) in {f, A(f), Unknown}
	else A'(g) <> Unknown
		hence A'(f) <> Unknown
			and A'(f) = A'(g)  (because A' is safe)
		hence A(g) = A'(g) = A'(f) = A(f)
		hence A(g) in {f, A(f), Unknown}