CVS Commit

Matthew Fluet fluet@CS.Cornell.EDU
Mon, 17 Sep 2001 10:09:58 -0400 (EDT)


http://www.cs.cornell.edu/People/fluet/MLton/directed-graph.tgz

Added an implementation of the G. Ramalingam loop forest construction, as
described in "On Loops, Dominators, and Dominance Frontiers" (originally
in PLDI00; revised technical report at
http://www.research.ibm.com/people/r/rama/Papers/ibmtr21513.revised.ps). 

******************************************************************************

(This doesn't need to go into the CVS commit log.  Just some context.)

This was motivated by the x86-live-transfers module, which failed to carry
the right values around in registers.  For example, from tensor.sml
(ignoring overflow)

loopk_3:
	testl SI(256),SI(256)
	jz L_377
L_373:
	decl SI(256)
	incl SI(252)
	incl SI(248)
	jmp loopk_3
...
L_377:
	decl SI(244)
...

Thinking about what to carry in registers at loopk_3, the previous
heuristic was to see which values were used nearest in the future.  This
meant that SI(244) looked "better" than SI(248), and I might end up
carryingSI(244) in a register, but not SI(248).  With the loop forest, I
give SI(248) (and the other values live inside the loopk_3 loop) a higher
weight than those that are live outside the loop, and it now "does the
right thing".  This got a nice speed up on tensor.sml (although, it's
still lagging a ways behind the C codegen). 

I'm going to do a big commit of new x86 codegen stuff tomorrow after
running a round of benchmarks tonight.  (It should have run last night,
but I had bugs in the benchmark program.)

Hopefully, the loop forest stuff will also be applicable to the SSA IL and
loop optimizations.