common-subexpression elimination (cse)

Stephen Weeks MLton@sourcelight.com
Tue, 24 Jul 2001 10:58:00 -0700


Yesterday, I implemented cse on CPS.  In addition to getting the usual sorts of
things like
	(i + 1) + (i + 1) 
	--> 
 	let val i' = i + 1 in i' + i' end
it also gets things like
	val a = Array_array n
	val b = Array_length a
	-->
	val a = Array_array n
	val b = n

Here are the benchmark results.  In them, "MLton" is with cse, "stable MLton" is
without.  In summary, no major improvements or problems, except on count-graphs,
which showed a bug (more on that in the next message).  It pretty much
universally shrunk code size.  Anyways, although it didn't help alot, I think it
is important in enabling other optimizations, like redundant test removal,
overflow detection elimination, and bounds check elimination. 

run time ratio
benchmark         stable MLton
barnes-hut                 1.0
checksum                   1.0
count-graphs              ~1.0
DLXSimulator               1.0
fft                        1.0
fib                        1.0
hamlet                     1.0
knuth-bendix               1.0
lexgen                     1.0
life                       1.1
logic                      1.0
mandelbrot                 0.9
matrix-multiply            1.0
md5                        1.0
merge                      1.0
mlyacc                     1.0
mpuz                       1.0
nucleic                    1.0
peek                       1.0
psdes-random               1.0
ratio-regions              1.0
ray                        1.0
raytrace                   1.0
simple                     1.0
smith-normal-form          1.0
tak                        1.0
tensor                     0.8
tsp                        1.0
vector-concat              1.2
vector-rev                 1.1
vliw                       1.0
wc-input1                  1.0
wc-scanStream              1.0
zebra                      1.0
zern                       1.0

run time
benchmark         MLton stable MLton
barnes-hut          5.2          5.2
checksum            4.4          4.6
count-graphs          *          6.0
DLXSimulator       12.5         12.9
fft                 9.8          9.9
fib                 4.4          4.4
hamlet              9.5          9.8
knuth-bendix        8.6          8.4
lexgen             13.7         13.8
life               10.6         11.2
logic              26.8         26.5
mandelbrot          9.7          9.0
matrix-multiply     6.2          6.2
md5                 4.9          5.2
merge              38.4         38.3
mlyacc             10.8         10.9
mpuz                7.0          6.9
nucleic             8.5          8.5
peek                4.7          4.9
psdes-random        5.7          5.8
ratio-regions       9.4          9.3
ray                 5.9          6.1
raytrace            6.5          6.4
simple              7.0          7.3
smith-normal-form   1.1          1.1
tak                11.4         11.4
tensor              9.7          8.1
tsp                12.4         12.7
vector-concat       7.5          8.7
vector-rev          3.4          3.7
vliw                7.1          7.0
wc-input1           3.0          3.0
wc-scanStream       5.4          5.2
zebra               3.2          3.2
zern               45.2         45.6

compile time
benchmark         MLton stable MLton
barnes-hut          2.7          2.7
checksum            0.8          0.8
count-graphs        1.9          1.9
DLXSimulator        4.5          4.4
fft                 1.4          1.4
fib                 0.7          0.7
hamlet             52.5         51.1
knuth-bendix        2.5          2.4
lexgen              5.8          5.5
life                1.4          1.4
logic               7.7          7.5
mandelbrot          0.8          0.8
matrix-multiply     0.8          0.8
md5                 2.8          2.3
merge               0.8          0.8
mlyacc             20.0         19.4
mpuz                1.0          1.0
nucleic             4.2          4.2
peek                1.2          1.1
psdes-random        0.8          0.8
ratio-regions       3.1          3.1
ray                 3.7          3.5
raytrace           10.5         10.1
simple              7.2          7.4
smith-normal-form   8.1          7.8
tak                 0.7          0.7
tensor              3.0          2.9
tsp                 1.9          1.8
vector-concat       0.8          0.8
vector-rev          0.8          0.8
vliw               12.9         12.5
wc-input1           1.7          1.6
wc-scanStream       1.8          1.7
zebra               5.1          4.9
zern                1.1          1.1

size
benchmark           MLton stable MLton
barnes-hut         63,904       64,000
checksum           23,692       23,708
count-graphs       43,468       43,476
DLXSimulator       93,252       93,844
fft                32,144       32,272
fib                23,492       23,508
hamlet            994,647      995,127
knuth-bendix       62,253       62,349
lexgen            130,084      130,220
life               37,100       37,548
logic             149,708      149,564
mandelbrot         23,580       23,580
matrix-multiply    24,028       24,044
md5                36,733       36,765
merge              24,644       24,644
mlyacc            427,916      428,324
mpuz               28,724       28,756
nucleic            60,724       60,724
peek               30,437       30,469
psdes-random       24,524       24,540
ratio-regions      58,724       60,324
ray                72,991       73,063
raytrace          181,116      181,876
simple            156,648      170,784
smith-normal-form 141,028      141,620
tak                23,524       23,540
tensor             66,307       66,691
tsp                39,357       39,485
vector-concat      24,196       24,212
vector-rev         24,044       24,140
vliw              272,064      272,824
wc-input1          40,965       40,933
wc-scanStream      43,613       43,565
zebra             111,789      111,789
zern               27,591       27,575