SSA simplify passes

Matthew Fluet fluet@CS.Cornell.EDU
Mon, 7 Jan 2002 19:58:31 -0500 (EST)


Here are the rest of the benchmarks:

MLton0 -- mlton -new-flatten false -tuple-recon-elim 0
MLton1 -- mlton -new-flatten false -tuple-recon-elim 1
MLton2 -- mlton -new-flatten false -tuple-recon-elim 2
MLton3 -- mlton -new-flatten true -tuple-recon-elim 0
MLton4 -- mlton -new-flatten true -tuple-recon-elim 1
MLton5 -- mlton -new-flatten true -tuple-recon-elim 2

compile time
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5
barnes-hut          2.56   2.61   2.60   2.55   2.57   2.59
checksum            0.59   0.60   0.59   0.57   0.58   0.60
count-graphs        1.72   1.74   1.75   1.78   1.73   1.75
DLXSimulator        4.54   4.57   4.51   4.63   4.59   4.60
fft                 1.30   1.29   1.29   1.28   1.30   1.30
fib                 0.55   0.54   0.57   0.54   0.54   0.56
hamlet             56.31  55.36  55.67  53.72  53.70  53.74
imp-for             0.59   0.60   0.58   0.60   0.58   0.60
knuth-bendix        2.22   2.26   2.25   2.25   2.24   2.24
lexgen              5.72   5.72   5.73   5.75   5.74   5.77
life                1.34   1.32   1.34   1.32   1.32   1.32
logic               3.10   3.11   3.13   3.06   3.08   3.09
mandelbrot          0.58   0.61   0.61   0.59   0.57   0.60
matrix-multiply     0.68   0.67   0.69   0.69   0.69   0.65
md5                 1.21   1.26   1.26   1.24   1.24   1.26
merge               0.59   0.60   0.63   0.62   0.64   0.60
mlyacc             22.70  23.29  22.48  24.40  22.29  22.31
mpuz                0.79   0.80   0.80   0.80   0.84   0.83
nucleic             2.75   2.76   2.80   2.76   2.74   2.71
peek                0.99   1.00   0.99   0.98   1.01   1.02
psdes-random        0.62   0.63   0.61   0.63   0.62   0.63
ratio-regions       2.55   2.51   2.53   2.57   2.56   2.55
ray                 3.60   3.60   3.51   3.56   3.57   3.63
raytrace            9.09   9.07   9.07  10.55  10.54  10.51
simple              7.34   7.29   7.31   7.40   7.39   7.38
smith-normal-form   8.15   8.15   8.17   8.04   8.02   8.04
tailfib             0.55   0.57   0.55   0.55   0.56   0.53
tak                 0.55   0.56   0.55   0.58   0.58   0.56
tensor              3.10   3.11   3.08   3.08   3.10   3.08
tsp                 1.54   1.52   1.51   1.56   1.53   1.54
tyan                3.82   3.79   3.78   3.87   3.73   3.73
vector-concat       0.62   0.63   0.63   0.65   0.61   0.62
vector-rev          0.57   0.61   0.65   0.58   0.58   0.58
vliw               12.77  12.80  12.80  12.83  12.86  12.61
wc-input1           1.63   1.63   1.66   1.62   1.62   1.62
wc-scanStream       1.66   1.66   1.67   1.67   1.67   1.69
zebra               5.96   5.96   5.96   5.49   5.52   5.96
zern                1.07   1.06   1.06   1.09   1.08   1.09

run time
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5
barnes-hut          4.32   4.32   4.33   4.30   4.30   4.30
checksum            3.09   3.09   3.09   3.09   3.09   3.09
count-graphs        4.95   5.07   4.92   4.95   5.03   4.86
DLXSimulator       15.75  15.73  15.71  15.53  15.53  15.55
fft                 9.44   9.44   9.41   9.44   9.42   9.42
fib                 3.41   3.41   3.41   3.41   3.41   3.41
hamlet              8.20   8.16   8.29   7.22   7.22   7.36
imp-for             8.23   8.23   8.23   8.23   8.23   8.23
knuth-bendix        6.69   6.69   6.53   6.48   6.48   6.48
lexgen             10.79  10.79  10.60  10.81  10.80  10.81
life                6.83   6.87   6.53   6.82   6.84   6.88
logic              20.74  20.74  20.78  18.00  18.03  18.00
mandelbrot          6.20   6.20   6.20   6.20   6.20   6.20
matrix-multiply     3.92   3.94   3.92   3.89   3.94   3.94
md5                 2.03   2.03   2.03   2.03   2.03   2.03
merge              49.66  49.77  49.73  49.76  49.69  49.71
mlyacc              9.55   9.40  10.38   9.39   9.38   9.41
mpuz                4.57   4.57   4.57   4.54   4.54   4.54
nucleic             7.70   7.72   7.65   8.29   8.28   8.32
peek                3.26   3.25   3.25   3.25   3.25   3.25
psdes-random        3.36   3.36   3.36   3.36   3.36   3.36
ratio-regions       8.81   8.80   8.80   8.81   8.80   8.81
ray                 3.84   3.84   3.84   3.74   3.74   3.73
raytrace            4.77   4.80   4.80   4.97   4.95   4.94
simple              6.01   6.03   6.64   6.07   6.04   6.07
smith-normal-form   0.94   0.95   0.94   0.95   0.94   0.94
tailfib            15.47  15.48  15.48  15.47  15.48  15.47
tak                 8.77   8.77   8.77   8.87   8.87   8.87
tensor              5.82   5.82   5.82   5.82   5.82   5.82
tsp                 8.76   8.76   8.76   8.77   8.76   8.77
tyan               19.93  17.45  13.30  19.86  17.32  17.34
vector-concat       5.87   5.99   5.86   5.83   5.76   5.77
vector-rev          4.10   4.10   4.13   4.12   4.10   4.11
vliw                6.32   6.30   6.36   6.20   6.17   6.18
wc-input1           1.74   1.73   1.73   1.74   1.74   1.74
wc-scanStream       3.48   3.48   3.48   3.47   3.47   3.47
zebra               2.33   2.36   2.35   2.37   2.35   2.35
zern               35.22  35.31  35.34  35.40  35.25  35.46

run time ratio (relative to MLton0)
benchmark         MLton1 MLton2 MLton3 MLton4 MLton5
barnes-hut          1.00   1.00   0.99   0.99   1.00
checksum            1.00   1.00   1.00   1.00   1.00
count-graphs        1.02   0.99   1.00   1.02   0.98
DLXSimulator        1.00   1.00   0.99   0.99   0.99
fft                 1.00   1.00   1.00   1.00   1.00
fib                 1.00   1.00   1.00   1.00   1.00
hamlet              1.00   1.01   0.88   0.88   0.90
imp-for             1.00   1.00   1.00   1.00   1.00
knuth-bendix        1.00   0.98   0.97   0.97   0.97
lexgen              1.00   0.98   1.00   1.00   1.00
life                1.01   0.96   1.00   1.00   1.01
logic               1.00   1.00   0.87   0.87   0.87
mandelbrot          1.00   1.00   1.00   1.00   1.00
matrix-multiply     1.00   1.00   0.99   1.01   1.00
md5                 1.00   1.00   1.00   1.00   1.00
merge               1.00   1.00   1.00   1.00   1.00
mlyacc              0.98   1.09   0.98   0.98   0.99
mpuz                1.00   1.00   0.99   0.99   0.99
nucleic             1.00   0.99   1.08   1.07   1.08
peek                1.00   1.00   1.00   1.00   1.00
psdes-random        1.00   1.00   1.00   1.00   1.00
ratio-regions       1.00   1.00   1.00   1.00   1.00
ray                 1.00   1.00   0.97   0.97   0.97
raytrace            1.00   1.01   1.04   1.04   1.04
simple              1.00   1.10   1.01   1.00   1.01
smith-normal-form   1.00   1.00   1.00   1.00   1.00
tailfib             1.00   1.00   1.00   1.00   1.00
tak                 1.00   1.00   1.01   1.01   1.01
tensor              1.00   1.00   1.00   1.00   1.00
tsp                 1.00   1.00   1.00   1.00   1.00
tyan                0.88   0.67   1.00   0.87   0.87
vector-concat       1.02   1.00   0.99   0.98   0.98
vector-rev          1.00   1.01   1.00   1.00   1.00
vliw                1.00   1.01   0.98   0.98   0.98
wc-input1           1.00   1.00   1.00   1.00   1.00
wc-scanStream       1.00   1.00   1.00   1.00   1.00
zebra               1.02   1.01   1.02   1.01   1.01
zern                1.00   1.00   1.01   1.00   1.01

size
benchmark            MLton0    MLton1    MLton2    MLton3    MLton4 MLton5
barnes-hut           69,200    69,200    69,216    69,312    69,312 69,328
checksum             21,000    21,000    21,000    21,000    21,000 21,000
count-graphs         44,256    44,256    44,040    44,064    44,064 44,064
DLXSimulator         99,296    99,296    99,264    97,904    97,904 97,904
fft                  32,484    32,484    32,484    32,484    32,484 32,484
fib                  21,000    21,000    21,000    21,000    21,000 21,000
hamlet            1,499,755 1,466,843 1,485,387 1,384,411 1,384,267 1,383,275
imp-for              20,992    20,992    20,992    20,992    20,992 20,992
knuth-bendix         65,529    65,529    65,505    65,185    65,185 65,185
lexgen              157,032   157,032   157,864   157,336   157,336 157,320
life                 40,976    40,976    40,880    40,344    40,320 40,320
logic                88,824    88,824    88,824    88,104    88,104 88,104
mandelbrot           20,960    20,960    20,960    20,960    20,960 20,960
matrix-multiply      21,552    21,552    21,552    21,552    21,552 21,552
md5                  31,481    31,481    31,481    31,481    31,481 31,481
merge                22,208    22,208    22,208    22,208    22,208 22,208
mlyacc              579,288   574,808   573,976   567,240   567,240 566,072
mpuz                 25,976    25,976    25,976    25,840    25,840 25,840
nucleic              63,168    63,168    62,640    62,200    62,200 61,536
peek                 31,025    31,025    31,025    29,953    29,953 29,953
psdes-random         21,968    21,968    21,968    21,968    21,968 21,968
ratio-regions        44,128    44,128    44,128    44,128    44,128 44,128
ray                  85,259    85,227    85,179    84,459    84,459 84,507
raytrace            204,824   204,712   203,800   298,328   298,216 298,200
simple              194,996   194,244   197,660   198,108   197,996 197,980
smith-normal-form   148,732   148,732   148,764   147,628   147,628 147,660
tailfib              20,672    20,672    20,672    20,672    20,672 20,672
tak                  21,120    21,120    21,120    21,120    21,120 21,120 
tensor               71,523    71,523    71,523    70,147    70,147 70,163
tsp                  37,065    37,065    37,081    37,065    37,065 37,081
tyan                 91,897    91,177    89,353    89,369    88,649 88,649
vector-concat        21,808    21,808    21,808    21,808    21,808 21,808
vector-rev           21,624    21,624    21,624    21,624    21,624 21,624
vliw                340,628   340,532   337,124   336,772   336,644 332,468
wc-input1            45,121    45,121    45,137    44,929    44,929 44,961
wc-scanStream        46,417    46,417    46,417    46,273    46,273 46,257
zebra               127,641   127,641   127,641   127,641   127,641 127,641
zern                 28,131    28,131    28,131    28,131    28,131 28,131
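
The run time ratio table above appears to be each configuration's run
time divided by the MLton0 run time.  A minimal sketch that recomputes
three rows, with the timings copied from the run time table (the names
RUN_TIME and ratios are only illustrative, not part of any benchmark
script):

```python
# Run times copied from the "run time" table above
# (columns: MLton0 .. MLton5).
RUN_TIME = {
    "hamlet": [8.20, 8.16, 8.29, 7.22, 7.22, 7.36],
    "logic":  [20.74, 20.74, 20.78, 18.00, 18.03, 18.00],
    "tyan":   [19.93, 17.45, 13.30, 19.86, 17.32, 17.34],
}

def ratios(times):
    """Return each configuration's time divided by the MLton0 baseline."""
    baseline = times[0]
    return [round(t / baseline, 2) for t in times[1:]]

for name, times in RUN_TIME.items():
    print(f"{name:<18}" + "".join(f"{r:7.2f}" for r in ratios(times)))
```

These three rows reproduce the table entries.  A few other entries can
differ in the last digit when recomputed this way (nucleic under MLton4
comes out as 1.08 rather than 1.07), presumably because the original
ratios were computed from unrounded timings.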

Not too bad.  hamlet and logic (run time ratios of about 0.88 and 0.87)
suggest that the new flattener is somewhat better, independent of the
tuple reconstruction elimination.  On the other hand, raytrace (about
1.04, with a much larger executable) suggests the opposite.  That
probably makes sense, though -- raytrace uses floats everywhere, and
floats are expensive to move in and out of argument positions (even a
memory-memory move is still bounced through the floating point stack);
in those cases it is probably better to pass a single pointer to a tuple
of floats.
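
To make the trade-off concrete, here is a rough source-level picture of
what argument flattening does, written in Python purely for
illustration.  The function names are hypothetical, and Python does not
model MLton's calling convention or the floating point stack; only the
shape of the two signatures matters.

```python
# Unflattened: each vector is passed as one argument (in MLton terms,
# a single pointer to a tuple of floats).
def dot_tupled(p, q):
    px, py, pz = p
    qx, qy, qz = q
    return px * qx + py * qy + pz * qz

# Flattened: the tuple components become separate arguments, so six
# floats have to be moved into argument positions at every call.
def dot_flat(px, py, pz, qx, qy, qz):
    return px * qx + py * qy + pz * qz

p = (1.0, 2.0, 3.0)
q = (4.0, 5.0, 6.0)
assert dot_tupled(p, q) == dot_flat(*p, *q)
```

Both versions compute the same result; the difference is only in how
many values cross the call boundary, which is where the float moves
described above would show up.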

No illumination on what's going on with tyan:  -tuple-recon-elim 2 gives
a ratio of 0.67 with the old flattener but only 0.87 with the new one.

Overall, no major slowdowns and a couple of decent improvements.  I'm
leaning towards cleaning it up (i.e., dropping those options and sticking
with the new flattener and with tuple reconstruction elimination always
on in the shrinker) and checking it in.  The flattener might benefit from
a little more tweaking.