[MLton] bytecode compiler mostly works

Stephen Weeks MLton@mlton.org
Sat, 16 Oct 2004 10:16:49 -0700


The bytecode compiler now passes all regressions and benchmarks.  It
does not yet self-compile.  The benchmark numbers are below.  As you
can see, it is a lot slower than the native codegen: tens of times
slower on most benchmarks, and over a hundred times slower on several.
I don't have a good feel for how much of that is due to the design of
the bytecode, how much is due to the implementation of the
interpreter, and how much is due to the naive translation of the
Machine IL into bytecode.  I'm also not sure how much can be recovered
by improving all of those, but I would hope we can get a factor of at
least two on many benchmarks, and up to ten on some.

MLton0 -- mlton -codegen native
MLton1 -- mlton -codegen bytecode

run time ratio (MLton1 / MLton0)
benchmark         MLton1
barnes-hut         19.33
boyer              31.63
checksum           69.53
count-graphs       71.45
DLXSimulator       21.11
fft                 2.82
fib                48.85
flat-array         89.79
hamlet             47.56
imp-for           109.42
knuth-bendix       58.72
lexgen             41.39
life               83.11
logic              36.80
mandelbrot         43.20
matrix-multiply    24.10
md5               147.82
merge              10.30
mlyacc             20.87
model-elimination  18.06
mpuz               75.61
nucleic            24.17
output1           157.29
peek              132.09
psdes-random      107.81
ratio-regions      44.16
ray                38.23
raytrace           40.63
simple             44.05
smith-normal-form   1.23
tailfib            89.10
tak                48.58
tensor            120.63
tsp                28.77
tyan               27.20
vector-concat      79.08
vector-rev         54.33
vliw               27.73
wc-input1          90.21
wc-scanStream      58.90
zebra              51.77
zern               26.64

size
benchmark            MLton0    MLton1
barnes-hut          142,013   195,492
boyer               164,084   251,203
checksum             77,424   111,363
count-graphs         91,104   142,083
DLXSimulator        170,781   257,856
fft                  89,448   133,843
fib                  73,784   111,587
flat-array           73,840   111,683
hamlet            1,288,717 2,256,100
imp-for              73,656   111,843
knuth-bendix        145,753   212,608
lexgen              244,442   355,381
life                 91,184   139,843
logic               135,248   221,091
mandelbrot           73,752   111,651
matrix-multiply      74,916   113,283
md5                 114,069   156,640
merge                75,396   114,147
mlyacc              548,422   837,333
model-elimination   674,872 1,091,699
mpuz                 76,416   116,611
nucleic             227,748   247,591
output1             123,995   169,194
peek                118,697   161,472
psdes-random         74,416   112,867
ratio-regions        99,512   152,451
ray                 224,117   323,696
raytrace            306,494   471,005
simple              263,188   366,595
smith-normal-form   224,133   295,488
tailfib              73,464   111,203
tak                  73,880   111,619
tensor              139,812   206,071
tsp                 118,405   166,820
tyan                177,245   268,000
vector-concat        75,080   113,891
vector-rev           74,308   112,515
vliw                431,902   707,165
wc-input1           144,921   203,900
wc-scanStream       148,525   210,172
zebra               161,113   260,896
zern                134,279   183,074

compile time
benchmark         MLton0 MLton1
barnes-hut          7.63   7.17
boyer               7.80   6.93
checksum            5.25   5.27
count-graphs        6.03   5.91
DLXSimulator        8.47   7.69
fft                 5.61   5.60
fib                 5.15   5.18
flat-array          5.19   5.17
hamlet             52.25  37.13
imp-for             5.26   5.27
knuth-bendix        6.79   6.43
lexgen              9.91   8.47
life                5.75   5.61
logic               7.24   6.40
mandelbrot          5.27   5.31
matrix-multiply     5.33   5.33
md5                 5.94   5.85
merge               5.23   5.26
mlyacc             24.32  17.39
model-elimination  24.72  18.51
mpuz                5.34   5.36
nucleic            14.76  13.87
output1             5.94   5.74
peek                5.83   5.73
psdes-random        5.29   5.31
ratio-regions       6.47   6.17
ray                 8.66   7.78
raytrace           13.28  11.13
simple             11.24   9.53
smith-normal-form   8.63   7.45
tailfib             5.18   5.20
tak                 5.21   5.20
tensor              8.01   7.64
tsp                 6.26   6.04
tyan                8.39   7.70
vector-concat       5.33   5.31
vector-rev          5.27   5.25
vliw               18.09  14.32
wc-input1           6.71   6.39
wc-scanStream       6.79   6.47
zebra               7.85   7.07
zern                5.92   5.87

run time
benchmark         MLton0  MLton1
barnes-hut         28.71  555.14
boyer              28.56  903.23
checksum           49.94 3472.11
count-graphs       23.21 1658.38
DLXSimulator       36.92  779.37
fft                73.48  207.03
fib                33.60 1641.35
flat-array         10.16  912.30
hamlet             19.40  922.71
imp-for            28.02 3065.55
knuth-bendix       24.70 1450.41
lexgen             39.45 1633.04
life                8.42  700.11
logic              23.99  882.60
mandelbrot         47.58 2055.22
matrix-multiply    12.04  290.12
md5                 8.06 1191.63
merge              43.38  446.68
mlyacc             31.48  656.98
model-elimination  69.70 1258.46
mpuz               22.41 1694.64
nucleic            22.43  542.24
output1             5.59  878.81
peek               20.45 2701.34
psdes-random       21.75 2345.20
ratio-regions      34.01 1501.81
ray                15.12  578.09
raytrace           25.08 1018.99
simple             24.38 1073.76
smith-normal-form  26.78   32.98
tailfib            23.48 2091.72
tak                12.45  604.60
tensor             27.14 3274.29
tsp                24.19  696.16
tyan               35.78  973.47
vector-concat      52.58 4158.14
vector-rev         43.96 2388.21
vliw               28.66  794.77
wc-input1          19.98 1802.17
wc-scanStream      21.23 1250.41
zebra              30.15 1560.94
zern               36.98  985.26
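
For anyone who wants to check the numbers, the run time ratio table
appears to be the MLton1 run time divided by the MLton0 run time.
Here is a minimal Python sketch of that arithmetic for a few rows
copied from the run time table above.  The names run_times and
run_time_ratio are mine, not part of the benchmark script, and the
results can differ from the posted ratios in the second decimal place,
presumably because the posted ratios were computed from unrounded
times.

```python
# Run times copied from the "run time" table: (MLton0 native, MLton1 bytecode).
run_times = {
    "barnes-hut": (28.71, 555.14),
    "fft": (73.48, 207.03),
    "md5": (8.06, 1191.63),
    "smith-normal-form": (26.78, 32.98),
}


def run_time_ratio(native, bytecode):
    """Slowdown of the bytecode codegen relative to the native codegen."""
    return bytecode / native


# Hypothetical helper table: benchmark name -> slowdown factor.
ratios = {name: run_time_ratio(n, b) for name, (n, b) in run_times.items()}

for name, ratio in ratios.items():
    print(f"{name:<18}{ratio:7.2f}")
```

A ratio near 1 (smith-normal-form) means the two codegens take about
the same time on that benchmark; a ratio over 100 (md5) means the
bytecode codegen is more than a hundred times slower there.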