-native-optimize 2 bug

Matthew Fluet fluet@cs.cornell.edu
Thu, 7 Feb 2002 22:16:07 -0500 (EST)


Just to check my previous claim, I ran the benchmarks with
-native-optimize {0,1,2}.  As I thought, there is virtually no difference
between 1 and 2.  Disturbingly, there are lots of problems with
-native-optimize 0.  I haven't looked into them, but I will in the not too
distant future.  While it is clear that doing some x86 optimizations is a
win, I don't like being dependent on optimizations for correctness.

MLton0 -- mlton -native-optimize 0
MLton1 -- mlton -native-optimize 1
MLton2 -- mlton -native-optimize 2

compile time
benchmark         MLton0 MLton1 MLton2
barnes-hut          2.22   2.15   2.23
checksum            0.61   0.63   0.62
count-graphs        1.63   1.58   1.64
DLXSimulator        4.20   3.93   4.15
fft                 1.22   1.19   1.24
fib                 0.59   0.57   0.59
hamlet             45.56  40.75  42.82
imp-for             0.62   0.64   0.62
knuth-bendix        2.05   1.89   1.97
lexgen              5.19   4.81   5.05
life                1.26   1.18   1.21
logic               2.58   2.49   2.62
mandelbrot          0.62   0.63   0.67
matrix-multiply     0.71   0.71   0.72
md5                 1.21   1.13   1.17
merge               0.66   0.62   0.64
mlyacc             19.34  17.56  18.51
mpuz                0.83   0.83   0.84
nucleic             2.36   2.33   2.33
peek                0.96   0.97   0.97
psdes-random        0.67   0.65   0.66
ratio-regions       2.34   2.16   2.21
ray                 3.18   2.91   3.16
raytrace            9.38   8.89   9.85
simple              6.48   6.05   6.53
smith-normal-form   7.20   7.16   7.28
tailfib             0.60   0.60   0.59
tak                 0.60   0.58   0.60
tensor              2.77   2.66   3.12
tsp                 1.40   1.37   1.41
tyan                3.58   3.32   3.49
vector-concat       0.67   0.66   0.66
vector-rev          0.64   0.63   0.63
vliw               11.10  10.16  10.85
wc-input1           1.61   1.49   1.57
wc-scanStream       1.66   1.54   1.61
zebra               5.59   5.07   5.46
zern                1.10   1.01   1.06

run time
benchmark         MLton0 MLton1 MLton2
barnes-hut             *   3.77   3.78
checksum            4.04   3.18   3.18
count-graphs        4.98   3.76   3.82
DLXSimulator           *  14.71  14.76
fft                10.03   8.08   8.13
fib                 3.91   3.37   3.37
hamlet                 *   7.04   7.05
imp-for            12.78   7.16   7.16
knuth-bendix           *   5.67   5.67
lexgen                 *   9.19   9.21
life                8.51   6.32   6.30
logic               0.01  17.75  17.75
mandelbrot          9.31   6.06   6.06
matrix-multiply     4.48   2.77   2.77
md5                 2.46   1.76   1.77
merge              49.48  48.05  48.13
mlyacc                 *   8.79   8.82
mpuz                6.67   4.25   4.25
nucleic             8.39   8.02   8.03
peek                   *   0.82   0.82
psdes-random        4.34   3.20   3.29
ratio-regions      10.93   8.39   8.41
ray                    *   3.56   3.52
raytrace            0.01   4.88   4.89
simple                 *   5.85   5.85
smith-normal-form      *   0.66   0.66
tailfib            20.37  10.95  10.95
tak                 9.88   7.74   7.74
tensor                 *   3.68   3.68
tsp                 9.33   7.51   7.51
tyan                   *  16.07  16.05
vector-concat       6.17   2.24   2.87
vector-rev          4.14   4.20   4.18
vliw                   *   5.65   5.65
wc-input1              *   1.93   1.92
wc-scanStream          *   2.12   2.12
zebra               2.86   1.80   1.90
zern               39.38  33.42  33.29

run time ratio (MLtonN run time / MLton0 run time)
benchmark          MLton1  MLton2
barnes-hut          ~1.00   ~1.00
checksum             0.79    0.79
count-graphs         0.76    0.77
DLXSimulator        ~1.00   ~1.00
fft                  0.81    0.81
fib                  0.86    0.86
hamlet              ~1.00   ~1.00
imp-for              0.56    0.56
knuth-bendix        ~1.00   ~1.00
lexgen              ~1.00   ~1.00
life                 0.74    0.74
logic             1604.26 1603.81
mandelbrot           0.65    0.65
matrix-multiply      0.62    0.62
md5                  0.72    0.72
merge                0.97    0.97
mlyacc              ~1.00   ~1.00
mpuz                 0.64    0.64
nucleic              0.96    0.96
peek                ~1.00   ~1.00
psdes-random         0.74    0.76
ratio-regions        0.77    0.77
ray                 ~1.00   ~1.00
raytrace           432.61  433.63
simple              ~1.00   ~1.00
smith-normal-form   ~1.00   ~1.00
tailfib              0.54    0.54
tak                  0.78    0.78
tensor              ~1.00   ~1.00
tsp                  0.81    0.81
tyan                ~1.00   ~1.00
vector-concat        0.36    0.47
vector-rev           1.01    1.01
vliw                ~1.00   ~1.00
wc-input1           ~1.00   ~1.00
wc-scanStream       ~1.00   ~1.00
zebra                0.63    0.66
zern                 0.85    0.85
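
For reference, the ratio columns can be reproduced from the run time table by
dividing each MLton1/MLton2 time by the MLton0 time.  The following is only a
minimal sketch: the names run_times and ratio are made up for illustration,
only a few rows are copied in, and a "*" entry (presumably a run that did not
complete) is represented as None.  Rows whose MLton0 time is printed as 0.01
(logic, raytrace) will not reproduce exactly from the rounded values, since
17.75 / 0.01 is 1775 rather than the 1604.26 shown.

```python
# Run times copied from a few rows of the "run time" table above,
# in the order (MLton0, MLton1, MLton2).  None stands in for a "*" entry.
run_times = {
    "checksum": (4.04, 3.18, 3.18),
    "imp-for": (12.78, 7.16, 7.16),
    "tailfib": (20.37, 10.95, 10.95),
    "vector-concat": (6.17, 2.24, 2.87),
    "barnes-hut": (None, 3.77, 3.78),
}


def ratio(baseline, t):
    """Return t / baseline rounded to two places, or None without a usable baseline."""
    if baseline is None or baseline == 0:
        return None
    return round(t / baseline, 2)


# Map each benchmark to its (MLton1, MLton2) ratios relative to MLton0.
ratios = {
    name: (ratio(t0, t1), ratio(t0, t2))
    for name, (t0, t1, t2) in run_times.items()
}

for name, (r1, r2) in ratios.items():
    print(name, r1, r2)
```

A ratio below 1.00 means the optimized build ran faster than the
-native-optimize 0 build; where no MLton0 time is available the sketch
yields None instead of a placeholder.
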

size
benchmark            MLton0    MLton1    MLton2
barnes-hut           64,996    57,604    57,604
checksum             24,225    23,809    23,809
count-graphs         51,937    44,385    44,353
DLXSimulator        106,665    88,137    88,105
fft                  37,393    33,777    33,745
fib                  24,161    23,873    23,873
hamlet            1,399,976 1,102,568 1,102,472
imp-for              24,193    23,841    23,841
knuth-bendix         74,898    64,306    64,306
lexgen              174,513   146,769   146,737
life                 43,969    40,641    40,609
logic                87,073    80,897    80,897
mandelbrot           24,385    23,937    23,937
matrix-multiply      25,217    24,513    24,513
md5                  38,226    33,458    33,458
merge                25,537    25,057    25,057
mlyacc              553,329   468,401   468,401
mpuz                 30,305    28,161    28,161
nucleic              65,537    62,913    62,913
peek                 35,250    31,922    31,922
psdes-random         25,313    24,609    24,609
ratio-regions        59,137    45,025    45,025
ray                 100,360    83,176    83,176
raytrace            273,557   233,973   233,941
simple              206,505   180,425   180,457
smith-normal-form   148,564   138,676   138,644
tailfib              23,841    23,553    23,553
tak                  24,257    23,969    23,969
tensor               67,123    56,755    56,435
tsp                  45,746    38,738    38,738
tyan                103,570    84,786    84,786
vector-concat        25,217    24,513    24,481
vector-rev           24,993    24,481    24,481
vliw                349,633   290,977   290,881
wc-input1            55,754    47,306    47,242
wc-scanStream        56,842    48,266    48,202
zebra               137,778   113,842   113,842
zern                 32,720    30,000    29,968