[MLton-devel] cvs commit: types for Rssa

Stephen Weeks MLton@mlton.org
Sun, 8 Dec 2002 17:18:37 -0800


> I was originally going to ask what was wrong with doing the cast as part
> of the case statement, in that the argument of the destination label would
> rebind the test variable at the new type.  But, I see now that the
> straightforward translation into Machine would require a move from the
> test variable to the argument, which is really redundant.

Right.  So what actually happens is there is a move (with a cast) in
the nullary label that is the destination of the case branch.  I had
initially tried leaving that move as an implicit part of the switch
(something like arith) even in MACHINE but that complicated the
codegens.  So I figured it was easier to make the move explicit and to
someday fix the type checker to notice (via dominators) that the cast
is ok.
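
To make the dominator idea concrete, here is a minimal sketch of
such a check.  It is not MLton code: the CFG encoding, the `refines`
map from case-branch edges to (variable, variant) facts, and the names
`dominators` and `cast_ok` are all hypothetical.  It accepts a cast
when some case-branch target that can only be entered through the
matching branch dominates the block holding the cast.

```python
def dominators(succs, entry):
    """Return (dom, preds), where dom[b] is the set of blocks dominating b."""
    nodes = set(succs)
    preds = {n: set() for n in nodes}
    for n, targets in succs.items():
        for t in targets:
            preds[t].add(n)
    # Standard iterative data-flow formulation: start from "everything
    # dominates everything" and shrink until a fixed point is reached.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds


def cast_ok(succs, entry, refines, cast_block, var, ty):
    """True if a case branch proving `var : ty` dominates `cast_block`.

    Assumes `var` is not reassigned between the branch and the cast.
    """
    dom, preds = dominators(succs, entry)
    for (switch, target), fact in refines.items():
        only_via_branch = preds[target] == {switch}
        if fact == (var, ty) and only_via_branch and target in dom[cast_block]:
            return True
    return False


# Hypothetical CFG: a switch on x with one nullary label per variant.
cfg = {
    "entry": ["sw"],
    "sw": ["L_cons", "L_nil"],
    "L_cons": ["join"],
    "L_nil": ["join"],
    "join": [],
}
refines = {("sw", "L_cons"): ("x", "Cons"), ("sw", "L_nil"): ("x", "Nil")}
```

With this encoding, a cast of x to Cons is accepted in L_cons but
rejected in join, since join is reachable through the Nil branch too.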

> >   Combined all the switch statements used by Rssa and Machine into a
> >   single datatype -- see backend/switch.sig.  With that and the changes
> >   to operands, Rssa and Machine are starting to look suspiciously
> >   similar.  Hopefully one day we will be able to unify them.
> 
> What are the major differences between RSSA and Machine right now?
> Mostly the distinction between stack and backend registers?

RSSA uses variables while MACHINE uses registers and stack offsets.
Variables and registers are essentially the same thing, with the
difference being that variables can be live across nontail calls,
while registers can't.  But I think we can push that difference into
the type checker, unify variables and registers and view register
allocation as a pass that enforces the invariant that variables are
not live across nontail calls.
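
The invariant itself is easy to state as a check.  Below is a minimal
sketch over a single straight-line block, assuming a made-up
instruction record (`Instr` with `defs`, `uses`, and a `nontail_call`
flag); none of these names come from MLton.  It does a backward
liveness scan and reports, for each nontail call, the variables that
are live across it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    defs: frozenset = frozenset()   # variables written by the instruction
    uses: frozenset = frozenset()   # variables read by the instruction
    nontail_call: bool = False


def live_across_calls(block, live_out=frozenset()):
    """Map the index of each nontail call to the variables live across it."""
    live = set(live_out)
    violations = {}
    for i in range(len(block) - 1, -1, -1):
        ins = block[i]
        if ins.nontail_call:
            # Live after the call but not produced by it: the value must
            # survive the call, so it cannot sit in a register.
            across = live - ins.defs
            if across:
                violations[i] = sorted(across)
        live = (live - ins.defs) | ins.uses
    return violations


# a = ...; b = ...; r = f(a); c = b + r; return c
example = [
    Instr(defs=frozenset({"a"})),
    Instr(defs=frozenset({"b"})),
    Instr(defs=frozenset({"r"}), uses=frozenset({"a"}), nontail_call=True),
    Instr(defs=frozenset({"c"}), uses=frozenset({"b", "r"})),
    Instr(uses=frozenset({"c"})),
]
```

Here `live_across_calls(example)` reports that b is live across the
call at index 2, so under the unified view b would have to be moved to
a stack offset before the program passes the check.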

Another difference is that RSSA groups blocks by function while
MACHINE groups blocks by chunk.  The main difference is the RSSA
grouping makes the information about the returns and raises of a block
implicit in the function the block is in.  That should be easy to fix
by attaching the raises and returns info to every block.  Then we can
unify the notions of function and chunk.  
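
As a rough illustration of what attaching the info to every block
could look like, here is a small sketch.  The `Function` and `Block`
records and their fields are hypothetical stand-ins, not the actual
RSSA or MACHINE datatypes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    label: str
    # None means "does not return" / "does not raise" (an assumption
    # made for this sketch).
    returns: Optional[List[str]] = None
    raises: Optional[List[str]] = None


@dataclass
class Function:
    name: str
    returns: Optional[List[str]]
    raises: Optional[List[str]]
    blocks: List[Block] = field(default_factory=list)


def attach_signature(functions):
    """Flatten functions into one group of blocks, each carrying the
    returns/raises info that used to be implicit in its function."""
    chunk = []
    for f in functions:
        for b in f.blocks:
            chunk.append(Block(b.label, f.returns, f.raises))
    return chunk


funcs = [
    Function("f", ["int"], None, [Block("L_1"), Block("L_2")]),
    Function("g", None, ["exn"], [Block("L_3")]),
]
chunk = attach_signature(funcs)
```

After the flattening, nothing about a block depends on which function
it came from, so blocks could be regrouped freely.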

There are a few other minor differences, like some operands that are in
one but not the other, but mostly I think what's left is pushing
through all the details.

> >   The backend register allocation no longer attempts to share a
> >   register for multiple variables.  This may cause performance problems
> >   since the local{char,int,...}  arrays used by the native codegen to
> >   cache real registers will no longer be as small or as densely used.
> 
> Have you run any benchmarks against something from before the merge?

The benchmarks were fine (see below), but oddly enough, despite that,
the self-compile performance was horrible.  For example, here's what
I saw on my usual test machine.

   Compile SML starting
      pre codegen starting
      pre codegen finished in 118.24 + 31.22 (21% GC)
      x86 code gen starting
      x86 code gen finished in 338.65 + 68.80 (17% GC)
   Compile SML finished in 456.89 + 100.02 (18% GC)

I just put in some simple register sharing code, not as good as what
was there before, and was able to recover some of the performance.
It's amazing to me that the register sharing makes this much
difference.

   Compile SML starting
      pre codegen starting
      pre codegen finished in 107.97 + 38.83 (26% GC)
      x86 code gen starting
      x86 code gen finished in 138.15 + 55.34 (29% GC)
   Compile SML finished in 246.12 + 94.17 (28% GC)
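
To put numbers on the difference between the two runs, here is a
small sketch that parses the "finished in" lines and compares totals.
It assumes each line reads as non-GC time + GC time; the reported
percentages are consistent with that reading.  The helper names are
made up for this example.

```python
import re

LINE = re.compile(r"finished in ([\d.]+) \+ ([\d.]+) \((\d+)% GC\)")


def parse(line):
    """Return (total seconds, computed GC percent, reported GC percent)."""
    m = LINE.search(line)
    non_gc, gc, reported = float(m[1]), float(m[2]), int(m[3])
    total = non_gc + gc
    return total, round(100 * gc / total), reported


no_sharing = {
    "codegen": parse("x86 code gen finished in 338.65 + 68.80 (17% GC)"),
    "compile": parse("Compile SML finished in 456.89 + 100.02 (18% GC)"),
}
simple_sharing = {
    "codegen": parse("x86 code gen finished in 138.15 + 55.34 (29% GC)"),
    "compile": parse("Compile SML finished in 246.12 + 94.17 (28% GC)"),
}

for phase in ("codegen", "compile"):
    before, after = no_sharing[phase][0], simple_sharing[phase][0]
    print(f"{phase}: {before:.2f}s -> {after:.2f}s, {before / after:.2f}x")
```

Under that reading, the x86 codegen goes from about 407s to about
193s, and the whole self-compile from about 557s to about 340s.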

I think that's worse than what was there before, so I'm going to
retrofit the old register sharing to the new RSSA/MACHINE.  It
shouldn't be too bad.
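
For readers unfamiliar with the idea, register sharing can be
sketched as follows.  This is a generic greedy scheme over live
intervals, not the algorithm MLton used before or after the merge;
the interval encoding and the name `share_registers` are assumptions
for the example.

```python
def share_registers(intervals):
    """Assign register numbers so that variables with disjoint live
    intervals may reuse the same register.

    intervals maps a variable to (start, end), with end exclusive.
    """
    assignment = {}
    free_at = []  # free_at[r] is the point at which register r is free again
    for var, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1]):
        for r, free in enumerate(free_at):
            if free <= start:
                # Register r is dead by the time var becomes live: reuse it.
                free_at[r] = end
                assignment[var] = r
                break
        else:
            # No register is free at this point: allocate a fresh one.
            free_at.append(end)
            assignment[var] = len(free_at) - 1
    return assignment


live = {"a": (0, 3), "b": (1, 2), "c": (3, 5), "d": (2, 4)}
regs = share_registers(live)
```

In this example four variables fit in two registers (a and c share
one, b and d the other), where allocation without sharing would use
four.  Fewer, more densely used registers are what keep the
local{char,int,...} arrays small.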

Anyway, here are the benchmarks (with no register sharing at all).  The
one problem with tensor was in MLton0, not MLton1.

MLton0 -- /usr/bin/mlton
MLton1 -- mlton

run time ratio
benchmark         MLton1
barnes-hut          1.00
boyer               0.92
checksum            1.00
count-graphs        1.03
DLXSimulator        1.00
fft                 1.00
fib                 1.01
hamlet              0.93
imp-for             0.95
knuth-bendix        0.88
lexgen              1.05
life                1.05
logic               0.99
mandelbrot          1.00
matrix-multiply     1.00
md5                 1.00
merge               0.98
mlyacc              1.00
model-elimination   1.01
mpuz                1.07
nucleic             0.93
peek                1.00
psdes-random        0.99
ratio-regions       0.99
ray                 0.99
raytrace            1.03
simple              1.00
smith-normal-form   1.00
tailfib             1.00
tak                 1.00
tensor             ~1.00
tsp                 1.03
tyan                1.00
vector-concat       1.05
vector-rev          1.00
vliw                0.96
wc-input1           1.00
wc-scanStream       0.99
zebra               1.10
zern                1.00

size
benchmark            MLton0    MLton1
barnes-hut          104,080   113,328
boyer               141,303   135,991
checksum             44,551    46,927
count-graphs         66,399    64,815
DLXSimulator        102,208   103,248
fft                  53,595    55,563
fib                  44,591    46,991
hamlet            1,227,840 1,240,128
imp-for              44,511    46,927
knuth-bendix         87,136    87,728
lexgen              172,653   166,205
life                 64,815    66,999
logic               104,631   106,919
mandelbrot           44,623    47,079
matrix-multiply      45,127    47,503
md5                  53,720    55,816
merge                45,879    48,311
mlyacc              535,501   506,829
model-elimination   634,288   622,416
mpuz                 50,519    51,943
nucleic             191,999   194,407
peek                 52,760    53,776
psdes-random         45,727    47,671
ratio-regions        62,999    65,287
ray                 104,224   109,232
raytrace            278,701   275,869
simple              200,587   201,691
smith-normal-form   181,924   187,748
tailfib              44,319    46,767
tak                  44,751    47,135
tensor                    *   111,163
tsp                  59,728    60,944
tyan                107,648   107,424
vector-concat        45,087    48,103
vector-rev           44,911    47,311
vliw                323,353   320,953
wc-input1            66,733    68,613
wc-scanStream        67,213    69,341
zebra               143,272   155,112
zern                 51,330    52,866

