performance page

Tue, 9 Oct 2001 11:42:57 -0700

> > So it looks like the new ML-kit run-time performance is still pretty bad
> > (although their simple non-tail function-call speed is faster than ours
> > (seen from fib and tak)).
> 
> This is something I've been thinking about, since I've got a more detailed
> model of transfers right now.  I was thinking about doing "small" function
> calls and returns with values in registers.
...

Sounds good to me.

(Note: in the following, stacks grow upwards in memory, as they currently do in
MLton.)

I've been thinking about nontail fib and tak as well, and I believe that we
could do better, even with a stack calling convention.  The problem is that with
our current calling convention, return values are placed on the stack just above
the return address.  So, in the callee, where the slot for the return address is
dead, you almost always see some shuffle code to move all the return values down
4 bytes.  Without changing calling conventions, we could change the current
register allocator to leave the four bytes dead and avoid the shuffle.  Even
better, we could change the calling convention so that the caller leaves space
for the return values below the return address and the callee places them
there.  Then, no shuffle and no dead space.  And I don't think it requires any
runtime system changes.
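
To make the three layouts concrete, here is a rough sketch in Python.  It is
only a toy model of the stack, not MLton code: the function names, the dict
representation of the stack, and the assumption that every value fits in one
4-byte slot are all mine, for illustration only.

```python
SLOT = 4  # bytes per stack slot (assumption: every value is one 4-byte word)


def current_convention(frame_size, results):
    """Results land just above the return address, then get shuffled down."""
    stack = {}
    ret = frame_size                    # caller's live data is in [0, frame_size)
    stack[ret] = "RET"
    for i, v in enumerate(results):     # results go above the return address
        stack[ret + SLOT * (i + 1)] = v
    moves = 0
    for i in range(len(results)):       # return-address slot is dead: shuffle
        src = ret + SLOT * (i + 1)
        stack[src - SLOT] = stack.pop(src)
        moves += 1
    return stack, moves, 0              # (layout, shuffle moves, dead bytes)


def leave_slot_dead(frame_size, results):
    """Same convention, but the allocator leaves the 4 dead bytes alone."""
    stack = {}
    ret = frame_size
    for i, v in enumerate(results):
        stack[ret + SLOT * (i + 1)] = v
    return stack, 0, SLOT               # no shuffle, one wasted slot at `ret`


def proposed_convention(frame_size, results):
    """Caller reserves result space below the return address."""
    stack = {}
    base = frame_size                   # reserved result slots start here
    ret = base + SLOT * len(results)    # return address sits above them
    stack[ret] = "RET"
    for i, v in enumerate(results):     # results are written in place
        stack[base + SLOT * i] = v
    del stack[ret]                      # return address is dead once used
    return stack, 0, 0                  # no shuffle, no dead space


if __name__ == "__main__":
    for f in (current_convention, leave_slot_dead, proposed_convention):
        layout, moves, dead = f(8, ["r0", "r1"])
        print(f.__name__, sorted(layout.items()), moves, dead)
```

In this model the current and the proposed conventions end up with the results
in the same slots; the difference is only that the first needs one move per
return value to get there.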

I think that this change should be done independently of the calling convention
for small numbers of ints.