x86 performance

Matthew Fluet fluet@research.nj.nec.com
Wed, 9 Aug 2000 10:13:08 -0400 (EDT)



> Note, converting to the C-style code (at least in this case) is trivial for a
> peep-hole optimizer.

I'm not sure that I entirely agree.  It's certainly true for the decl
statement, but it's not clear to me that the other two could be handled by
peep-hole optimizations (at least in the framework where I'm working.)

>     The  C  compiler  uses  leal  as cheap 3-address arithmetic while the x86
>         version uses a move  followed  by  an  add  constant.   Note,  the  C

It's certainly easy to find moves followed by an add constant, but knowing
that they correspond to pointers isn't information that's available.
(Another argument for RCPS and TAL?)  In this particular case, I can
actually get by with modifying the translation of the limitCheck to use a
leal instead of the movl/addl combo.  But, are there other cases where I
want a leal?  (I'm thinking in terms of the statements in the machine IL
that don't have a lot of supporting code.  Since there aren't really any
pointer comparisons, maybe there aren't.)
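
To make the "easy to find" part concrete, here is a minimal sketch of just the
pattern match, not the code generator's actual peep-hole pass.  It assumes a
hypothetical representation where instructions are AT&T-syntax strings, and it
assumes the condition flags set by the addl are dead afterwards (leal does not
set flags).  It says nothing about whether the operands are pointers.

```python
import re

# Hypothetical textual instruction format (AT&T syntax); a real code
# generator would match on its own instruction datatype instead.
MOVL = re.compile(r"movl\s+(%\w+),\s*(%\w+)$")      # register-to-register move
ADDL = re.compile(r"addl\s+\$(-?\d+),\s*(%\w+)$")   # add immediate to register

def fuse_mov_add(insns):
    """Rewrite 'movl %src, %dst; addl $c, %dst' as 'leal c(%src), %dst'.

    Assumes the flags produced by the addl are never read afterwards.
    """
    out = []
    i = 0
    while i < len(insns):
        m = MOVL.match(insns[i].strip())
        a = ADDL.match(insns[i + 1].strip()) if i + 1 < len(insns) else None
        if m and a and m.group(2) == a.group(2):
            src, dst = m.group(1), m.group(2)
            out.append(f"leal {a.group(1)}({src}), {dst}")
            i += 2  # both instructions consumed
        else:
            out.append(insns[i])
            i += 1
    return out
```

A move from memory (e.g. `movl 4(%esp), %eax`) is deliberately left alone,
since leal needs a register base.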

>     The  C  compiler kept a value in a register while the x86 had to store it
>         into memory and then reload it.  Also the x86 code used  an  absolute

I wouldn't call this a peep-hole optimization, but it's certainly
something I hope to be able to support.  Part of this should fall out of
eliminating redundant jumps.  Also, since we have the pseudo-regs live at
entry for each block, it's possible to pass those values between blocks in
real registers, rather than saving and restoring them.  Unfortunately,
that's going to hurt in terms of register-register moves given the way the
register allocator is written, since I'll need to shuffle the pseudo-regs
from wherever they end up living to where I want them for the next block.
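
The shuffle itself amounts to sequencing a set of simultaneous register moves.
A rough sketch of one generic way to do that follows; it is not how the
current register allocator works.  The function name, the `dst -> src` map and
the `%tmp` scratch register (assumed to be free at the block boundary) are all
made up for illustration.

```python
def sequence_moves(moves, scratch="%tmp"):
    """Order simultaneous moves {dst: src} into a list of (src, dst) steps.

    All sources are meant to be read before any destination is written;
    cycles (e.g. a swap) are broken through the scratch register.
    """
    # Moves from a register to itself need no instruction.
    pending = {d: s for d, s in moves.items() if d != s}
    out = []
    while pending:
        # A destination may be overwritten once no pending move reads it.
        ready = [d for d in pending if d not in pending.values()]
        if ready:
            d = ready[0]
            out.append((pending.pop(d), d))
        else:
            # Only cycles remain: save one destination's old value in the
            # scratch register and redirect its readers there.
            d = next(iter(pending))
            out.append((d, scratch))
            for k in pending:
                if pending[k] == d:
                    pending[k] = scratch
    return out
```

For a plain swap of %eax and %ebx this emits three moves rather than two,
which is the kind of extra register-register traffic I'm worried about.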


> You didn't say what kind of machine you were running on, but lets suppose  it

It's a 200 MHz PPro.