x86 performance

Stephen Weeks MLton@sourcelight.com
Wed, 9 Aug 2000 10:02:42 -0700 (PDT)


> >     The C compiler uses leal as cheap 3-address arithmetic while the x86
> >         version uses a move followed by an add constant.  Note, the C
> 
> It's certainly easy to find moves followed by an add constant, but knowing
> that they correspond to pointers isn't information that's available.

Why does it matter whether or not it's a pointer?  Can't you just always use
leal for adds of small constants?  Maybe you could even do this directly in
translate, without needing a peephole pass.
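
To make the leal idea concrete, here is a rough sketch in Python rather than
anything from the actual code generator.  The name fuse_mov_add and the
textual AT&T-syntax instruction format are assumptions made up for
illustration.  It only handles register-to-register moves, and it assumes the
condition flags are dead after the add, since leal does not set flags the way
addl does.

```python
import re

# Hypothetical textual instruction format (AT&T syntax), for illustration only.
MOV = re.compile(r"movl\s+(%\w+),\s*(%\w+)$")      # movl %src, %dst
ADD = re.compile(r"addl\s+\$(-?\d+),\s*(%\w+)$")   # addl $imm, %dst


def fuse_mov_add(instrs):
    """Rewrite 'movl %src, %dst; addl $imm, %dst' into 'leal imm(%src), %dst'.

    Assumption: the flags written by addl are not read afterwards,
    because leal leaves the flags untouched.
    """
    out = []
    i = 0
    while i < len(instrs):
        if i + 1 < len(instrs):
            m = MOV.match(instrs[i].strip())
            a = ADD.match(instrs[i + 1].strip())
            # Both instructions must target the same destination register.
            if m and a and m.group(2) == a.group(2):
                src, dst, imm = m.group(1), m.group(2), a.group(1)
                out.append(f"leal {imm}({src}), {dst}")
                i += 2
                continue
        out.append(instrs[i])
        i += 1
    return out
```

Nothing in the rewrite depends on whether the value is a pointer; it only
looks at the shape of the two instructions.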

> I wouldn't call this a peep-hole optimization, but it's certainly
> something I hope to be able to support.  Part of this should fall out of
> eliminating redundant jumps.  Also, since we have the pseudo-regs live at
> entry for each block, it's possible to pass those values between blocks in
> real registers, rather than saving and restoring them.  Unfortunately,
> that's going to hurt in terms of register-register moves given the way the
> register allocator is written; since I'll need to shuffle the pseudo-regs
> from wherever they end up living to where I want them for the next block.

Yeah.  The right thing to do is to process the basic blocks in some kind of
DFS postorder so that when you process a block you have already processed its
targets (most of the time).  Then you can know where you want stuff and try to
put it there.  Appel talks about this a little in Section 13.7 of Compiling with
Continuations.  Section 9.7 of the Dragon Book talks about this as well.
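
As a minimal sketch of the ordering (in Python, not the compiler's own code),
assume the control-flow graph is a dict mapping each block label to a list of
its successor labels.  The function name postorder and the block labels below
are hypothetical.

```python
def postorder(succs, entry):
    """Return block labels in DFS postorder, starting from entry.

    A block is emitted only after every successor reachable through a
    not-yet-visited edge has been emitted.  Back edges (loops) are the
    exception, so a loop header is emitted after the loop body.
    """
    seen = set()
    order = []

    def visit(block):
        seen.add(block)
        for target in succs.get(block, []):
            if target not in seen:
                visit(target)
        order.append(block)

    visit(entry)
    return order


# Hypothetical CFG: entry -> loop; loop -> body | exit; body -> loop.
cfg = {
    "entry": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}
blocks = postorder(cfg, "entry")
```

Here body comes out before loop even though it jumps back to loop, which is
the "most of the time" case: for a back edge, the target's register
assignment is not yet known when the block is processed.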