x86 update

Matthew Fluet fluet@CS.Cornell.EDU
Tue, 12 Dec 2000 21:25:34 -0500 (EST)


> > - improved verifyLiveInfo pass;
> >   it's about 5X - 8X faster than before;
> 
> On the self compile, this sped up from about 36s (on 12/7) to 23s.

Hmm.  I suspect that the speed improvement I observed partially arose from
comparing G0 versions of the compiler (i.e., really comparing SML/NJ
running times).  Also, a self-compile will likely have a longer
verifyLiveInfo pass, because there are cases where the liveness information
propagated from backend.fun needs to be revised, so some blocks might need
to be processed multiple times.  But I doubt that even perfect liveness
information would reduce the time significantly.

> Matthew, I noticed a lot of uses of List.concat and List.map in your code, and a
> lot of other opportunities for "deforestation" or more efficient uses of list
> operations.  In general, List.fold should be used if possible since it does a
> single loop over the list.  For example, in x86-peephole, there is a call
> 	(List.concat l) @ l'
> which could be more efficiently implemented as
> 	List.fold (rev l, l', op @)
> There are lots of other examples.  It should be possible to cut down allocation
> quite a bit in the backend by going through the code with an eye for such
> things.

O.k.  I'll definitely look into that.
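
For reference, here is a minimal sketch of the kind of rewrite suggested
above.  It uses the Standard ML Basis Library's List.foldr instead of the
List.fold from our own library, so no rev is needed.  The function names
concatThenAppend and foldAppend are made up for illustration and are not
taken from x86-peephole.

```sml
(* Original shape: List.concat builds an intermediate list,
   which @ then copies a second time. *)
fun concatThenAppend (l : 'a list list, l' : 'a list) : 'a list =
  (List.concat l) @ l'

(* Fold shape: each inner list is copied once, directly onto the
   accumulated result, with no intermediate concatenated list.
   foldr visits the inner lists right to left, so order is preserved. *)
fun foldAppend (l : 'a list list, l' : 'a list) : 'a list =
  List.foldr (op @) l' l

(* Small check that the two versions agree. *)
val xs = [[1, 2], [3], [], [4, 5]]
val same = concatThenAppend (xs, [6, 7]) = foldAppend (xs, [6, 7])
```

Both versions return the same list; the second just avoids allocating the
result of List.concat before appending l'.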

> 	    peepholeLivenessBlock_minor totals 2.130
> 	       elimDeadDsts_minor: 0
> 	       elimSelfMove_minor: 0
> 	       elimFltSelfMove_minor: 0

I'm considering removing this pass.  In all the regressions and
benchmarks, there is exactly one instance where an optimization is made.
On the other hand, it doesn't seem to be a particularly expensive pass.

> 	 allocateRegisters totals 398.270
> 	    toLiveness totals 209.460
> 	    toNoLiveness totals 0.0
> 	    Assembly.allocateRegisters totals 187.900
> 	       Instruction.allocateRegisters totals 102.470
> 		  pre totals 23.720
> 		  post totals 35.890
> 		  allocateOperand totals 21.870
> 		  allocateFltOperand totals 0.0
> 		  allocateFltStackOperands totals 0.0
> 	       Directive.allocateRegisters totals 24.360

This pass really needs some improvement; toLiveness alone accounts for more
than half of its time.  Hopefully, some of those list improvements will
speed up the toLiveness pass.