Some results

Matthew Fluet fluet@research.nj.nec.com
Tue, 25 Jul 2000 19:34:38 -0400 (EDT)


> Actually, you might look into trying to modify MLton just enough so
> that it doesn't use any Real_ prims.  It might just be possible to try a
> self-compile without floating point.

I think the major source of floating point in MLton is going to be the
tracing/timing portions.  I'll look into turning some of those
functions into no-ops.  Any advice for playing around with compiling the
compiler?  The last time I started a self-compile from SML/NJ, it had been
running for over half an hour before I killed it.
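
As a rough illustration of the "turn the timing functions into no-ops" idea, here is a minimal sketch in Python rather than SML.  Everything in it (the name `timed`, the `enabled` switch, the `log` list) is hypothetical and is not MLton's actual tracing interface.  With the switch off, the wrapper only calls the wrapped function; with it on, it records elapsed time as an integer nanosecond count, so no floating point is involved either way.

```python
import time

def timed(name, thunk, enabled=False, log=None):
    """Run thunk(); record (name, elapsed_ns) in log only when enabled.

    Hypothetical helper for illustration, not MLton's tracing code.
    """
    if not enabled:
        # No-op path: no clock reads and no arithmetic at all.
        return thunk()
    start = time.perf_counter_ns()  # integer nanoseconds, no floats
    try:
        return thunk()
    finally:
        if log is not None:
            log.append((name, time.perf_counter_ns() - start))
```

Calling `timed("pass", f)` with the default `enabled=False` behaves exactly like calling `f()`.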

> > So, you can see that for a native compile, we're really doing three
> > invocations of mlton (two of which make calls to gcc), plus two
> > invocations of the assembler.  I suspect that adds up.
> 
> Yeah.  One thing that would be interesting to see would be the time
> for your pass to generate assembly from machineOutput.  That should
> give us a good feel for the speedup.  
> 

It's pretty quick, at least judging from the traces.  Assembly takes almost
no time -- it probably spends most of that stripping out all the
comments.  I've also finished converting the C-stub file into assembly, so
now I generate a single assembly file for the whole program.


> > Looking at the
> > compile time for some of the larger benchmarks, there is some decent
> > improvement.  I think we'll see even better performance when we can make
> > the assembler call from within mlton. 
> 
> Separate assembly will help a lot too.

I don't know what the trade-offs are there -- smaller individual files
versus more external calls.  It shouldn't be too difficult to play with.
We'll have to decide on how exactly this codegen should interact with the
driver portion of the compiler.

> I think the times are pretty good for round one.  My impression from
> what you wrote is that you feel the lack of liveness information is
> killing us.  There are three solutions to this that I see:
> 
> (1) Do some local liveness analysis on your IL
> (2) Propagate liveness information from the Cps IL down.
> (3) Change the register allocator (backend/allocate-registers.fun) to
>     enforce some invariants that let you know when you can throw away
>     pseudo-regs.
> 
> (1) seems silly, since we already computed the information on the Cps
> IL, and I don't see how you could compute anything better.
> 
> (2) seems feasible to me.  The Cps IL has liveness information at
> every label, and so should be able to give you liveness information at 
> every block.
> 
> (3) Might be ok for a quick hack.

I agree that (1) is probably overkill.  Besides, if we really tackled all
of liveness there, I'd be tempted to work on a more robust register
allocator.

Some combination of (2) and (3) would seem to make sense.
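
To make the discussion concrete, here is a minimal sketch of how block-level liveness (as option (2) would hand down) could be refined to per-instruction liveness with a single backward scan, which is enough to tell when a pseudo-reg can be thrown away.  It is written in Python rather than SML purely for illustration.  The instruction shape (a pair of def and use sets) and the function names are assumptions, not the actual IL or any MLton code.

```python
from typing import List, Set, Tuple

# Hypothetical instruction shape: (defs, uses), each a set of
# pseudo-register names.
Instr = Tuple[Set[str], Set[str]]

def live_before_each(block: List[Instr], live_out: Set[str]) -> List[Set[str]]:
    """Return the pseudo-regs live immediately before each instruction.

    live_out is the set live at the end of the block, assumed to be
    supplied from elsewhere (e.g. propagated down from an earlier IL).
    """
    live = set(live_out)
    result: List[Set[str]] = [set() for _ in block]
    # Walk backwards: a def kills liveness, a use creates it.
    for i in range(len(block) - 1, -1, -1):
        defs, uses = block[i]
        live = (live - defs) | uses
        result[i] = set(live)
    return result

def dead_after(block: List[Instr], live_out: Set[str]) -> List[Set[str]]:
    """For each instruction, the pseudo-regs it uses that are dead afterwards."""
    before = live_before_each(block, live_out)
    out: List[Set[str]] = []
    for i, (_defs, uses) in enumerate(block):
        live_after = before[i + 1] if i + 1 < len(block) else set(live_out)
        out.append(uses - live_after)
    return out
```

For a block that defines `a`, then `b` from `a`, then `c` from `a` and `b`, and finally uses `c`, with nothing live on exit, `dead_after` reports that `a` and `b` die at the third instruction and `c` at the fourth.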