Some results

Matthew Fluet fluet@research.nj.nec.com
Wed, 26 Jul 2000 18:26:49 -0400 (EDT)


> > Last time I started a self-compile from SML/NJ it was up over
> > half an hour before I killed it off.  
> 
> In what pass was it?  It definitely shouldn't take this long to
> generate C.  You should make sure the flags are set as in the script
> below or as in the Makefile.  For a simple test, from within the src/
> directory, you should be able to do "make nj-mlton && make".

I think it's just that this machine is a 200MHz Pentium Pro.  It
took about 45 minutes for nj-mlton to create the mlton.c file.

I went ahead and "inlined" the Thread_switchTo macro into assembly and
eliminated the GC_switchToThread function and Thread_switchTo1 macro.

Trying out the thread-switch.sml program, the inlined x86-codegen version
is much faster than before (that was a given), but it's still not quite as
fast as the inlined version for the c-codegen, either with or without
global-pseudo-regs.  That's a little surprising to me.  Looking at the
assembly for Thread_switchTo in both the x86-codegen and the c-codegen,
the x86 uses fewer instructions and less memory traffic, as far as I can
tell.  The difference might be that gcc hoists some of the loads, which
helps the pipeline.


Also, looking at an integer-only self-compile:

For the original mlton.c:
grep "\(RD\)\|\(SD\)\|\(Real\)" mlton.c | wc -l  ==>  326

After eliminating a call to Time.toString:
grep "\(RD\)\|\(SD\)\|\(Real\)" mlton.c | wc -l  ==>  86
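
For repeated measurements, the count above could be wrapped in a small
shell function.  This is only a sketch: count_fp_lines is a hypothetical
helper name, and it assumes a grep that supports -E and -c.  The pattern
is the same three alternatives as above, written as an extended regular
expression; -c counts matching lines, which is what piping to wc -l does.

```shell
# Hypothetical helper: count the lines of a generated C file that
# mention RD, SD, or Real (same pattern as the grep above).
count_fp_lines() {
    # -E: extended regex, so alternation is a plain '|'
    # -c: print the number of matching lines instead of the lines
    grep -c -E 'RD|SD|Real' "$1"
}
```

Usage would be "count_fp_lines mlton.c" before and after a change.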

(And, as a bonus, since Time.toMilliseconds doesn't require floating
point, I've set it up so a verbose trace still prints out the time.  The
downside is that this requires a call to IntInf.toString, but I can handle
that.)

The remaining operations on reals seem to be creeping in from the
compiler's basis library files.  For example, there is one call to
Real_Math_ln in the mlton.c file.  As best I can make out, it's
originating in library/basic/real.sml and the line

val ln2 = ln two

Now, for some reason, this isn't being uselessed away, even though the
only use of ln2 is in library/basic/real.sml
   
fun log2 x = ln x / ln2

and I can't find any call to log2 anywhere in mlton.sml.

Since eliminating the Real.fmt call that was hiding behind the
Time.toString got rid of about 3/4 of the floating point operations, I'm
guessing that there isn't any single function that is responsible for the
remaining fp ops.
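
One way to see how the remaining fp ops are spread out would be to
tally the distinct Real_ identifiers in mlton.c.  This is a rough sketch:
list_real_prims is a hypothetical helper name, it assumes a grep with -o,
and it assumes the real primitives all share the Real_ prefix seen in
Real_Math_ln.  It would miss the RD and SD lines counted earlier.

```shell
# Hypothetical helper: tally each distinct Real_* identifier in a
# generated C file, most frequent first.
list_real_prims() {
    # -o prints each match on its own line; uniq -c counts duplicates
    grep -o -E 'Real_[A-Za-z0-9_]+' "$1" | sort | uniq -c | sort -rn
}
```

Usage would be "list_real_prims mlton.c"; a long list of names with small
counts would fit the guess that no single function is responsible.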