[MLton-devel] nucleic benchmark

Matthew Fluet fluet@CS.Cornell.EDU
Thu, 7 Nov 2002 11:52:12 -0500 (EST)


> MLton0 -- mlton.cvs.HEAD -native true
> MLton1 -- mlton.cvs.HEAD -native false
> SML/NJ -- SML/NJ
> run time ratio
> benchmark   MLton1  SML/NJ
> nucleic       1.23    0.61

I took a very brief look at a time profile for nucleic.  As expected,
the lion's share of the time is spent in floating-point-intensive blocks
(>20 f.p. primitive ops).  I kept the assembly from both the native codegen
(.S files) and the C codegen (.s file).  Interestingly, while gcc's output
contains the same number of "real" arithmetic instructions, it is
significantly better at managing the f.p. register stack and reducing
memory traffic: the native codegen's output contains roughly 13 times as
many f.p. loads/stores and 23 times as many fxch instructions.

[fluet@lennon temp]$ grep "\(fmul\)\|\(fadd\)\|\(fsub\)\|\(fdiv\)" *.S | wc -l
    207
[fluet@lennon temp]$ grep "\(fmul\)\|\(fadd\)\|\(fsub\)\|\(fdiv\)" *.s | wc -l
    207
[fluet@lennon temp]$ grep "\(fld\)\|\(fst\)" *.S | wc -l
  12825
[fluet@lennon temp]$ grep "\(fld\)\|\(fst\)" *.s | wc -l
   1007
[fluet@lennon temp]$ grep "\(fxch\)" *.S | wc -l
   6612
[fluet@lennon temp]$ grep "\(fxch\)" *.s | wc -l
    290

One thing to note is that the native codegen performs every move of a
floating-point value through the floating-point registers, so just copying
elements from one tuple to another requires bouncing through a float reg.
Back in Jan. 2002, I looked into making some of those mem-mem moves go
through integer registers instead; the results went both ways -- nucleic
sped up by 8%, mandelbrot slowed down by 13%, and everything else was
pretty minor, so the change never stayed in the code.
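
To make the mem-mem move point concrete, here is a minimal C sketch of the
two ways to copy f.p. tuple elements.  It is only an illustration, not
MLton's codegen: the names copy_via_fp and copy_via_int are made up, and
what instructions you actually get depends on the compiler, flags, and
target (an optimizing compiler may well turn both into the same code).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy as floating-point values.  On an x87 target a compiler may
   implement each element copy as a load/store pair (fld/fstp) that
   bounces through the f.p. register stack. */
void copy_via_fp(double *dst, const double *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

/* Copy the raw 64-bit pattern instead, never treating it as a float.
   memcpy avoids strict-aliasing problems; on a 32-bit target this can
   become two 32-bit integer moves per element. */
void copy_via_int(double *dst, const double *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        uint64_t bits;
        memcpy(&bits, &src[i], sizeof bits);
        memcpy(&dst[i], &bits, sizeof bits);
    }
}
```

For ordinary values both functions leave the same bytes in dst.  The
integer path keeps the f.p. stack free, but it costs integer registers,
which might be part of why the experiment helped some benchmarks and hurt
others.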



_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel