[MLton-devel] Fwd: Re: pretty damn good

Matthew Fluet <fluet@CS.Cornell.EDU>
Tue, 5 Nov 2002 10:45:10 -0500 (EST)


> Ratio of C back end with default gcc options/native back end:
> 
> (19.935+2.435+19.949+2.433)/(16.923+2.240+16.845+2.343)=1.1669056869
> 
> Ratio of C back end with new gcc options/native back end:
> 
> (17.816+2.273+17.707+2.369)/(16.923+2.240+16.845+2.343)=1.0472999400
> 
> So you're losing 12% just by having the wrong gcc options.  If you use
> computed gotos, then the best gcc options might change again; but I'd give
> up 10-20% performance to get MLton to run on more machines.
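
For anyone who wants to recheck the quoted arithmetic, here is a minimal
sketch in C.  The array and function names are my own labels for the three
configurations; the timings are exactly the ones in the quoted message.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n timings (in seconds). */
double total_seconds(const double *t, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return sum;
}

/* Ratio of the total run time of one configuration to a baseline. */
double time_ratio(const double *cfg, const double *base, size_t n) {
    double b = total_seconds(base, n);
    assert(b > 0.0);  /* a baseline with no run time makes no sense */
    return total_seconds(cfg, n) / b;
}

/* Timings copied from the quoted message; the names are hypothetical. */
const double native_times[4]    = {16.923, 2.240, 16.845, 2.343};
const double c_default_times[4] = {19.935, 2.435, 19.949, 2.433};
const double c_tuned_times[4]   = {17.816, 2.273, 17.707, 2.369};
```

Calling time_ratio(c_default_times, native_times, 4) and
time_ratio(c_tuned_times, native_times, 4) gives the two ratios quoted
above.  Their difference is about 0.12, which is where the "12%" comes
from: it is measured in units of the native back end's run time.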

nucleic isn't really the right benchmark for a general comparison between
the native and C codegens in MLton, because it is so floating-point
intensive.  (But it may well be the right one for Brad's application
domain.)  Floating-point support and optimizations in the native codegen
are there and stable, but a little ad hoc in places; there are certainly
improvements that could be made.  I do feel compelled to point out that
nucleic is the one benchmark in which SML/NJ totally trounces MLton; the
run-time ratio of SML/NJ (110.41) to MLton native is 0.4.  Now, given the
numbers above, it doesn't appear that gcc is able to do significantly
better than the native codegen in scheduling and register allocating the
floating-point instructions, and whatever win might be there is lost to
the more general native vs. C-codegen tradeoffs.  (As Steve just pointed
out, I think the biggest win of native over the C codegen is better
may-alias information, which asserts that the heap-allocated MLton control
stack doesn't alias other heap-allocated data structures.)
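
To illustrate the general effect of may-alias information (this is a toy
example in C, not what MLton's C codegen emits; the function and parameter
names are hypothetical), compare two versions of the same loop.  In the
first, the compiler has to assume that a store through heap_obj might
change *stack_slot; in the second, C99 restrict promises that it cannot.

```c
#include <stddef.h>

/* No alias information: a store to heap_obj[i] might modify *stack_slot,
   so the compiler generally has to reload *stack_slot on every iteration. */
void scale_may_alias(double *heap_obj, const double *stack_slot, size_t n) {
    for (size_t i = 0; i < n; i++)
        heap_obj[i] *= *stack_slot;
}

/* With restrict the caller promises the two pointers never refer to
   overlapping storage, so *stack_slot may be kept in a register for the
   whole loop. */
void scale_no_alias(double *restrict heap_obj,
                    const double *restrict stack_slot, size_t n) {
    for (size_t i = 0; i < n; i++)
        heap_obj[i] *= *stack_slot;
}
```

Both functions compute the same result when the arguments really don't
overlap; the only difference is what the compiler is allowed to assume.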

In any event, the fact that SML/NJ is so much faster on nucleic than
MLton, and that the blame can't entirely be placed on the native codegen,
suggests that there might be some style of optimization missing in MLton.

Also, it should be pointed out that, while we've discussed it before and
have the very barest outline of a framework in place, nothing keeps 64-bit
floating-point values double-word aligned, either during heap allocation,
during a GC move, or on the control stack.  We've often attributed wild
swings in runtimes of floating-point benchmarks to this fact, although
we've never seriously investigated it.
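
For concreteness, here is a minimal sketch in C of what keeping a value
double-word aligned at allocation time could look like.  It is not MLton's
runtime: bump_heap, its fields, and the function names are all
hypothetical, and it ignores GC moves and the control stack entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align is a power of two). */
uintptr_t align_up(uintptr_t addr, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(align - 1);
}

/* True if p sits on an 8-byte (double-word) boundary. */
int is_double_aligned(const void *p) {
    return ((uintptr_t)p % 8) == 0;
}

/* Hypothetical bump allocator: frontier is the next free byte and
   limit is one past the end of the heap. */
typedef struct {
    unsigned char *frontier;
    unsigned char *limit;
} bump_heap;

/* Allocate bytes starting on an 8-byte boundary, skipping padding if
   needed.  Returns NULL when the request does not fit. */
void *alloc_double_aligned(bump_heap *h, size_t bytes) {
    uintptr_t start = align_up((uintptr_t)h->frontier, 8);
    uintptr_t limit = (uintptr_t)h->limit;
    if (start > limit || bytes > limit - start)
        return NULL;
    h->frontier = (unsigned char *)(start + bytes);
    return (void *)start;
}
```

The cost is up to 7 bytes of padding per allocation, and a moving collector
would have to preserve the same invariant when it copies objects.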



