[MLton-devel] Re: SML Basis review

Sun, 24 Aug 2003 11:40:22 -0700

> By wc-*, I meant the first set of benchmarks (i.e., with neither F nor S
> suffix) where only Imperative IO is used.  MLtonA always beats MLtonB in
> this case.  That was the source of my claim that lifting the Stream IO
> operations to Imperative IO isn't as efficient as the BufferI
> implementation (so long as you stay within Imperative IO).

Makes sense.
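
For anyone reading along, the distinction between staying within
Imperative IO and going through Stream IO looks roughly like the
following.  This is only a sketch using the standard Basis TextIO
functions; it is not the source of the wc benchmarks, and the names
countLinesImp and countLinesFun are made up for illustration.

```sml
(* Illustrative only: not the actual wc benchmark code. *)

(* Imperative IO: each TextIO.input1 call advances the instream in place. *)
fun countLinesImp (ins: TextIO.instream): int =
   let
      fun loop n =
         case TextIO.input1 ins of
            NONE => n
          | SOME #"\n" => loop (n + 1)
          | SOME _ => loop n
   in
      loop 0
   end

(* Stream IO: each StreamIO.input1 call returns the character together
 * with the rest of the stream, which is threaded through the loop. *)
fun countLinesFun (ins: TextIO.instream): int =
   let
      fun loop (s, n) =
         case TextIO.StreamIO.input1 s of
            NONE => n
          | SOME (#"\n", s') => loop (s', n + 1)
          | SOME (_, s') => loop (s', n)
   in
      loop (TextIO.getInstream ins, 0)
   end
```

Both functions take a TextIO.instream, so either can be tried on
TextIO.stdIn or on a string opened with TextIO.openString.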

> > So, it's not clear to me why we use Fast instead of Naive.
> 
> In Oct 2002, when I first introduced the StreamIO functor, Stephen
> claimed:
> I'm not convinced that even with a lot of effort you can get the
> imperative-layered-on-functional approach to be as fast as the current
> imperative approach.
> 
> Granted, since then, I have put a lot of effort into improving the
> StreamIO layer, mostly trying to bring it up to speed with the old stream
> IO layer.  In any event, I'm perfectly happy using Naive instead of Fast.
> There are probably still a few improvements that could be done on the
> StreamIO layer.

I am very impressed at how close Naive has come to Fast.  I had
naively :-) expected that Naive would be closer to SML/NJ speeds and
that most of the reason why we beat them on wcish benchmarks was our
BufferI approach.  But your numbers show otherwise.  We even easily
beat them using the same imperative-layered-on-functional approach
that they do.

I was also playing a bit of Devil's advocate when I said "it's not
clear to me why we use Fast instead of Naive".  Looking at just the
wc* benchmarks, which are certainly the common case, there is still
enough of a gain for Fast that we should keep it.

> Look at the ssa for wc-input1{,F,S}.  wc-input1F.ssa has the following
> datatype:
> 
> instreamP_0 = Buffer_0 of ...
> 	    | Stream_0 of ...
> 
> while neither of the other two have such a datatype.  

Interesting.  I hadn't realized that the wc-input1S could eliminate
the datatype.  I had expected there to initially be a buffer in the
ref and then for it to get switched to a stream, thus confusing the
analysis.  It's nice to see that I was wrong.
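
To make the scenario I had expected concrete, here is a minimal
sketch of a two-variant instream held in a ref.  It is hypothetical
and much simplified: it is not the Basis implementation, the stream
side is modelled as a plain char list, and the field names buf and
pos are assumptions for illustration.

```sml
(* Hypothetical sketch, not MLton's actual Basis code. *)
datatype instreamP =
   Buffer of {buf: string, pos: int ref}   (* imperative buffer *)
 | Stream of char list                     (* stand-in for a functional stream *)

type instream = instreamP ref

(* Imperative read of one character; dispatches on the current variant. *)
fun input1 (ins: instream): char option =
   case !ins of
      Buffer {buf, pos} =>
         if !pos < String.size buf
            then
               let
                  val c = String.sub (buf, !pos)
               in
                  pos := !pos + 1; SOME c
               end
         else NONE
    | Stream [] => NONE
    | Stream (c :: cs) => (ins := Stream cs; SOME c)

(* Asking for the functional view switches the ref from Buffer to Stream. *)
fun getInstream (ins: instream): char list =
   case !ins of
      Buffer {buf, pos} =>
         let
            val rest = List.drop (String.explode buf, !pos)
         in
            ins := Stream rest; rest
         end
    | Stream cs => cs
```

In a program shaped like this, the ref can hold both variants over
its lifetime, which is the situation I had expected to keep the
datatype alive.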

> Well, I wouldn't give too much credit to the wc-*{,S,F} benchmarks.  The
> "work loop" is too small and MLton just optimizes the tight loop.

Yeah.  Maybe it's one of those cliffs that indicates we need better
high-level strategies for optimization like profile direction and
noticing when optimization is quiescing.
