[MLton-devel] Twelf and MLton

Stephen Weeks MLton@mlton.org
Sat, 1 Feb 2003 15:49:13 -0800


Hi guys.  I am following up on the Twelf SML benchmark discussion from
December.  As a reminder, the discussion started with a post by Chris
Richards on comp.lang.ml pointing out a performance problem in SML/NJ
110.41 that was causing Twelf to slow down by a factor of 10, as
compared to SML/NJ 110.40.

	http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&safe=off&threadm=ataiff%24hod%241%40cantaloupe.srv.cs.cmu.edu&prev=/groups%3Fhl%3Den%26safe%3Doff%26group%3Dcomp.lang.ml

Next, Frank sent two mails comparing Twelf as compiled by various SML
compilers.  The first was of five consecutive runs in the same SML
session of a TALT example from Karl Crary.  This showed an exponential
slowdown over the runs for SML/NJ 110.0.3 and 110.40, a linear slowdown
for Poly/ML 4.1.3, and no slowdown for MLton.

>                  #1       #2       #3       #4       #5
> SML/NJ 110.0.3   24.460   24.490   77.660  484.930     ?
> SML/NJ 110.40    13.771   38.188  176.690     ?        ?
> Poly/ML 4.1.3    34.320   53.410   75.810  102.900  128.250
> MLton 20020923    6.500    7.860    6.330   15.800    6.680
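
One rough way to read the table above is to compare each run with the
one before it: steadily growing ratios point to faster-than-linear
growth, while roughly constant differences point to linear growth.  The
following is a minimal Python sketch of that arithmetic.  The timings
are copied from the table; the helper names are illustrative only and
are not part of any tool mentioned here.

```python
# Timings copied from the table above; unknown entries ("?") are omitted.
runs = {
    "SML/NJ 110.0.3": [24.460, 24.490, 77.660, 484.930],
    "SML/NJ 110.40": [13.771, 38.188, 176.690],
    "Poly/ML 4.1.3": [34.320, 53.410, 75.810, 102.900, 128.250],
    "MLton 20020923": [6.500, 7.860, 6.330, 15.800, 6.680],
}

def ratios(times):
    """Ratio of each run to the one before it."""
    return [b / a for a, b in zip(times, times[1:])]

def differences(times):
    """Difference between each run and the one before it."""
    return [b - a for a, b in zip(times, times[1:])]

for name, times in runs.items():
    r = ", ".join(f"{x:.2f}" for x in ratios(times))
    d = ", ".join(f"{x:.1f}" for x in differences(times))
    print(f"{name:15s} ratios: {r}   differences: {d}")
```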

The second mail showed three different examples, without the
consecutive runs.

>                 TALT   Regress  Leak
> SML/NJ 110.0.3  87.050 105.210  24.460
> SML/NJ 110.40   53.270  50.540  13.771    
> Poly/ML 4.1.3  136.570 122.840  34.320
> MLton 20020923  57.460  60.050   6.500

To make sure I understand correctly, the explanation of the slowdowns
from the first email was that there are space leaks in SML/NJ 110.0.3,
110.40, and in Poly/ML 4.1.3.  Also, there is likely a (different)
space leak in SML/NJ 110.42 that caused Chris to make the original
post.

I would like to try to duplicate these numbers on my machine.  I have
successfully compiled Twelf 1.4 alpha on my machine, so all I need is
the Twelf code for the benchmarks.  Can you make the code for (some of)
those benchmarks available?  I would be most interested in Regress,
since that is the one MLton performs worst on.  I would like to add that
one to our benchmark suite.

I also want to make y'all aware of the latest experimental version of
MLton, 20030130, available at http://www.mlton.org/experimental.  This
version has support for the latest basis library spec as well as the
ability to do source-level time and allocation profiling.  I know
Frank had expressed some interest in space profiling, which
we are still thinking about.  In the meantime, I thought you might
like to try out the new time and allocation profiling.  

To whet your appetite, I profiled twelf-server running on all the
examples in the examples dir that comes with Twelf.  Here are the top
10 functions.

                    function                       cur  stack  GC 
------------------------------------------------- ----- ----- ----
whnf  src/lambda/whnf.fun: 281                    10.7% 12.9% 1.4%
startCPUTimer  <basis>/system/timer.sml: 7         6.8%  9.0% 0.0%
whnfRoot  src/lambda/whnf.fun: 238                 4.3% 10.5% 0.6%
time  src/timing/timing.sml: 75                    1.4% 87.1% 5.3%
matchExp  src/cover/cover.fun: 385                 1.1%  5.9% 0.0%
fmtExpW  src/print/print.fun: 338                  0.9%  5.7% 0.3%
assignable  src/compile/assign.fun: 177            0.9%  7.6% 0.6%
rSolve  src/meta/search.fun: 201                   0.8%  9.3% 0.8%
matchClause  src/cover/cover.fun: 528              0.8% 10.5% 0.6%
checkCPUTimer  <basis>/system/timer.sml: 18        0.8%  6.2% 0.0%

In the table, rows correspond to source functions (with name and
source position).  "cur", "stack", and "GC" are, respectively, the
percentage of time spent in that function, the percentage of time
spent with that function on the stack (i.e. in it or a nontail
callee), and the percentage of time spent in garbage collection with
that function on the stack.
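
To make the three columns concrete, here is a toy model in Python.  It
is only a sketch under stated assumptions, not a description of how
MLton's profiler is implemented: it assumes the profile is built from
periodic samples, that each sample records the call stack and whether
the program was in garbage collection, and that a sample taken during
GC counts toward "stack" and "GC" but not toward "cur".  The sample
data is made up; only the function names are borrowed from the table.

```python
from collections import Counter

def profile(samples):
    """Compute (cur, stack, gc) percentages per function.

    Each sample is (stack, in_gc): `stack` lists function names from the
    outermost caller to the function executing, and `in_gc` says whether
    the sample was taken during garbage collection.
    """
    cur, on_stack, gc = Counter(), Counter(), Counter()
    for stack, in_gc in samples:
        if not in_gc and stack:
            cur[stack[-1]] += 1      # the function actually executing
        for f in set(stack):         # count a recursive function once
            on_stack[f] += 1
            if in_gc:
                gc[f] += 1
    n = len(samples)
    return {f: (100 * cur[f] / n, 100 * on_stack[f] / n, 100 * gc[f] / n)
            for f in on_stack}

# Hypothetical samples, for illustration only.
samples = [
    (["time", "whnf"], False),
    (["time", "whnf", "whnfRoot"], False),
    (["time", "whnf"], True),
    (["time"], False),
]
for f, (c, s, g) in sorted(profile(samples).items()):
    print(f"{f:10s} cur {c:5.1f}%  stack {s:5.1f}%  GC {g:5.1f}%")
```

Because a tail call does not leave its caller on the stack, counting
stack membership matches the "in it or a nontail callee" reading of the
"stack" column.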

It is also possible to display a call graph of the profiling data.  I
have attached one to this message.  In the graph, nodes correspond to
source functions and edges correspond to nontail calls.  The three
percentages are as in the table.
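
As an illustration of the graph structure just described, the following
Python sketch derives caller-to-callee edges from a few call stacks and
prints them as a Graphviz DOT digraph.  Everything here is an
assumption made for the example: the stacks are hypothetical, and DOT
is used only as a convenient output format, not because the attached
graph is known to be in that format.

```python
def call_edges(stacks):
    """Collect caller -> callee edges from adjacent stack frames."""
    edges = set()
    for stack in stacks:
        edges.update(zip(stack, stack[1:]))
    return edges

def to_dot(edges):
    """Render the edges as a Graphviz DOT digraph."""
    lines = ["digraph calls {"]
    lines += [f'  "{a}" -> "{b}";' for a, b in sorted(edges)]
    lines.append("}")
    return "\n".join(lines)

# Hypothetical stacks, outermost caller first.
stacks = [["time", "whnf"], ["time", "whnf", "whnfRoot"], ["time"]]
print(to_dot(call_edges(stacks)))
```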

One warning: this version of MLton meets the 2002 basis library spec
(http://standardml.org/Basis/), which differs in some places from the
old spec.  For the time being, mlton has a switch, -basis 1997, that
lets you use the old basis.  You will need to use this until you
update Twelf, which uses some of the basis functions that have
changed.

We would be grateful for any feedback you could provide on MLton
20030130, especially on the new features, since we are planning to do
a public release soon.  We would also be interested to hear if the
profiling tools help you find some low-hanging fruit that lets you
speed up Twelf.

Thanks.

