[MLton-user] Mostly MLton performance.

Johan Grönqvist johan.gronqvist@gmail.com
Sat, 25 Mar 2006 20:10:48 +0100


Hi,

I am trying out MLton and am very impressed. I now have a floating-point
intensive application that runs in MLton at about the same speed as in
C++ (compiled with -O2) when run once. It is a very simple one that
could very well be coded in Fortran, but as I find SML more elegant than
the Fortran I have seen, I like to use it instead.

However, there are some things I wonder about, and I would be interested
in knowing why MLton behaves this way.

If I run the MLton-compiled program several times in a row, the running
time increases significantly, as in:

johan@shallow-blue:~/lek/sml$ for ((ix = 0; ix < 10 ; ix++)) do { time ./canti/allAtom2d; } ; done ;

real    0m49.535s
user    0m41.688s
sys     0m0.084s

...

real    1m41.737s
user    1m15.934s
sys     0m0.773s

real    1m20.823s
user    1m1.709s
sys     0m0.206s

I have not seen this with other compilers.

My first guess was that the GC might behave differently in different
runs, but fixed-heap did not eliminate the difference, and using
gc-summary I can see that the GC only takes slightly more time in the
longer runs (the differences in "total GC time" are less than one
second). When the program is run on a computer that has "rested" (low
load average last minute or so) it seems to always take roughly the same
time.

The second point is that when removing checks for nan-ness on reals, I
get a significant speedup. I would like to use nan-ness as a kind of
flag (instead of real option) as I expected it not to give me a large
penalty. In C++ using gcc the penalty seems much smaller (around 4
seconds) compared to MLton (around 15 seconds), and my gcc-compiled
program produces reasonable results.
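
For concreteness, the NaN-as-flag pattern looks roughly like this when
written in C. This is only a sketch and not the actual application: the
names missing_value and sum_present are made up for illustration. In
SML the corresponding test would be Real.isNan.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: a NaN stands for "no value here",
   playing the role that NONE would play in a real option. */
double missing_value(void)
{
    return NAN;
}

/* Sum only the entries that are present, i.e. not flagged with NaN.
   The number of present entries is written to *n_present if non-NULL. */
double sum_present(const double *xs, size_t n, size_t *n_present)
{
    assert(xs != NULL || n == 0);

    double total = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* This per-element check is the one whose cost is in question. */
        if (!isnan(xs[i])) {
            total += xs[i];
            count++;
        }
    }
    if (n_present != NULL)
        *n_present = count;
    return total;
}
```

The alternative without NaN flags would carry a separate "present" bit
for every element, which is what a real option amounts to.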

In a post to the devel-list Stephen Weeks wrote about a faster version
of Real.abs. Could this be related to my question? Is there a faster way
of doing things that is not used due to standard compliance? Is this
discussed somewhere?

As a third topic, I have used the FFI to connect to the GSL (GNU
Scientific Library) in order to use Bessel functions and other special
functions. As long as they return doubles that is no problem, but there
are more interesting things available there.

I think I read in a recent message to the list that the FFI is being
reworked. I would be interested in knowing if it is an aim to extend the
number of types that can be passed as parameters to C functions. In
particular I would be interested in records or tuples and functions
(i.e., as C function pointers). Specifically, what I have in mind are
functions returning a result and an error estimate (as tuples or
records) and minimizers (taking (among other things) a function as a
parameter).
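
To make the two shapes concrete, here is a small C sketch of the kind
of interface I mean. It is loosely modelled on the value-plus-error
struct and the function-pointer-plus-params callback style used by GSL,
but every name below (value_with_error, exp_series, objective_fn,
minimize_golden, shifted_parabola) is a hypothetical stand-in and not
part of GSL or of MLton's FFI.

```c
#include <assert.h>

/* Shape 1: a result together with an error estimate. */
typedef struct {
    double val;
    double err;
} value_with_error;

/* Partial sum of the Taylor series of exp(x) using n_terms terms.
   err is the magnitude of the first omitted term, a crude estimate. */
value_with_error exp_series(double x, int n_terms)
{
    assert(n_terms > 0);
    double term = 1.0, sum = 0.0;
    for (int k = 0; k < n_terms; k++) {
        sum += term;
        term *= x / (k + 1);
    }
    value_with_error r = { sum, term < 0.0 ? -term : term };
    return r;
}

/* Shape 2: a minimizer that takes the objective as a function pointer. */
typedef double (*objective_fn)(double x, void *params);

/* Golden-section search for a minimum of f on [lo, hi]; assumes f is
   unimodal there. Stops when the bracket is narrower than tol. */
double minimize_golden(objective_fn f, void *params,
                       double lo, double hi, double tol)
{
    assert(lo < hi && tol > 0.0);
    const double inv_phi = 0.6180339887498949; /* (sqrt(5) - 1) / 2 */
    double a = lo, b = hi;
    double c = b - inv_phi * (b - a);
    double d = a + inv_phi * (b - a);
    double fc = f(c, params), fd = f(d, params);
    while (b - a > tol) {
        if (fc < fd) {          /* minimum lies in [a, d] */
            b = d; d = c; fd = fc;
            c = b - inv_phi * (b - a);
            fc = f(c, params);
        } else {                /* minimum lies in [c, b] */
            a = c; c = d; fc = fd;
            d = a + inv_phi * (b - a);
            fd = f(d, params);
        }
    }
    return 0.5 * (a + b);
}

/* Example objective: (x - centre)^2, with centre passed via params. */
double shifted_parabola(double x, void *params)
{
    double centre = *(const double *)params;
    return (x - centre) * (x - centre);
}
```

From the SML side, the first shape would ideally come back as a record
or tuple of two reals, and the second would need an SML function to be
passed where the C code expects an objective_fn.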

Thanks for any help!

I also want to thank you for the good compiler as well as the very
helpful topics on the homepage (for-loops, printfs and that sort of
thing), showing that with MLton I can get all that I am used to, but in
a slightly more general and type-checked form.

/ johan


Ps. In the "fold" page on the MLton homepage (linked from the for-loop
page), the term "eta expand" is used without any reference. Is this the
same as the eta-reduction mentioned in the Wikipedia article on lambda
calculus? In that case I think that a link to the Wikipedia article
might be helpful for beginners like me.
Ds.