Congratulations...

Stephen Weeks sweeks@intertrust.com
Wed, 22 Dec 1999 11:02:32 -0800 (PST)


Thanks for the positive report.

> However, there were four bugs in the compiler that I had to work around,
> which was quicker and easier for me than to report the bugs to you and
> wait for them to be fixed. The first three were easy, but the last one
> with a message like
> 
> Bug: value primApply type error
> 
> was difficult to work around. The cause turned out to be a function
> converting a sorted vector to a splay tree with the bug only manifesting
> itself when each element in the vector had a very complex data type.

When you get the time, if you could send the four programs that tickle
the bugs, I would be grateful.  Bugs explicitly reported by the
compiler are usually pretty easy to find and fix -- as opposed to bugs
in the generated executables, which are often very difficult to fix.

> The biggest performance problem with MLton executables is the garbage
> collector which should be rewritten to be generational. The current memory
> management requires almost twice as much RSS as SML/NJ and Harlequin
> MLworks.

Yes, it's bad.  Right now, the runtime shoots for a heap 16 times the
size of the live data.  The only (mildly) mitigating factors are
the max-heap and fixed-heap runtime system options, which limit the
memory usage.  Or, I suppose you could tweak s->liveRatio in gc.c.
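
To make that sizing policy concrete, here is a rough sketch in C of how
a target heap size could be derived from the amount of live data.  This
is not the code in gc.c: the struct, its field names, and the exact
meaning I give to the max-heap and fixed-heap limits (a cap and an exact
size, with 0 meaning "not set") are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy knobs; names are illustrative, not MLton's. */
typedef struct {
    size_t live_ratio;  /* target heap size / live data, e.g. 16 */
    size_t max_heap;    /* assumed: upper bound in bytes, 0 = no bound */
    size_t fixed_heap;  /* assumed: exact size in bytes, 0 = not fixed */
} heap_policy;

/* Return the heap size (in bytes) to aim for after a collection that
 * left live_bytes of live data. */
size_t desired_heap_size(const heap_policy *p, size_t live_bytes) {
    assert(p->live_ratio >= 1);

    /* A fixed heap overrides the ratio entirely. */
    if (p->fixed_heap != 0)
        return p->fixed_heap;

    /* live_bytes * live_ratio, saturating instead of overflowing. */
    size_t want = (live_bytes > SIZE_MAX / p->live_ratio)
                      ? SIZE_MAX
                      : live_bytes * p->live_ratio;

    /* Clamp to the maximum heap size, if one was given. */
    if (p->max_heap != 0 && want > p->max_heap)
        want = p->max_heap;

    return want;
}
```

With a ratio of 16 and no limits, 1000 bytes of live data asks for a
16000-byte heap; lowering the ratio or setting a cap trades more
frequent collections for a smaller footprint.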

It would definitely be nice to have a generational gc.  As for the
compiler, I think it would only require a few easy changes to the
backend.  But having never written a generational gc myself, I don't
have a good feel for how long it would take to write the gc.

> Also, the compilation process is memory consuming, which may be
> unavoidable when doing whole-program analysis. I had to use the
> -no-polyvariance, -flatten, and -inline flags to cut down the memory
> consumption of MLton itself even when using the SML/NJ compiled version
> since it otherwise required about 1GB RAM to compile my sml code.

Unfortunately, I suspect that when using these options (as we had to
for the kit and the self-compile) the performance of the generated
code is pretty bad.  I've looked at the ILs and the generated C in
these cases and there is a lot of stupid stuff going on.