[MLton] CM hacking

Stephen Weeks MLton@mlton.org
Thu, 4 Mar 2004 12:59:38 -0800


> The question to the list is does this idea make sense in the long
> run for MLton? Do people want to adopt CM as the default compilation
> management interface for MLton and do people want some sort of
> separate "compilation" story for MLton in the long run?

We do not want CM as the compilation management interface for MLton.
We currently support CM for two reasons: to make it easier for people
to port from SML/NJ and to have an interim project-management system
until we get our own system in.  Have a look at the thread I sent
earlier for an idea of what that will look like.  Or, if you are
familiar with ML Kit PM files, it is closest to that.

There are a number of reasons why CM is unacceptable: implicit file
ordering, inability to export types and values, weak language for
scoping imports and exports, unnecessary features, frequent changes,
out of our control ...  I could think of more given time and study of
the CM manual.

So, the possible benefits that I see of the work you are thinking of
doing are

1. A better interim solution than cmcat when cmcat isn't sufficient. 
2. Allow people to develop easily with both SML/NJ and MLton, using CM
   as the project management mechanism.

Matthew mentioned some patches that improve on cmcat and handle some
pretty serious code, so I'm not sure if we need much beyond those to
do (1).  (Matthew, should your patches be checked in?)  Although I
assume even with those patches that cmcat doesn't handle scoping, so
there could be something useful there.  It's just that there aren't
that many big SML programs and the lack of scoping doesn't really seem
to be a problem.  It will be once we start handling separate
distribution of libraries, but we then plan to use our mlb files to
handle the scoping.

As to (2), in addition to the fact that cmcat is often sufficient, we
are moving toward a point where one doesn't need SML/NJ at all.  With
the new front end, we are much closer.  The next big step will be the
byte-code compiler.  Also, there are already enough differences
between the libraries and languages that I don't think having different
compilation management systems is one of the bigger impediments to
developing large programs.

As to separate compilation, I prefer not to think of that as a goal,
but rather as a means to an end, since to me it implies a design
choice.  I'd rather think about the goal of fast recompilation, which
is something that MLton lacks.  My current thinking is that we will
address this lack with a byte-code compiler.  I'm not sure exactly how
separately it will compile.  I can imagine a range of options, from
completely separate, requiring no recompilation at all, to something
closer to what SML/NJ does, to something more whole-program.  My gut
feeling is
that the way to go to address the situations where MLton is not
currently acceptable is to do something Scheme-ish, with a universal
type, structures as records, and functors as functions with lookup by
name, requiring recompilation only if a source file changes.
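
To make the "Scheme-ish" idea above concrete, here is a minimal sketch
in Python, used only because it is dynamically typed and so every value
already lives in one universal type.  It is not MLton's design.  All
names (int_order, make_set, UnitCache) are hypothetical, and the
"compiler" passed to the cache is a stand-in.  Structures are records
(dicts), a functor is a function from records to records that looks up
its argument's components by name, and a unit is recompiled only when
its own source text changes.

```python
import hashlib

# Universal type: every value is an ordinary dynamically typed object.
# A structure is a record (dict) from component names to values.
int_order = {"compare": lambda a, b: (a > b) - (a < b)}


def make_set(elem):
    """Functor as a function: argument structure in, result structure out."""
    compare = elem["compare"]  # component found by name at run time

    def member(x, xs):
        return any(compare(x, y) == 0 for y in xs)

    def insert(x, xs):
        return xs if member(x, xs) else xs + [x]

    return {"empty": [], "member": member, "insert": insert}


# Functor application is a plain function call.
int_set = make_set(int_order)


class UnitCache:
    """Recompile a unit only when its own source text changes."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn  # hypothetical stand-in compiler
        self.entries = {}  # unit name -> (source digest, compiled result)
        self.compiles = 0  # number of real compilations performed

    def get(self, name, source):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self.entries.get(name)
        if cached is not None and cached[0] == digest:
            return cached[1]  # source unchanged: reuse the old result
        self.compiles += 1
        result = self.compile_fn(source)
        self.entries[name] = (digest, result)
        return result
```

Because make_set only asks its argument for a component called
"compare" at run time, nothing compiled for make_set depends on how
int_order was compiled.  That independence is what would let a change
to one source file leave the others untouched, at the cost of run-time
lookups.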

So, for the long run, I would prefer to see more work on mlb files
than on CM.  I wouldn't be unhappy to see your stuff work out -- I am
just not convinced the gain is worth the effort.  And it looks like
several weeks, not just a weekend, to me.

> BTW if we get a bytecode interpreter for MLton we can reuse CM to
> manage that part of the system too. I have ideas how we can use a CM
> like system to mix bytecode and native SML code in a very cute way,
> by abusing the CM notion of a "stabilized" library.

The main point of our byte-code compiler is to address the fast
recompilation issue, so I'd rate the goal of mixing byte code with
native code pretty low.  It would also be an immense amount of effort,
given MLton's architecture.  And, I don't see what the benefit would
be.  If you want fast compilation, use byte code; if you want fast
execution, compile the whole program.  With a sufficiently fast
whole-program compiler (and I'd rate MLton as pretty fast) there are
very few situations that these two don't handle.