[MLton] CM hacking

Daniel C. Wang danwang@CS.Princeton.EDU
Wed, 03 Mar 2004 20:39:14 -0500


Okay, after a little thinking, I've given up on the GenSML.gen approach, and 
am contemplating a slightly more complicated but preferable approach in the 
long term.

CM is factored so that it doesn't really need to know too much about the 
details of object code, environments, compilation, and linking. These are 
all pretty abstract concepts. If they aren't, I'm going to beat on CM until they 
are. So really there are two phases: "compilation" and "linking". Compilation 
turns sources into object files. Linking combines object files into programs.
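
To make the factoring concrete, here is a minimal sketch in Python (not SML, and 
not CM's actual interface) of a build driver that treats object files as opaque 
values. The names build, compile_unit and link are hypothetical, chosen only for 
illustration.

```python
from typing import Callable, Dict, List, TypeVar

Obj = TypeVar("Obj")    # whatever the backend calls an "object file"
Prog = TypeVar("Prog")  # whatever the backend calls a "program"

def build(sources: Dict[str, str],
          compile_unit: Callable[[str, str], Obj],
          link: Callable[[List[Obj]], Prog]) -> Prog:
    """Drive a build without inspecting object files or programs."""
    # Phase 1: each source becomes one opaque object file.
    objects = [compile_unit(name, text) for name, text in sources.items()]
    # Phase 2: the object files are combined into a program.
    return link(objects)

# A deliberately trivial backend: the "object file" is just the trimmed
# source text, and "linking" is concatenation.
program = build(
    {"a.sml": "val x = 1  ", "b.sml": "  val y = x"},
    compile_unit=lambda name, text: text.strip(),
    link=lambda objs: "\n".join(objs),
)
```

The driver never looks inside an object file, so a different backend can be 
swapped in by passing a different pair of functions.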

Imagine a trivial "compilation" step where you just take the source program 
and do nothing but record some dependency info about free identifiers. Next 
the linking phase takes the object files, parses and alpha-renames them to 
avoid conflicts, and just passes the result off to mlton to deliver an executable.
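
A rough sketch of that trivial compile/link split, in Python and over a toy 
language of "val name = expr" lines rather than real SML. Everything here is an 
assumption made for illustration: the ObjectFile record, the explicit imports 
map, and the name__unit renaming scheme are hypothetical and are not how CM or 
MLton represent things.

```python
import re
from dataclasses import dataclass

VAL = re.compile(r"\s*val\s+(\w+)\s*=\s*(.*)")
IDENT = re.compile(r"[A-Za-z_]\w*")

@dataclass
class ObjectFile:
    name: str
    source: str    # kept verbatim: no real compilation happens
    defined: list  # top-level names the unit binds, in order
    free: set      # names the unit uses but does not bind itself

def compile_unit(name, source):
    """'Compile' a unit by recording its defined and free identifiers."""
    defined, free = [], set()
    for line in source.splitlines():
        m = VAL.match(line)
        if m is None:
            continue
        lhs, rhs = m.groups()
        free.update(t for t in IDENT.findall(rhs) if t not in defined)
        defined.append(lhs)
    return ObjectFile(name, source, defined, free)

def link(objects, imports):
    """Rename every unit's bindings apart and concatenate the result.

    `objects` must be listed in dependency order; `imports` maps a unit
    name to the names of the units whose bindings it may see.
    """
    exports = {}  # unit name -> {original name: renamed name}
    out = []
    for obj in objects:
        # Start from the renamed exports of the imported units only.
        scope = {}
        for dep in imports.get(obj.name, []):
            scope.update(exports[dep])
        missing = obj.free - set(scope)
        if missing:
            raise NameError(f"{obj.name}: unresolved {sorted(missing)}")
        for line in obj.source.splitlines():
            m = VAL.match(line)
            if m is None:
                continue
            lhs, rhs = m.groups()
            new_rhs = IDENT.sub(lambda t: scope.get(t.group(0), t.group(0)), rhs)
            scope[lhs] = f"{lhs}__{obj.name}"
            out.append(f"val {scope[lhs]} = {new_rhs}")
        exports[obj.name] = {d: scope[d] for d in obj.defined}
    return "\n".join(out)

# Units a and c both bind x; b imports only a, so its x must mean a's x.
a = compile_unit("a", "val x = 1\nval y = x + 1")
c = compile_unit("c", "val x = 2")
b = compile_unit("b", "val z = x + y")
program = link([a, c, b], {"b": ["a"]})
```

Without the renaming, plain concatenation in the order a, c, b would make the x 
in b refer to the binding from c. The renamed text is what would then be handed 
to the whole-program compiler.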

The next step will, of course, be to keep enough environment info around so 
you can run the frontend on partial programs and at least do typechecking 
separately.

Once we get this working we can slowly move more work into compilation so 
that the linking phase does less and less work. In the end we can probably 
do all the parsing and elaboration of SML in the compilation phase. Anyway, 
this is potentially no longer a weekend hack, but still very tractable.

The question to the list is: does this idea make sense in the long run for 
MLton? Do people want to adopt CM as the default compilation management 
interface for MLton, and do people want some sort of separate "compilation" 
story for MLton in the long run?

BTW, if we get a bytecode interpreter for MLton, we can reuse CM to manage 
that part of the system too. I have ideas about how we can use a CM-like system 
to mix bytecode and native SML code in a very cute way, by abusing the CM 
notion of a "stabilized" library.

I think hacking most of this is not much more work than refactoring bits of 
CM. The parsing and renaming of SML code can easily reuse large parts of 
MLton. The MLkit could also probably take advantage of a properly abstracted CM.