[MLton] HOL's Moscow ML implementation, and pushing MLton to emulate it

Fri, 1 Apr 2005 14:16:20 -0500 (EST)

> > I would have thought that the reason that they didn't want this sort
> > of approach is that in addition to having the _code_ for every
> > theory in the executable, this would require the _data_ for every
> > theory resident in the heap at all times.
> 
> That would hurt.  Maybe it could be avoided by loading dummy libraries
> with trivial theorems (of negligible size) for those theories that
> will be unused.  But that definitely doesn't support dynamic loading
> of theories, and so one will need all the theories in an executable
> that exposes an interpreter.

I'm confused, though part of my confusion stems from not knowing exactly
what role a value of type 'thm' plays in the HOL system.  From what
Michael has said, I get the feeling that a 'thm' value has no executable
content, it is simply a term representation.[1] So, why isn't dynamic
loading of theories simply dynamic reading of this term rep from disk?  
Sure, on has to bind those term reps into the interpreter, but that should 
be close to trivial.

Stephen wrote:
> Yes.  The thing to keep in mind is that whatever you want to run fast
> all has to be SML source in the same program.  In your (or your
> users') case, that means the HOL kernel and any accompanying theory
> code.

Right, but the point being that 'thm' values don't _run_, so you can't 
make them any faster by including them in the SML source to be compiled.

Dan Wang's idea of cooperating programs is probably another reasonable way
to go.  The insight there is that there is really no interaction between
code supporting linear arithmetic over reals and code supporting a list
theory.  So, there isn't anything for mlton to exploit when it compiles
the whole program.  I suspect that most of this code is doing a lot of
symbolic manipulation (of 'thm' and 'hol_type' terms), which mlton will
happily improve.  If you were to adopt a cooperating programs aproach, you
would likely need to pass HOL REPL environments back and forth between the
programs, plus time to start a separate-executable (which may itself need
to read theories off of disk), so in terms of end-to-end speed-up, it may
be a wash.

[1]
Now, there is another question about what it is a term representation
_of_.  As far as I can tell, HOL does not have a separate proof checker,
so a 'thm' is not a proof of the theorem, it is simply the statement of
the theorem.  So, the proof of the theorem is an ephemeral object, 
implicit in the execution of the HOL REPL that constructs a 'thm' object.  
There is a sort of meta-theory reasoning principle that allows one to 
conclude that having a 'thm' object is tantamount to the truthhood of the 
theorem it represents.  Now, this opens up a number of other issues about 
actually 'believing' a HOL proof.  If simply having a 'thm' object lets me 
conclude that the theorem is true, then you need to worry about hacking 
the on-disk format to forge 'thm' objects that correspond to false 
statements.  If the proofs disappear, there really isn't a paper trail to 
check.

But, maybe I'm way off with the way HOL is meant to work.