[MLton] cvs commit: world no longer contains a preprocessed basis library

Stephen Weeks MLton@mlton.org
Wed, 10 Dec 2003 09:46:06 -0800


> In that case, when I write a lib, then we'd
> parse/elaborate/type-check the lib, producing a set of declarations
> that would be included if a final program ever included that lib.

That's exactly what I was thinking.  Preprocessing and -write-lib
don't do any dead-code elimination.  That only happens after we have
the whole program.

> Unsafe or just not-semantics preserving?

Not semantics preserving.  Although, since the basis library
implementation does various unsafe things, a transformation that does
not preserve semantics introduces more chances for unsafety.

> And I'm not sure that file-level granularity will work out.  For example,
> if I use MLton.Random.rand but never use threads, will I be forced to
> include MLton.Thread because basis-library/mlton/mlton.{sig,sml}
> references basis-library/mlton/thread.{sig,sml}?  Or, worse, if we keep
> the Basis2002 structure, then using the basis will include everything,
> because everything is referenced through the basis.sml file.

Right.  File-level doesn't work for the basis library.  And I guess
that is some argument for why it won't work with user libraries as
well.

However, the two examples you give above would go away with the mlb
files, since there is no need (or at least less need) for grouping all
the structures together in a single (MLton or Basis2002) structure.
Instead, one can group things at the mlb level.  In some sense that is
more uniform, since it also allows grouping of functors and
signatures.  So, a programmer could use $/basis.mlb (or whatever
anchor notation we use) instead of "open Basis2002".

> I thought the dead-code criterion was pretty straightforward, for library
> code.

Here's the description from basis-library/README.

----------------------------------------------------------------------
The dead code elimination includes the minimal set of
declarations from the basis so that there are no free variables in the
user program (or basis).  It has a special hack to include all
bindings of the form 
        
        val _ = ...
----------------------------------------------------------------------

The implementation is quite short, at fewer than 100 lines.  So, yes,
it is simple.
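
To make the README's rule concrete, here is a toy model of it.  This is
only a sketch, not the code in MLton: the dec record and the member and
keep functions are names made up for illustration, and a declaration is
reduced to the names it binds, the names it mentions, and whether it
has the form "val _ = ...".

```sml
(* Toy model of a toplevel declaration (hypothetical, for illustration). *)
type dec = {binds: string list, uses: string list, wild: bool}

fun member (x: string, xs) = List.exists (fn y => y = x) xs

(* Walk the declarations from last to first.  `needed` holds the names
 * that are still free in the user program plus everything kept so far.
 * A declaration is kept if it binds a needed name or is `val _ = ...`. *)
fun keep (decs: dec list, free: string list): dec list =
   let
      fun loop ([], _, acc) = acc
        | loop ((d: dec) :: rest, needed, acc) =
             let
                val bindsNeeded =
                   List.exists (fn b => member (b, needed)) (#binds d)
             in
                if #wild d orelse bindsNeeded
                   then
                      let
                         (* Names bound here are resolved; names used
                          * here become needed in turn. *)
                         val needed =
                            List.filter
                            (fn n => not (member (n, #binds d))) needed
                      in
                         loop (rest, #uses d @ needed, d :: acc)
                      end
                else loop (rest, needed, acc)
             end
   in
      loop (rev decs, free, [])
   end

(* Four declarations shaped like: val c, fun inc, val x, val y. *)
val decs: dec list =
   [{binds = ["c"], uses = [], wild = false},
    {binds = ["inc"], uses = ["c"], wild = false},
    {binds = ["x"], uses = ["inc"], wild = false},
    {binds = ["y"], uses = ["inc"], wild = false}]

(* A user program whose only free variable is y. *)
val kept = keep (decs, ["y"])
```

With only y free in the user program, kept holds the declarations of c,
inc, and y; the declaration of x is dropped because nothing refers to
it and it is not a "val _" binding.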

However, it can have pretty strange effects and I'm not sure we'd want
it running on user code.  For example, consider the following toplevel
declarations.

        val c = ref 0
        fun inc (): int = (c := 1 + !c; !c)
        val x = inc ()
        val y = inc ()

Under this dead-code elimination, the value of y depends on whether x
is used or not.  If x is unused, its binding is dropped along with the
call to inc, so y is 1 rather than 2.
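
The two outcomes can be written out by hand.  The structure names Kept
and Dropped below are made up for illustration; Dropped is simply the
same code with the binding of x deleted, which is what the elimination
would do when nothing refers to x.

```sml
(* x is referenced somewhere, so its binding survives. *)
structure Kept =
   struct
      val c = ref 0
      fun inc (): int = (c := 1 + !c; !c)
      val x = inc ()
      val y = inc ()
   end

(* x is unreferenced, so its binding (and its call to inc) is gone. *)
structure Dropped =
   struct
      val c = ref 0
      fun inc (): int = (c := 1 + !c; !c)
      val y = inc ()
   end

val () = print (Int.toString Kept.y ^ "\n")     (* prints 2 *)
val () = print (Int.toString Dropped.y ^ "\n")  (* prints 1 *)
```

Writing the first call as "val _ = inc ()" instead would fall under the
special hack in the README, so that binding would always be kept.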

I think file granularity will be more understandable to a programmer,
not least because it will be easy for MLton to produce the list of
files that it used to build the program.  There is also the example of
CM, which uses file granularity.  Also, we want portability to other
platforms, which is easy to do with files, but not so easy with
declarations after defunctorization.

It is questionable whether the basis library can be made to work with
file granularity, but I would certainly like to try.  And if it can't,
I could believe that having two kinds of dead-code elimination is the
way to go.


This is all still very open, and I don't plan to start implementing
until January at the earliest, so keep the ideas coming.