[MLton] Re: MLton mailing list

Mon, 08 Mar 2004 12:08:17 +0100

Hi there,

>> I'm currently looking at improving the recompilation system in the
>> ML Kit by serializing import and export environments to disk, which
>> in principle could influence our PM system. One problem is that the
>> improvement requires a dependency analysis, which, as you know, is
>> difficult for Standard ML (because of toplevel value constructors
>> and infix declarations).
>>
> Stephen Weeks <sweeks@sweeks.com> writes:
>
> I'm not sure what you see the problem as, but I'd be happy to discuss
> it.  You might be interested in looking at the output of a new option
> we just added to MLton, -show-def-use, which gives dependency
> information on constructors, variables, types, signatures, structures,
> and functors.

I'm just referring to the problem of determining the identifier status
of value identifiers in patterns. Deciding whether a program unit must
be recompiled after a source modification requires tracking the
identifier status of the identifiers occurring in its patterns.
Consider the two program units A.sml and B.sml:

   A.sml:
     val a = 5;

   B.sml:
     val a = 8;

It would seem perfectly fine to infer that B.sml does not depend on
A.sml. However, consider now modifying the program unit A.sml to read:

   A.sml:
     datatype t = a;

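If A.sml and B.sml are to be understood as one Standard ML program,
this modification must cause B.sml to be re-elaborated: a is now a
value constructor of type t, so the declaration in B.sml matches 8
against a constructor pattern, which of course results in a type
error. In my Ph.D. thesis, the concept of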
elaboration dependence was developed for a subset of Standard ML,
including a simple form of datatypes. See [1, Chapter 4] for some
formal properties. Notice also that your -show-def-use flag for MLton
does not say that B.sml depends on the value identifier a having
identifier status 'v'...
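
The dependence on identifier status can also be silent: a unit may
still elaborate after the change, but with a different meaning. Here is
a minimal sketch of that situation. The structure names Case1 and Case2
and the function f are made up for illustration; they only serve to put
both contexts into one file.

```sml
(* Case1: `a` has identifier status v (an ordinary variable). *)
structure Case1 =
struct
  val a = 5
  (* `a` in the pattern is a fresh variable, so f accepts any argument.
     f : 'a -> string *)
  fun f a = "variable pattern: matches any argument"
end

(* Case2: `a` has identifier status c (a value constructor). *)
structure Case2 =
struct
  datatype t = a | b
  (* The same source text `fun f a = ...` now matches only the
     constructor a, so a second clause is needed to cover b.
     f : t -> string *)
  fun f a = "constructor pattern: matches only a"
    | f _ = "some other constructor"
end

val r1 = Case1.f 8          (* first clause applies to anything *)
val r2 = Case2.f Case2.b    (* falls through to the second clause *)
```

The first clause of f is textually identical in both structures; only
the status of a in the enclosing context decides what it means, which is
why a dependency analysis has to record it.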

>> So what happened to the ML Basis files - it seemed like a good idea?
>
> I agree :-).  I still plan to do it at some point.  I spent most of my
> free time over the last 6 months working on MLton's new front end.
> BTW, I'd love to get your feedback on trying that out on the ML Kit.
>
> My current plan is to do the mlb stuff this summer.  But I'm not sure
> if I will find the time.

[The following discussion is partly a follow-up to
http://www.mlton.org/pipermail/mlton/2003-July/014395.html and
http://www.mlton.org/pipermail/mlton/2003-July/014421.html]

I have decided to implement something like the mlb stuff for use with
the ML Kit. I have now got the serialization of elaboration and
compiler bases into place (see [2]) and now I need a language for
specifying dependencies between program units. The language should
satisfy the following properties:

 1. The language should be mostly orthogonal to the Definition of
    Standard ML.

 2. The static semantics of the language should be clearly defined
    (e.g., with inference rules in the style of the Definition and
    scope rules for infix directives). The semantics can refer to
    judgments of the form B |- topdec => B', where topdec is a
    Standard ML top-level declaration and B and B' are static bases.

 3. The dynamic semantics of the language should be clearly defined
    (e.g., with inference rules in the style of the Definition and
    scope rules for infix directives). The semantics can refer to
    judgments of the form B_dyn |- topdec => B_dyn', where topdec is a
    Standard ML top-level declaration and B_dyn and B_dyn' are dynamic
    bases.

 4. The language should support binding of bases (e.g., with basis
    identifiers). In principle, augmenting the PM system with this
    feature allows for any acyclic dependency graph to be described.

 5. Support for renaming of structure, signature, and functor
    identifiers would be great, but not very important.

With these properties, it would be possible to construct a tool
(mlbdep) that takes an mlb-file, reads all the .sml-files, and
constructs a new optimized mlb-file with the same static and dynamic
semantics...

A few comments on the static semantics of mlb-files as suggested in
http://www.mlton.org/pipermail/mlton/2003-July/014421.html: Perhaps it
would be nicer not to modify the static basis of the Definition but
instead introduce a new basis map (M, say) mapping basis identifiers to
bases. Also, it would be nice to have the expansion be part of the
specification. BTW, it appears that your mlb-semantics allows nested
bases, like {bid1 -> {bid2 -> B, F, G, E}, F', G', E'}, to implement
the mlb-file caching performed by the expansion. Was this the
intention?
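
To make the basis-map suggestion a bit more concrete, here is a rough
sketch of how the semantic objects and two rules might look. Everything
in it is an assumption for illustration only: the names Bid and
BasisMap, the phrase classes bdec and bexp, the concrete syntax
"basis bid = bexp", and the shape of the judgments are not taken from
the proposal in the thread.

```latex
% Hypothetical semantic objects: a finite map from basis identifiers
% to ordinary static bases of the Definition.
\[
  M \in \mathrm{BasisMap} = \mathrm{Bid} \stackrel{\mathrm{fin}}{\rightarrow} \mathrm{Basis}
\]

% Hypothetical judgment form for basis declarations: the static basis B
% of the Definition is left unmodified and M is carried alongside it.
\[
  M, B \vdash \mathit{bdec} \Rightarrow M', B'
\]

% A Standard ML top-level declaration is elaborated by the Definition's
% own judgment and contributes nothing to the basis map.
\[
  \frac{B \vdash \mathit{topdec} \Rightarrow B'}
       {M, B \vdash \mathit{topdec} \Rightarrow \{\}, B'}
\]

% Binding a basis identifier (assumed syntax) records the resulting
% basis in the basis map only; the premise uses an assumed judgment
% for basis expressions.
\[
  \frac{M, B \vdash \mathit{bexp} \Rightarrow B'}
       {M, B \vdash \mathtt{basis}\ \mathit{bid} = \mathit{bexp}
          \Rightarrow \{\mathit{bid} \mapsto B'\}, \{\}}
\]
```

In a formulation along these lines, the bases stored in M are plain
static bases without basis maps of their own, so nested bases would not
arise.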

It would be great if we could end up agreeing on a language and a
semantics.

Cheers,

Martin

[1] Martin Elsman. Program Modules, Separate Compilation, and
Intermodule Optimisation. PhD thesis. Revised. Department of Computer
Science, University of Copenhagen. January 1999. Available from
http://www.it.edu/people/mael/mypapers/phd.ps

[2] Martin Elsman. Type-Specialized Serialization with Sharing. IT
University of Copenhagen. IT University Technical Report
Series. TR-2004-43. February 2004. Available from
http://www.it.edu/people/mael/mypapers/ITU-TR-2004-43.pdf