[MLton-user] defunctorization

Matthew Fluet matthew.fluet at gmail.com
Tue May 25 14:58:08 PDT 2010


On Tue, May 25, 2010 at 4:32 PM, Jesper Louis Andersen
<jesper.louis.andersen at gmail.com> wrote:
> On Tue, May 25, 2010 at 8:50 PM, Sean McLaughlin <seanmcl at gmail.com> wrote:
>
>> Is there any way to turn that off when I'm debugging, and turn it on again for production compilation?
>
> Matthew will probably correct me if I am wrong: No, there is not.

Correct, there is no way to disable defunctorization.

> Part
> of the lure is that when you strip functors from the program by
> expansion, you make it vastly simpler to compile. Together with
> monomorphisation, which expands polymorhpic functions into monomorphic
> ones, and closure conversion you end up with a first-order monomorphic
> language. The backend of the system relies heavily upon the simplicity
> gained from these transformations.
>
> If you could choose not to expand functors, then the system would have
> to keep track of the parameter to that functor. This in turn
> complicates all the phases of the compiler up till the point you
> eliminate the parameter by making it explicit (like in CC, where a
> higher-order function is turned into a first-order one).

While Jesper is correct that defunctorization and monomorphisation are
crucial for MLton's compilation strategy, there are degrees of
defunctorization that could be exploited.  For example, consider code
like the following:

  functor F() = struct datatype t = T of int; ... end
  structure S1 = F()
  structure S2 = F()

Statically (that is, for type-checking purposes), we need to
differentiate the types of S1.t and S2.t, due to the generativity of
datatype declarations (and opaque signature matches).  However, so
long as the application of F doesn't have any dynamic effects
(allocating an array, allocating a ref, raising an exception, spawning
a thread, capturing a continuation; a conservative check would be that
every value declaration in the body is of a syntactic value, following
the value restriction criteria), then the dynamic behavior of the
above would be equivalent to the dynamic behavior of:

  functor F() = struct datatype t = T of int; ... end
  structure S = F()
  structure S1 = S
  structure S2 = S

In essence, we only need to generate the code for F once, and it could
be shared between S1 and S2.  On the other hand, by duplicating the
code for F, there is the opportunity to optimize S1 independently of
S2.

Even for functors with parameters, there are ways of sharing the code.
 Many ML systems use an implementation strategy where a structure is
(to make an oversimplification) a record of values and a functor is
simply a function from records to records.  MLton doesn't necessarily
duplicate code for something like:

  val F = fn {f, g} => {h = fn z => (f z, g z)}
  val S1 = F {f = fn z => z + 1, g = fn z => z * 1}
  val S2 = F {f = fn z => z + 2, g = fn z => z * 2}

Rather, the code for h (fn z => (f z, g z)) exists in one place, but
each application of F generates a different closure to be used with
the code for h.

So, one could certainly explore other points in the design space
(still within the context of MLton's whole-program compilation),
though it isn't clear how easily this could be explored in MLton's
current implementation.  [Defunctorization occurs during type
checking.  Indeed, one type-checks the body of a functor once when it
is declared and then again each time the functor is applied.]



More information about the MLton-user mailing list