[MLton] free type variables in datatype decs

Matthew Fluet fluet@cs.cornell.edu
Thu, 3 Feb 2005 08:32:15 -0500 (EST)


> > In the 1990 Definition, the syntactic restriction was there, and it
> > was removed from the 1997 Definition.  On the other hand, both the
> > wording in 4.6 and the closure operation in rule 28 are unchanged
> > between the 1990 and 1997 Definitions.  Making a change (explicitly
> > removing the restriction) required conscious effort on the part of the
> > language designers, so I believe the intention was to allow free type
> > variables in datatypes, and that the unchanged wording in 4.6 and
> > closure operation in rule 28 were unnoticed errors resulting from this
> > intention.
> 
> I don't think so, for several reasons: The change is subtle, with subtle 
> consequences, but absolutely no practical relevance - the authors 
> usually were very conservative about changes of this kind. Furthermore, 
> it is not listed as a change in Appendix G. Also, I cannot believe that 
> the authors simply forgot to make at least the obvious adaptions in 
> other parts.

I tend to agree with Andreas.

> > I think MLton's behavior is reasonable (and sound!), given the
> > inconsistency in the Definition.
> 
> Agreed.

Perhaps.  But it appears that MLton is going out of its way to choose a 
solution at odds with every other SML compiler.  And there is very little 
practical benefit to MLton's behavior over the no-free-type-vars solution.

Note that the example given at mlton.org/FunctionalRecordUpdate

fun << ({a, b, c}, (f, z)) =
   let
      datatype t = A of 'a | B of 'b | C of 'c
      fun g h z =
         {a = case h z of A a => a | _ => a,
          b = case h z of B b => b | _ => b,
          c = case h z of C c => c | _ => c}
   in
      f {a = g A, b = g B, c = g C} z
   end


can be rewritten as

fun << ({a, b, c}, (f, z)) =
   let
      datatype ('a, 'b, 'c) t = A of 'a | B of 'b | C of 'c
      fun g h z =
         {a = case h z of A a => a | _ => a,
          b = case h z of B b => b | _ => b,
          c = case h z of C c => c | _ => c}
   in
      f {a = g A, b = g B, c = g C} z
   end

without having to lift the datatype to the top level.  You retain the 
benefit of a local datatype declaration and use.  The only thing you lose 
is the guarantee that the datatype is used only at the function's own 
type variables; with explicit parameters it can also be instantiated at 
other types.
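
As a rough sketch of that last point (the names demo, fromL, L and R are
made up for illustration and do not come from the MLton page), a local
datatype with explicit parameters can be used both at the types of the
enclosing function's arguments and at an unrelated type such as int.
Under the free-type-variable reading, the last use would presumably be
rejected, since the constructor arguments would be tied to the function's
own type variables.

```sml
fun demo (x, y) =
   let
      (* Explicitly parameterised, like the rewritten << example. *)
      datatype ('a, 'b) t = L of 'a | R of 'b
      (* Return the payload of L, or the supplied default for R. *)
      fun fromL (L v, _) = v
        | fromL (R _, d) = d
      (* Intended use: at the types of the function's own arguments. *)
      val first = fromL (L x, x)
      val second = fromL (R x, y)
      (* Also accepted: the same constructor at an unrelated type (int). *)
      val extra = fromL (L 0, 1)
   in
      (first, second, extra)
   end
```

Here demo should get the type 'a * 'b -> 'a * 'b * int, and the local
type t does not escape, so the declaration stays private to the function.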