[MLton] free type variables in datatype decs

Matthew Fluet fluet@cs.cornell.edu
Wed, 2 Feb 2005 14:35:36 -0500 (EST)


I was going to 'tit-for-tat' Andreas with a HaMLet bug report, noting that 
HaMLet does not accept the following program:

val id = fn x =>
   let
      datatype z = Z of 'a
      val unZ = fn Z x => x
   in
      unZ (Z x)
   end;

MLton accepts the program, though I am not entirely convinced that it is 
in keeping with the Definition.  Section 4.6 (Scope of Explicit Type 
Variables) states

  "we still have to account for the scope of an explicit type variable 
  occurring in the ': ty' of a typed expression or pattern or in the 
  'of ty' of an exception binding.  For the rest of this section, we 
  consider such free occurrences of type variables only."

That would appear to (implicitly) rule out scoping an explicit type 
variable from a datatype binding.

Interestingly, the Definition explicitly rules out free type variables in
datatype descriptions in the syntax of Modules, though there is no
analogous restriction in the syntax of the Core.

Andreas' "Mistakes and Ambiguities" posits that the analogous restriction
in the syntax of the Core is necessary; "without it the type system would
be unsound."  I am convincing myself that this is the case.

I believe that MLton is currently incorrectly implementing Rule 28.

At issue is the closure operation applied to the value environment 
elaborated from the conbinding.

Suppose for the moment that one expanded the criteria in "Scope of 
Explicit Type Variables"; then we would attempt to elaborate

val 'a id = fn x =>
  let
     datatype z = Z of 'a
     val unZ = fn Z x => x
  in
     unZ (Z x)
  end;

Applying Rule 29, we obtain the value environment
  VE = {Z -> ('a -> z, c)}

We should apply the closure operation to this to obtain
  Clos VE = {Z -> (#['a].'a -> z, c)}

However, MLton is applying a slightly modified closure operation (one that 
only generalizes over the type variables of the datatype, which in this 
case is empty), yielding
  {Z -> ('a -> z, c)}

If one continues with the Definition's resulting value environment, then 
unZ is elaborated with type  #['b].z -> 'b  and the entire id is 
elaborated with the type  #['a,'b].'a -> 'b.  It is clear how one can 
construct unsound programs in light of such a "cast" function.