[MLton] free type variables in datatype decs

Stephen Weeks MLton@mlton.org
Wed, 2 Feb 2005 18:58:00 -0800


My take is that the Definition is inconsistent with respect to free
type variables in datatype declarations, and that one can resolve the
inconsistency in (at least) two ways.

1. Add a syntactic restriction to the core that requires tyvars
   occurring on the right-hand side of datatypes to be bound by the
   left-hand side.

2. Extend section 4.6 to include free type variables in datatypes and
   generalize the closure operation in rule 28 to handle such free
   variables.

In the 1990 Definition, the syntactic restriction was there, and it
was removed from the 1997 Definition.  On the other hand, both the
wording in 4.6 and the closure operation in rule 28 are unchanged
between the 1990 and 1997 Definitions.  Making a change (explicitly
removing the restriction) required conscious effort on the part of the
language designers, so I believe the intention was to allow free type
variables in datatypes, and that the unchanged wording in 4.6 and
closure operation in rule 28 were unnoticed errors resulting from this
intention.  I think those errors are easily fixed.

> Section 4.6 (Scope of Explicit Type Variables) states
>
>   "we still have to account for the scope of an explicit type variable 
>   occurring in the ': ty' of a typed expression or pattern or in the 
>   'of ty' of an exception binding.

To fix this, change "an exception binding" to "an exception or
datatype binding".

> I believe that MLton is currently incorrectly implementing Rule 28.
> 
> At issue is the closure operation applied to the value environment 
> elaborated from the conbinding.

Agreed. 

> MLton is applying a slightly modified closure operation (one that
> only generalizes over the type variables of the datatype, which in
> this case is empty)

MLton implements a fixed version of rule 28 where "Clos" is replaced
by "Clos_C", i.e. free type variables aren't closed over, just as in
rule 15.

If one only changes 4.6 but not rule 28, I agree that

> val id = fn x =>
>    let
>       datatype z = Z of 'a
>       val unZ = fn Z x => x
pp>    in
>       unZ (Z x)
>    end;

gives id type scheme #['a,'b].'a -> 'b, which is unsound.


I think MLton's behavior is reasonable (and sound!), given the
inconsistency in the Definition.


BTW, Stefan Kahrs wrote "Mistakes and Ambiguities".  Andreas wrote
"Defects in the Revised Definition" -- and in it notes that the
restriction on core datatype declarations was removed in going from
the 1990 to the 1997 Definition.