[MLton] free type variables in datatype decs

Andreas Rossberg rossberg@ps.uni-sb.de
Thu, 03 Feb 2005 11:57:19 +0100


Matthew Fluet wrote:
> I was going to 'tit-for-tat' Andreas with a HaMLet bug report

Oh, I hope I'm not being too annoying... :)

> Interestingly, the Definition explicitly rules out free type variables in
> datatype descriptions in the syntax of Modules, though there is no
> analogous restriction in the syntax of the Core.

Note that it was there in the SML'90 edition.

> Andreas' "Mistakes and Ambiguities" posits that the analogous restriction
> in the syntax of the Core is necessary; "without it the type system would
> be unsound."  I am convincing myself that this is the case.

The simplest example possibly is:

   datatype t = C of 'a
   fun cast x = case C x of C x' => x'
   val ouch = 1 + cast "string"

It not even has to be local.


Stephen Weeks wrote:
> My take is that the Definition is inconsistent with respect to free
> type variables in datatype declarations, and that one can resolve the
> inconsistency in (at least) two ways.
> 
> 1. Add a syntactic restriction to the core that requires tyvars
>    occurring on the right-hand side of datatypes to be bound by the
>    left-hand side.
> 
> 2. Extend section 4.6 to include free type variables in datatypes and
>    generalize the closure operation in rule 28 to handle such free
>    variables.

This is not enough, see below.

> In the 1990 Definition, the syntactic restriction was there, and it
> was removed from the 1997 Definition.  On the other hand, both the
> wording in 4.6 and the closure operation in rule 28 are unchanged
> between the 1990 and 1997 Definitions.  Making a change (explicitly
> removing the restriction) required conscious effort on the part of the
> language designers, so I believe the intention was to allow free type
> variables in datatypes, and that the unchanged wording in 4.6 and
> closure operation in rule 28 were unnoticed errors resulting from this
> intention.

I don't think so, for several reasons: The change is subtle, with subtle 
consequences, but absolutely no practical relevance - the authors 
usually were very conservative about changes of this kind. Furthermore, 
it is not listed as a change in Appendix G. Also, I cannot believe that 
the authors simply forgot to make at least the obvious adaptions in 
other parts.

My guess is that they simply screwed up the restriction when they added 
the respective one previously missing for datatype specifications - 
somebody accidentally did a cut&paste instead of copy&paste.


> MLton implements a fixed version of rule 28 where "Clos" is replaced
> by "Clos_C", i.e. free type variables aren't closed over, just as in
> rule 15.

That does not seem to be what MLton really implements, because it 
correctly rejects the non-local example as well, where 'a does not 
appear in C (note that this would not be captured by the side conditions 
in rule 87 either, because the closed environment would not contain 'a 
free).

As far as I can see, specifying the more liberal rules correctly 
actually requires changing the Definition to do proper binding analysis 
for type variables, instead of vacuously accepting them in type 
expressions and verifying there use after the fact. In particular, this 
implies a change to rule 44, requiring alpha to be in U of C, and to all 
rules that bind type variables, in order to extend U appropriately. I 
reckon that is what MLton does internally. (At least it is what the 
Alice compiler does. And I never really understood why the Definition 
uses such a fragile approach instead.)


> I think MLton's behavior is reasonable (and sound!), given the
> inconsistency in the Definition.

Agreed.

> BTW, Stefan Kahrs wrote "Mistakes and Ambiguities".  Andreas wrote
> "Defects in the Revised Definition" -- and in it notes that the
> restriction on core datatype declarations was removed in going from
> the 1990 to the 1997 Definition.

Indeed.

-- 
Andreas Rossberg, rossberg@ps.uni-sb.de

Let's get rid of those possible thingies!  -- TB