[MLton] free type variables in datatype decs

Stephen Weeks MLton@mlton.org
Thu, 3 Feb 2005 15:52:37 -0800


> >  + Because local datatypes are allowed, programmers will naturally use
> >    it, and wonder why it's not there if it's not.
> 
> Disallowing free variables in datatypes does not take away local 
> datatypes.

Sorry, I didn't mean to take away local datatypes.  I meant that once
datatypes are local, it's natural for them to refer to other locally
bound entities (types, tyvars, ...).  For example, suppose a
programmer writes a function like

fun f (x: int) =
  let
     datatype t = T of int * int
     val z: int * int = (x, x)
     val _ = T z
  in 
     ...
  end

Then one day, he realizes f doesn't depend on the type of x and can be
generalized, so he writes

fun f (x: 'a) =
  let
     datatype t = T of 'a * 'a
     val z: 'a * 'a = (x, x)
     val _ = T z
  in 
     ...
  end

Disallowing the datatype declaration, while allowing the type
constraint on z, will (correctly) seem arbitrary to the programmer.
If 'a is in scope, why is he allowed to write 'a * 'a in one place and
not in the other.  Equally confusing would be the fact that the
following program is allowed.

fun f (x: int) =
  let
     type u = int
     datatype t = T of u * u
     val z: u * u = (x, x)
     val _ = T z
  in 
     ...
  end

In this program, as in the one before it, the right hand side of the
datatype refers to a type that is in scope.  To accept one and not the
other creates a bizarre distinction between type-variables-as-types
and other types.

BTW, a slight modification makes clear why we must treat type
definitions and datatypes the same way.  Suppose we enforced the
restriction on datatypes but not on type definitions.  Then one could
write

fun f (x: 'a) =
  let
     type u = 'a
     datatype t = T of u * u
     val z: u * u = (x, x)
     val _ = T z
  in 
     ...
  end

I am even more nervous about (re)imposing the restriction on local
types than I am datatypes, because local types have a trivial
semantics: substitution.  Also, local types don't cause any of the
hair that datatypes do with type constructors occurring outside the
scope of their definition.

> I think the stance the Definition takes is that datatype and type
> decs are distinctly different from val decs.

I agree, this does seem to be part of the problem.

> I guess what I find potentially confusing about MLton's behavior is that 
> it scopes type variables from {val,ex,typ,dat}binds only at {val}binds.  

Yeah.  I think Andreas has the right answer here.  The rules for
implicit scoping of tyvars only scope at valbinds because type and
datatype application are explicit, while type application of values is
implicit.  I guess one could make implicitly-scoped tyvars for
{typ,dat}binds consistent by having their application consist of an
implicit component (for the implicitly scoped tyvars) and an explicit
component (for the explicitly passed arguments), but that sounds messy.

Exceptions are a different story.  You can't scope there because it
would either be useless or unsound.  Consider if 

  exception E of 'a

gave E the type forall 'a: 'a -> exn.  Then, every occurrence of E
would generate a different exception constructor.  There's no way to
match up uses of E as a constructor with uses of E as a pattern
(except to be unsound).  The point is that an exception declaration is
nonexpansive (there's a ref cell hidden somewhere) and so it never
makes sense to scope tyvars there.

> The fact that one is forced to perform manual type var binding when 
> lifting a datatype or type declaration but not a val declaration is a 
> source of non-uniformity.

Not at all.  Closing over free type variables is exactly the same as
closing over free value variables.  Your examples missed the
connection because the values you lifted had no free value variables.
To modify one of your examples slightly, consider lifting g in

 val 'a f = 
   fn x: 'a =>
   let val g: 'a -> 'a = fn y: 'a => (x, y)
   in g x
   end

This requires closing over g's free type variable ('a) and its free
value variable (x), yielding

 val 'a g1 = fn x: 'a => fn y: 'a => (x, y)
 val 'a f = 
   fn x: 'a =>
   let val g: 'a -> 'a = g1 x
   in g x
   end

All of this is not as clear as it could be because of SML's syntax in
which type application is implicit.  But, there is an exact
correspondence between the type abstractions and applications and
value abstractions and applications in the lifting of g.

Similarly, to lift a type definition, one must close over all its
types that will not be defined in the new scope.  For example,

 val f = fn (x : 'a) => let type t = 'a * int in ...

becomes

 type 'a t1 = 'a * int
 val f = fn (x : 'a) => let type t = 'a t1 in ...

> All of this to say that I don't think any solution gives rise to a uniform 
> treatment and I'm not convinced that MLton's solution is a "simple 
> semantics" that is any more understandable than the HaMLet syntactic 
> restriction.

It's true that nothing could be simpler to understand on the surface
than a local syntactic restriction.  But I think that imposing that
restriction leads to unjustifiable behavior (to the programmer) in
other places.

I still don't see what's nonuniform about what MLton does (other than
the existing rule of implicitly scoping only at valbinds, which will
be there in any case).