[MLton] free type variables in datatype decs

Matthew Fluet fluet@cs.cornell.edu
Thu, 3 Feb 2005 15:39:32 -0500 (EST)


> Currently, I lean slightly toward accepting this construct, but I
> could go either way.  I don't think the Definition makes the choice
> clear, and there are other arguments on both sides.  Here's what I
> see.
> 
> Allow free variables in datatypes:
> 
>  + MLton 20041109 does it, so taking it out might break existing
>    programs.  
>  + Because local datatypes are allowed, programmers will naturally use
>    it, and wonder why it's not there if it's not.

Disallowing free variables in datatypes does not take away local 
datatypes.

>  + Simple semantics, understandable to programmers.

I think the stance the Definition takes is that datatype and type decs are
distinctly different from val decs.  For example, the Definition disallows
shadowing of explicit type variables in val decs, but not in datbinds or
typbinds:

(* Rejected *)
val 'a f = fn x =>
  let val 'a g = fn (x : 'a) => x
  in g x
  end

(* Accepted *)
val 'a f = fn x =>
  let datatype 'a g = G of 'a
  in case G x of G x => x
  end

(* Accepted *)
val 'a f = fn x =>
  let type 'a t = 'a
  in x : 'a t
  end

Conversely, the Definition (and MLton) will infer a scope for explicit 
type variables in valbinds but not in datbinds or typbinds:

(* Rejected *)
datatype g = G of 'a

(* Rejected *)
type t = 'a

(* Accepted *)
val g = fn (x : 'a) => x


I guess what I find potentially confusing about MLton's behavior is that 
it scopes type variables from {val,ex,typ,dat}binds only at {val}binds.  
For example, in the Definition and MLton,

val f = fn (x : 'a) =>
  let val g = fn (x : 'a) => x
  in g x
  end

is equivalent (can be transformed) to

val g = fn (x : 'a) => x
val f = fn (x : 'a) =>
  let
  in g x
  end

but, while these are accepted by MLton, 

val f = fn (x : 'a) =>
  let datatype g = G of 'a
  in case G x of G x => x
  end

val f = fn (x : 'a) =>
  let type t = 'a
  in x : t
  end

they cannot be transformed to

datatype g = G of 'a
val f = fn (x : 'a) =>
  let
  in case G x of G x => x
  end

type t = 'a
val f = fn (x : 'a) =>
  let
  in x : t
  end

The fact that one is forced to perform manual type var binding when 
lifting a datatype or type declaration but not a val declaration is a 
source of non-uniformity.

Of course, the related transformation with an exception binding, say

val f = fn (x : 'a) =>
  let exception E of 'a
  in (raise E x) handle E x => x
  end

is not allowed.


All of this to say that I don't think any solution gives rise to a uniform 
treatment and I'm not convinced that MLton's solution is a "simple 
semantics" that is any more understandable than the HaMLet syntactic 
restriction.


A very different solution (and consequently language) would be to
uniformly apply the last bullet of Section 2.9 Syntactic Restrictions to 
all tyvarseq binding forms and the Section 4.6 Scope of Explicit Type 
Variables behavior to all binds.  Namely, a free type variable in 
{val,ex,dat,typ}bind gets scoped at the declaration in a program if (1) 'a 
occurs unguarded in this declaration, and (2) 'a does not occur unguarded 
in any larger (value, since a {val,ex,dat,typ}bind may only be nested in a 
valbind) declaration containing the given one.

For example,

datatype g = G of 'a
==>
datatype 'a g = G of a'

(* Rejected, due to shadowing *)
val 'a f = fn x =>
  let datatype 'a g = G of 'a
  in case G x of G x => x
  end


val f = fn x =>
  let datatype g = G of 'a
  in case G x of G x => x
  end
==>
val f = fn x =>
  let datatype 'a g = G of 'a
  in case G x of G x => x
  end

val f = fn (x : 'a) =>
  let datatype xyz = X of 'a | Y of 'b | Z of 'c
  in ...
  end
==>
val 'a f = fn (x : 'a) =>
  let datatype ('b, 'c) xyz = X of 'a | Y of 'b | Z of 'c
  in ...
  end

In the last two examples, MLton (and I presume MoscowML) floats the free
type variables from the datatype out to the val binding.

I'm not suggesting this as behavior MLton (or any other SML compiler) 
should adopt, but simply put it out there as one way of applying uniform 
behavior to all tyvar binding forms.