[MLton] free type variables in datatype decs

Andreas Rossberg rossberg@ps.uni-sb.de
Fri, 04 Feb 2005 14:14:28 +0100


Stephen Weeks wrote:
>>In fact, that is not the case: exception declarations may contain free type
>>variables, as long as they do not propagate to the toplevel.
> 
> I agree that the rules allow this.  However, as with your other
> example, MLton rejects such declarations as inconsistent with the
> intent of section 4.6 -- they have no val dec to scope the type
> variables.

So you suggest that the intent of 4.6 is that a program may be rejected 
on the sole ground that the rewriting rules given there fail to 
introduce binders for some free variables?

I think this is really stretching the interpretation much too far. 
Section 4.6 only describes a program transformation to be applied before 
elaboration. It does nowhere say that the transformation function was 
incomplete and may require rejecting some programs.

Btw, the example in the comment to rule 20 strongly hints at such 
exception bindings being considered legal by the authors.

So, while I agree with most of your sentiment about how SML *should be* 
defined, it seems very obvious to me that this is not how it actually 
*is* defined. On the other hand, I have no problem with a compiler 
allowing scoped tyvars like MLton does, but I think they clearly 
represent a (marginal) language extension.


>>I have another, deeper technical argument: allowing val-bound type
>>variables in type declarations destroys a very strong design
>>invariant of SML, namely that the denotation of an explicit type
>>declaration never depends on any implicitly passed type information.
> 
> I'm afraid I can't get much meaning out of this.  Consider the
> following two fragments
> 
> A.
>    type u = int
>    type t = u * u
> 
> B.
>    fun 'a f (x: 'a) =
>      let 
>         type t = 'a * 'a
>      in ...
> 
> In what way does the definition of t in (B) depend on "implicitly
> passed type information" more than the definition in (A)?  Both simply
> refer to types (which are in scope).

Well, I was referring to dynamic type information. Maybe an example 
makes this clearer. In Alice ML you can write an expression like (I'm 
simplifying a bit)

   val p = pack 5 : int

Here, "pack" is a keyword that evaluates to a value of the built-in type 
package, which essentially is a dynamic value. I.e. it contains a value 
along with the dynamic representation of its type (more generally, a 
module and its signature). The inverse operation

   unpack p : ty

verifies dynamically that the runtime type found in p matches the static 
annotation.

Now, the type information for the package must be constructed at 
runtime, basically by "evaluating" the type expression. Since it can 
refer to other type constructors, we have to compute dynamic 
representations for all type bindings in a program. This is fine, since 
there are relatively few. And it allows runtime types to be passed to 
functors, for example.

But what happens if we allowed free tyvars in type bindings?

   fun 'a f (x : 'a) =
       let
	  type t = 'a
       in
           pack x : t
       end

We would need the type 'a is bound to dynamically, so we could no longer 
do *any* type erasure (in other words, we completely loose 
parametricity, which currently is only lost for modules).

The issue is already apparant in the language definition. I have 
specified the semantics of packages as a diff to the Definition, and it 
was very straight-forward (I can make it available if you want). If we 
allowed the code above, I could no longer do that in the given framework 
(or any variation of it), because its core dynamic semantics are 
type-erased as well.

> How about the following fragment.
> 
> C.
>    functor F (type a) =
>      struct
>         fun f x = 
>            let 
>               type t = a * a
>            in ... end
>      end
> 
> This is very similar to (B) -- I've just lifted the polymorphism up to
> the module level.  This is valid SML, and again I don't see how the
> type information in (B) is any more implicit.

In (B) you don't have to (and cannot) supply the type explicitly when 
applying f. That's precisely the difference. On the module level, all 
type information is explicit: type bindings are represented in the 
dynamic semantics of SML, but polymorphic type variables of value 
bindings are not.

Note that it is actually impossible in SML to compute all implicit type 
information dynamically, because it may require access to types that are 
out of scope (related to the fact that SML generally has no syntactic 
principal signatures). By relying on explicit type information 
exclusively for the dynamic semantics, this does not arise as a problem.

> I'm treading on thin-ice understanding here, since MLton's ILs are
> typeful.  But, it seems like type erasure basically means using a
> universal type.  If that's the case, then why can't you simply replace
> uses of 'a with that universal type when you erase its binding
> occurrence.

Because dynamic type checking with unpack has to be able to discover the 
proper type. We truly have type-dependent evaluation, so abstraction 
tricks do not apply.

OK, as I said, all of this is not directly relevant to MLton, but for me 
the strict separation of polymorphism and type declarations is an 
important point in SML's design. But this probably has gone off-topic by 
now, so I'll stop here.

-- 
Andreas Rossberg, rossberg@ps.uni-sb.de

Let's get rid of those possible thingies!  -- TB