[MLton] Question about Definition of SML :)

Stephen Weeks MLton@mlton.org
Fri, 28 Nov 2003 12:02:59 -0800


> I have a question which is not due to MLton - but maybe you guys
> know this. Especially maybe Matthew knows after implementing
> the front-end? :)

Actually, I implemented the new front end.

> Anyway - I'm not sure about the purpose of the restriction:
> 
> tynames VE \subseteq T of C
> 
> in the rulle (26) p. 24 in the Definition of SML.

The side conditions on rules 4, 14, 17, 19, and 26 are intended to
force type inference to choose new names for types declared in
datatype declarations.  There is some explanation for rules 17 and 19
in section 5.1 of "The Commentary on Standard ML" (which refers to the
analogous rules 19 and 20 of the old definition).  The side conditions
on rules 4, 14, and 26 were added to the new definition based on
observations made by Stefan Kahrs in section 9 of "Mistakes and
Ambiguities in the Definition of Standard ML"
(http://www.cs.kent.ac.uk/pubs/1993/569/index.html).

> I can see the purpose of the limitation:
> 
> tynames \tau \subseteq T of C
> 
> in rule (4) p. 21. It prevents that one returns
> datatypes out of the scope where they are defined.

Right.  The example Kahrs gives is

   let
      datatype A = C of bool -> bool
   in
      fn C x => x true
   end
   let
      datatype B = C of int
   in
      C 2
   end

Without the side condition on 4, the above program would be accepted,
because the type inference engine could choose the same type name for
A and B.  All SML implementations reject this program, but for
different reasons.  Hamlet, Poly/ML, Moscow ML and the ML Kit
implement the side condition, causing both of the let expressions to
be rejected separately.  MLton and SML/NJ do not implement the side
condition, and rely on the fact that the type inference engine chooses
new type names for the datatypes, and hence the unification for the
application fails.  Consequently, MLton and SML/NJ incorrectly accept
expressions like the following.

   let
      datatype A = C of bool -> bool
   in
      fn C x => x true
   end

There is no soundness reason for rejecting this expression -- it is
rejected only due to an artefact of how type generativity is
"implemented" in the definition.  Personally, I think the definition
should have been written to explicitly maintain type name sets and
more directly enforce generativity, which would allow the above.

In practice, I don't know how much it matters.  I've never heard of
any examples where someone was porting code from SML/NJ to another
compiler and ran into the restriction.  I'd be interested to hear of
any.

If pressed, I could probably figure out a way to add the side
condition to MLton.

> As far as I can see, the restriction:
> 
> tynames VE \subseteq T of C
> 
> in rule (14) p. 23 does the same to ensure that
> one doesn't unpack an exception with a datatype
> value which was raised out of scope from where
> it was declared.

As far as I know, this side condition and the one on rule 26 are
completely unnecessary from a soundness point of view.  That is, all
the programs they cause to be rejected will run without "going wrong"
in the type soundness sense.  Could you please explain if you have an
example for 14 that shows otherwise.

The consequence of this rule is briefly discussed in the example on
the bottom of page 4 in section 4 of "Defects in the Revised
Definition of Standard ML"
(http://www.ps.uni-sb.de/hamlet/defects.pdf).  I agree with his
conclusions there.  Only the ML Kit of all the SML implementations
rejects the following program as required by the side condition on
rule 14.

val _ =
   let
      val x = ref []
      datatype a = B
      val _ = x := [B]
   in
      ()
   end

> However, isn't the rule (26) redundant then? The
> restriction should be there already because of rule (14)
> (because an exception will have to go through a handle clause
> to re-enter the program).
> Or is there something special about the recursive
> binding? And in case it's not redundant, why not?

I agree with you that the side condition on rule 26 is redundant, but
only due to a subtle point (possibly a mistake).  If the C + VE on the
top had been written with a circled plus instead of a plain plus, then
the side condition would not be redundant.  With a circle plus, the
context used to elaborate the valbind could include type names defined
by the valbind, so if the side condition weren't there then the
following program would be accepted.

val rec f =
   fn x =>
   let
      datatype A = C of bool -> bool
   in
      fn C x => x true
   end

There are two ways to correct the rule: remove the side condition, or
change the circle to a circle-plus.

> And shouldn't it be in rule (25) as well then?

Adding the side condition on rule 25 is redundant because it is
already there on rule 14, and there is no circularity to work around
as in rule 26.

> More concretely: I would very much like to see a
> program which fails to type-check due to the
> limitation in rule (26).

If you change the plus in 26 to a circle plus, then the val rec above
will be rejected due to the side condition.