MLton.size bug

Stephen Weeks MLton@sourcelight.com
Fri, 23 Mar 2001 10:21:57 -0800 (PST)


> What is the MLton.size bug?  If it's not
> too difficult, I'll try and get that one too, as I'll probably be
> modifying the runtime anyways.

The MLton.size bug is not a runtime bug.  It is due to an interaction with the
useless analysis that sometimes (I have not had the time to figure out when)
notices that the argument to MLton_size is useless and removes it, thus causing
the size to be zero.  As an example, if you run mlton -v3, you will notice that
the size of the hash table is always displayed as zero.

The problem is in the semantics of MLton.size.  Consider the following code.

val l1 = [...]
val l2 = [...]
val x = (l1, l2)
val n = MLton.size x
val m = List.length (#1 x)

Assume m and n are useful in the rest of the code, but that l2 is not.  What is
the right size to report for x?  Zero, the size of l1, or the size of l1 + l2?
Notice that if MLton does its usual shrink reduction, it will produce the
following.

val l1 = [...]
val l2 = [...]
val x = (l1, l2)
val n = MLton.size x
val m = List.length l1

Now the only reason the useless analysis would mark x as useful is that it is
used by MLton.size.  But I would like MLton.size to be "transparent" to any
analyses, so maybe the right thing to do in this case is to report a size
of zero.

I really dislike the hashtable size getting reported as zero, but I also don't
want MLton.size to cause fields of data structures that otherwise would have
been marked as useless to be marked useful.
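
To make the three candidate answers concrete, here is a rough sketch in plain
SML.  It does not call MLton.size at all.  The function cells is a made-up
stand-in that charges one unit per cons cell and ignores the tuple and any
object headers, so the numbers only illustrate the choice, not MLton's actual
object layout.

```sml
(* Hypothetical cost model: one unit per cons cell.  Not MLton's real layout. *)
fun cells (l : int list) : int = List.length l

val l1 = [1, 2, 3]
val l2 = [4, 5]

(* Three candidate answers for the size of x = (l1, l2). *)
val transparent = 0                     (* MLton.size keeps nothing alive *)
val usefulOnly  = cells l1              (* count only what other code uses *)
val everything  = cells l1 + cells l2   (* count the tuple as written *)

val () =
   print (String.concatWith " "
             (List.map Int.toString [transparent, usefulOnly, everything])
          ^ "\n")
```

Under that toy model this prints 0 3 5.  The zero reported for the hash table
under mlton -v3 is the first of these answers.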

Another problem is that the primitive MLton.size is of type 'a ref -> int.  The
ref is there so that we guarantee that the runtime system gets a pointer.  The
basis library code to implement MLton.size is in
src/basis-library/mlton/mlton.sml.

fun size x =
   let val refOverhead = 8 (* header + indirect *)
   in Primitive.MLton.size (ref x) - refOverhead
   end

Of course, the ref that is allocated here is completely useless.  The analysis
in useless.fun marks enough of the argument to MLton_size as useful so that the
type doesn't change (see the calls to allOrNothing).  But that sometimes leaves
everything but the ref as useless.

Anyways, MLton.size clearly needs a little thought.  Hopefully there is
something simple to be done.

One other possibility is to add enough "reflective" primitives to the language
so that MLton.size (and other things) could be written in SML.  That would be
cool, but may be an orthogonal issue.