MLton.size bug

Matthew Fluet <fluet@CS.Cornell.EDU>
Thu, 26 Apr 2001 18:46:14 -0400 (EDT)


> The MLton.size bug is not a runtime bug.  It is due to an interaction with the
> useless analysis that sometimes (I have not had the time to figure out when)
> notices that the argument to MLton_size is useless and removes it, thus causing
> the size to be zero.  As an example, if you run mlton -v3, you will notice that
> the size of the hash table is always displayed as zero.

So, I looked into this a little bit.  The remove-unused pass is also
causing complications with MLton_size.  For example, consider the
following program (the _ffi's are just to force some output and also to
make the .cps program manageable):

val printInt = _ffi "printf" : string * int -> int;
val printString = _ffi "printf" : string * string -> int;

datatype t = T of int list

val l = List.tabulate(100, fn i => i)
val s_l = MLton.size l
val _ = printInt ("%i\n\000", s_l)
val x = T l
val s_x = MLton.size x
val _ = printInt ("%i\n\000", s_x)
val total = List.foldl (op +) 0 (case x of T l => l)
val _ = printInt ("%i\n\000", total)

I modified useless.fun to do a deepMakeUseful on all the arguments of a
MLton_size primitive, but the program still yields:

1200
0
4950

What I discovered was that remove-unused determines that the T
constructor is useless, and replaces it with a dummy constructor:

before removeUnused:
t_11 = T_10 of (list_1)
...
val x_1292: t_11 = T_10 (x_1285)
val x_1293: t_11 ref = Ref_ref(t_11) (x_1292)
val x_1294: int = MLton_size(t_11) (x_1293)

after removeUnused:
t_11 = dummy_0
...
val x_1292: t_11 = dummy_0 ()
val x_1293: t_11 ref = Ref_ref(t_11) (x_1292)
val x_1294: int = MLton_size(t_11) (x_1293)

Simplify types goes on to change the t_11 type to unit.
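
To make the effect concrete, here is a toy model in plain SML.  It is
not MLton's IR and not the real pass: the names value, size, collapse
and fromList are made up for illustration, and the size rules (12 bytes
per cons cell, nothing for the T wrapper itself) are assumptions picked
only so that the list comes out at the 1200 printed above.

```sml
(* Toy model only: not MLton's IR, not the real pass. *)
datatype value =
    Int of int
  | Nil
  | Cons of value * value          (* list cell: head, tail *)
  | Con of string * value list     (* constructor application *)

(* Assumed sizes: each cons cell costs 12 bytes; Int and Nil cost
   nothing; a constructor application costs only what its arguments
   cost (assumption: the wrapper itself is free). *)
fun size (Int _) = 0
  | size Nil = 0
  | size (Cons (h, t)) = 12 + size h + size t
  | size (Con (_, args)) = foldl (fn (a, n) => size a + n) 0 args

(* Mimics the rewrite in the dump above: a constructor application
   is replaced by a nullary dummy, dropping its arguments. *)
fun collapse (Con _) = Con ("dummy_0", [])
  | collapse v = v

fun fromList [] = Nil
  | fromList (x :: xs) = Cons (Int x, fromList xs)

val l = fromList (List.tabulate (100, fn i => i))
val x = Con ("T_10", [l])

val sizeL = size l                  (* 100 cells * 12 = 1200 *)
val sizeBefore = size x             (* 1200 under the assumptions above *)
val sizeAfter = size (collapse x)   (* 0: nothing left to traverse *)
```

Once the argument of T_10 is dropped, there is nothing reachable from
the value for a size traversal to count, which is consistent with the 0
in the output.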

Does this have to do with the fact that MLton_size is one of the few
polymorphic primitives?  I notice that remove-unused.fun specially treats
MLton_eq and MLton_equals -- presumably because a datatype whose elements
are only ever compared for equality cannot be eliminated by collapsing
the entire datatype into a single dummy constructor.
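
A small illustration of why I presume that (plain SML, not the pass
itself; color, collapsed and collapse are made-up names):

```sml
(* Elements of this datatype are never taken apart, only compared. *)
datatype color = Red | Green

(* What collapsing to a single dummy constructor would amount to. *)
datatype collapsed = Dummy
fun collapse (_ : color) = Dummy

val eqBefore = (Red = Green)                    (* false *)
val eqAfter = (collapse Red = collapse Green)   (* true *)
```

After collapsing, every comparison answers true, so the program's
behavior changes even though no constructor argument was ever inspected.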

Anyways, I tried mimicking the MLton_equals case, and that results in the
expected output for the above program, still using deepMakeUseful.  On the
other hand, if I don't use deepMakeUseful, then the second item is still
0, because useless changes the type of t_11 to t_11 = T_10 and sets
val x_1292: t_11 = T_10 ()
val x_1293: t_11 ref = Ref_ref(t_11) (x_1292)
val x_1294: int = MLton_size(t_11) (x_1293)

Not being too familiar with the useless pass, this is my take: MLton_size
shouldn't mark its arguments as useful, but it seems to me that when we
rewrite the program, we could change the type of the MLton_size call to
the useful-type of its argument and extract the useful components of the
argument.  Is this similar to what is done with arrays when some of their
components aren't useful?