[MLton] bool array size

Stephen Weeks MLton@mlton.org
Thu, 16 Mar 2006 13:47:00 -0800


> Using the 20051201 version of MLton I see that an array of bools takes 4
> bytes per entry (plus the array header).  Why isn't this 1 byte?  

This is a performance bug.  The problem is unique to booleans, and
doesn't show up with other enumerated types.  For example, the array
in the following code will have one byte per element.

----------------------------------------------------------------------
datatype t = A | B
val a = Array.tabulate (100, fn i => if i mod 2 = 0 then A else B)
val () =
   print 
   (case Array.sub (a, 0) of
       A => "A\n"
     | B => "B\n")
----------------------------------------------------------------------

Booleans are special because they appear in the FFI and hence their
external representation is fixed (4 bytes, 0 = false, 1 = true).
Unfortunately, the representation pass uses that same 4-byte layout
internally.  The right thing to do is to use a 1-byte representation
internally and to coerce to/from the 4-byte representation when
crossing the FFI.  I think it would be reasonably easy for someone to
put this fix into the representation pass, probably with a day or
two's work.
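
Until such a fix is in place, the same idea can be approximated in user
code.  The following is only a sketch, not the representation-pass
change described above: it stores each boolean in a Word8Array (one
byte per element) and coerces on every access.  The structure name
BoolBytes and the helpers toByte and fromByte are hypothetical names
chosen for illustration.

```sml
(* Hypothetical workaround sketch, not MLton's internal fix: keep booleans
   in a Word8Array (one byte per element) and coerce on every access. *)
structure BoolBytes =
struct
   type t = Word8Array.array

   (* bool -> 1-byte encoding: 0 = false, 1 = true *)
   fun toByte (b : bool) : Word8.word = if b then 0w1 else 0w0

   (* 1-byte encoding -> bool; any nonzero byte reads as true *)
   fun fromByte (w : Word8.word) : bool = w <> 0w0

   fun tabulate (n : int, f : int -> bool) : t =
      Word8Array.tabulate (n, toByte o f)

   fun sub (a : t, i : int) : bool = fromByte (Word8Array.sub (a, i))

   fun update (a : t, i : int, b : bool) : unit =
      Word8Array.update (a, i, toByte b)
end

val a = BoolBytes.tabulate (100, fn i => i mod 2 = 0)
val () = print (Bool.toString (BoolBytes.sub (a, 0)) ^ "\n")
```

The cost is a comparison on each read and a conditional on each write,
in exchange for a quarter of the element storage.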

> Is this part of the same problem that causes large array elements to
> always cause an extra indirection to get to the element?

I'm not sure what you mean here.