[MLton-user] Profiling the live heap?

Thu May 5 06:32:14 PDT 2011

On Wed, May 4, 2011 at 10:13 AM, Wesley W. Terpstra <wesley at terpstra.ca> wrote:
> On Wed, May 4, 2011 at 3:19 PM, Matthew Fluet <matthew.fluet at gmail.com> wrote:
>> It might not be too difficult to tweak the garbage collector to
>> compute a histogram (indexed by object representation type) of the
>> live heap.
>
> This would be quite easy I think.

Agreed.
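
To make the idea concrete, here is a minimal sketch of such a
histogram, written in SML rather than in the runtime's C.  The
liveObject record and the histogram function are hypothetical names
for illustration; the real collector would walk the heap and read the
type index out of each object header instead of consuming a list.

```sml
(* A live object as a collector might see it; both fields are
   hypothetical stand-ins for data read from the object header. *)
type liveObject = {typeIndex : int, bytes : int}

(* Count live objects and live bytes per object type index. *)
fun histogram (numTypes : int, live : liveObject list) =
  let
    val counts = Array.array (numTypes, 0)
    val sizes = Array.array (numTypes, 0)
    fun bump ({typeIndex, bytes} : liveObject) =
      ( Array.update (counts, typeIndex, Array.sub (counts, typeIndex) + 1)
      ; Array.update (sizes, typeIndex, Array.sub (sizes, typeIndex) + bytes) )
  in
    List.app bump live;
    (counts, sizes)
  end
```

A type index outside 0 .. numTypes-1 raises Subscript in this sketch.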

>> One difficulty with associating fine-grained allocation information
>> with objects is that the object header reserves 19 bits for the index
>> into the object types table.  2^19 is (supposedly) more than
>> sufficient for the number of different representations of objects
>> within a single program, but it probably falls short of the number of
>> different allocation sites in a large program.
>
> Note, however, that on a 64-bit machine we have 32 otherwise wasted bits in
> the object header.

True.  Though, I would be wary of a feature that worked exclusively in 64-bit.
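
For reference, a rough sketch of how a 64-bit header could carry both
fields.  The layout used here (bit 0 set, type index in bits 1..19,
allocation site in the upper 32 bits) and the names Header, make,
typeIndex, and site are assumptions made for illustration, not the
runtime's actual definitions.

```sml
(* Hypothetical 64-bit header layout, for illustration only:
     bit 0        : always 1
     bits 1..19   : object type index
     bits 32..63  : allocation site index (the otherwise spare bits) *)
structure Header =
struct
  (* 2^19 - 1: mask for a 19-bit type index *)
  val typeMask : Word64.word = Word64.- (Word64.<< (0w1, 0w19), 0w1)

  fun make {typeIndex : Word64.word, site : Word64.word} : Word64.word =
    Word64.orb
      (Word64.orb (0w1, Word64.<< (Word64.andb (typeIndex, typeMask), 0w1)),
       Word64.<< (site, 0w32))

  fun typeIndex (h : Word64.word) : Word64.word =
    Word64.andb (Word64.>> (h, 0w1), typeMask)

  fun site (h : Word64.word) : Word64.word = Word64.>> (h, 0w32)
end
```

A 32-bit header has no room for the site field under this layout,
which is the portability concern.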

> Also, allocation sites != allocated type, due to
> polymorphism. So the object type might not be what you want.

It's true that polymorphism will result in multiple object types
sharing the same allocation site.  But, the point of my suggestion of
tying allocation site to object type is that the object type is data
that is already associated with every heap object.

> Does MLton currently combine types with identical structure into a single
> structure?

It can certainly combine types that were originally distinct, but are
optimized to the same type, into a single intermediate representation
type.  For a simple example:

  val x : Word32.word * Word32.word = (0wx1, 0wx2)
  val y : Int32.int * Int32.int = (1, 2)

These two tuples have distinct SML types, but are internally
represented by the same type.  Indeed, I think the second tuple would be
common-subexpression-eliminated.

More interesting is when MLton determines that components of tuples
are unused, so the intermediate representation types might become
equal.

However, it is true that currently two datatypes like:
  datatype t = A of int | B of bool
  datatype u = C of bool | D of int
will induce four different object types, one for each variant.  But, I
actually think that is a "bug", because object types just care about
the representation (size, bytes of non-pointers, number of pointers),
so the above types should share the object types.
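
A minimal sketch of the sharing I have in mind: intern each variant's
representation in a table, so that equal representations get the same
object type index.  The rep record, the intern function, and the byte
counts below are made up for illustration (size is left out, assuming
it follows from the other two fields); they are not what the compiler
actually computes for t and u.

```sml
(* Representation of an object type; field values below are made up. *)
type rep = {bytesNonPointers : int, numPointers : int}

(* Return the index of r in the table, appending it if absent. *)
fun intern (table : rep list, r : rep) : rep list * int =
  let
    fun find (_, []) = NONE
      | find (i, x :: xs) = if x = r then SOME i else find (i + 1, xs)
  in
    case find (0, table) of
        SOME i => (table, i)
      | NONE => (table @ [r], length table)
  end

val intRep : rep = {bytesNonPointers = 4, numPointers = 0}   (* assumed *)
val boolRep : rep = {bytesNonPointers = 1, numPointers = 0}  (* assumed *)

val (t0, idxA) = intern ([], intRep)    (* A of int  *)
val (t1, idxB) = intern (t0, boolRep)   (* B of bool *)
val (t2, idxC) = intern (t1, boolRep)   (* C of bool *)
val (t3, idxD) = intern (t2, intRep)    (* D of int  *)
```

Under these assumptions A and D share one index and B and C share
another, so the table ends with two entries instead of four.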

> If it does combine types, can this be turned off?

Not completely, for reasons noted above --- by the time that
representations are computed, some distinct source types will have
become equal intermediate representation types.

> I am imagining an optional
> output table which records the same type name you would see in a MLton error
> message for each object type.

Maybe, but as you note, the source level type for a particular
allocation might be expressed as a polymorphic type, not a ground
type.

> My proposal would then be:
> 1) Add an object type index -> textual type name table
> 2) Add a new field 'allocation site' that uses some of the spare bits in 64-bit object headers.
> 3) Add a table that maps allocation site index -> source file location
> 4) Add a runtime option to increment counters on GC against these tables
> 5) Provide some mechanism to print out the live heap histogram information, broken

2 and 3 are relatively straightforward, although a slight variation on
this proposal would be to add the allocation site index to the object
type, inducing a distinct object type for every allocation site and
simply use more bits for the object type index with 64-bit headers.
(The advantage of this is that "small" programs that don't have more
than 2^19 representation types and allocation sites could still be
profiled under 32-bits.)
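
The arithmetic behind that parenthetical, as a small sketch.  The
names bitsNeeded and fitsIn19Bits are hypothetical; n is the total
number of distinct object type indices after one has been induced per
allocation site.

```sml
(* Number of bits needed to give n items distinct indices 0 .. n-1,
   that is, the ceiling of log2 n. *)
fun bitsNeeded (n : int) : int =
  if n <= 1 then 0 else 1 + bitsNeeded ((n + 1) div 2)

(* True when n indices fit in the 19-bit field of the object header. *)
fun fitsIn19Bits (n : int) : bool = bitsNeeded n <= 19
```

So a program with up to 2^19 = 524288 such indices still fits, and
one more index pushes it to a 20th bit.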

1 is a bit more difficult, because one would need to thread the
front-end type all the way through the computation.  Perhaps a little
easier would be to annotate each allocation site with its
source-location/source-type.
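
A sketch of what that annotation table and the resulting report might
look like.  The site record, the sites table, its two entries, and the
report function are all invented for illustration; in practice the
compiler would emit the table and the runtime would supply the
per-site counts gathered during a GC.

```sml
(* One entry per allocation site; the entries here are made up. *)
type site = {file : string, line : int, ty : string}

val sites : site vector =
  Vector.fromList
    [ {file = "a.sml", line = 10, ty = "'a list"},
      {file = "b.sml", line = 42, ty = "int * int"} ]

(* Render one line per site from counts indexed by site index. *)
fun report (counts : int vector) : string list =
  List.tabulate
    (Vector.length counts,
     fn i =>
       let
         val {file, line, ty} = Vector.sub (sites, i)
         val n = Vector.sub (counts, i)
       in
         file ^ ":" ^ Int.toString line ^ " (" ^ ty ^ "): " ^ Int.toString n
       end)
```

Note that the type recorded for the first site is polymorphic, which
is exactly the limitation mentioned above.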