[MLton] GC help

Stephen Weeks MLton@mlton.org
Sun, 14 Aug 2005 09:35:07 -0700


> My applications have lots of live data and use lots of side effects.
> I would like to understand the effect that different G.C. policies
> might have on the running time.

There is a little bit of info at

  http://mlton.org/GarbageCollection

The first thing to try is running your application with 

  @MLton gc-summary --

This will tell you how much time is spent in GC, and break the
collections down by type so you know which kind are happening.
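
As a rough sketch of the invocation: the runtime consumes whatever
sits between @MLton and --, and your program sees the remaining
arguments as usual.  The wrapper name run_with_gc_summary and the
names ./myapp and input.dat are made up for illustration.

```shell
# Hypothetical wrapper (the name is illustrative, not part of MLton).
# Runs an MLton-compiled executable with GC statistics enabled.
run_with_gc_summary() {
    prog=$1
    shift
    # Runtime options sit between "@MLton" and "--"; "$@" is passed
    # through to the program untouched.
    "$prog" @MLton gc-summary -- "$@"
}

# Example (assumes ./myapp was built by MLton):
#   run_with_gc_summary ./myapp input.dat
```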

What matters is not so much the absolute amount of live data, but the
amount relative to the memory available.  By default, MLton is very
aggressive in grabbing memory so it can keep a high ratio (16) of heap
size to live data size.  This enables it to cost-effectively use
copying collection.  Only when there isn't enough memory available
does MLton switch to generational collection.  And only when memory
gets really tight does it switch to mark-compact collection.
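
To make the ratio concrete, here is a back-of-the-envelope sketch.
It only multiplies live data by the ratio of 16 mentioned above and
compares the result with the memory you expect to have.  The function
names are made up, and the real decision in the runtime involves more
than this one comparison; the thresholds for falling back to
generational or mark-compact collection are not modeled.

```shell
# Heap size MLton would like for a given amount of live data, using
# the ratio of 16 mentioned above.  Units are whatever you pass in
# (megabytes in the example below).
desired_heap() {
    live=$1
    echo $(( live * 16 ))
}

# Rough check: does the preferred heap fit in the memory you expect
# to be available?  A simplification, not the runtime's actual logic.
fits_copying() {
    live=$1
    avail=$2
    if [ "$(desired_heap "$live")" -le "$avail" ]; then
        echo "copying collection likely"
    else
        echo "expect generational or mark-compact"
    fi
}

# Example: 200 MB live with 2048 MB available
#   desired_heap 200        prints 3200
#   fits_copying 200 2048   prints "expect generational or mark-compact"
```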

If there's enough memory available, you should only see about 10-15%
of your time spent in GC, and no further tuning will be required.

> Is there a way to turn generational collector off completely (use just
> the copying collector)?

Yes, you can compile with 

  -mark-cards false

This causes the compiler to omit the card marking statements that it
would otherwise insert at pointer assignments in the mutator code.
Since the generational collector relies on card marking, it is thus
unable to run.  -mark-cards false has the added benefit of not
executing the card marking statements, although we've never measured
their cost at more than a percent or two.
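
If you want to measure the difference yourself, one way is to build
the program both ways and time the two executables.  A minimal
sketch, assuming a single-file program; the helper name build_both,
the MLTON override variable, and the output names are made up for
illustration.

```shell
# Hypothetical helper: build the same program with and without card
# marking so the two executables can be timed against each other.
# MLTON may be overridden; it defaults to the mlton on your PATH.
build_both() {
    src=$1
    base=${src%.sml}
    "${MLTON:-mlton}" -output "$base-default" "$src"
    "${MLTON:-mlton}" -mark-cards false -output "$base-nocards" "$src"
}

# Example: build_both app.sml   produces app-default and app-nocards
```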

Beware of carrying over expectations from the SML/NJ world, where the
GC seems to cost a lot and occasionally exhibits pathological behavior
(this from informal observations).  Card marking in MLton is very
fast, and the generational GC is too.  So even in the odd circumstance
when generational GC does kick in, you won't see the kinds of problems
you might with SML/NJ.

> In general, what are the ways to control G.C.? 

The documented controls are at

  http://mlton.org/RunTimeOptions

The undocumented controls are in processAtMLton() in runtime/gc.c in
the sources.

  http://mlton.org/cgi-bin/viewsvn.cgi/mlton/trunk/runtime/gc.c?view=markup

Relevant undocumented controls are:

  copy-generational-ratio
  grow-ratio
  live-ratio
  mark-compact-generational-ratio
  mark-compact-ratio
  nursery-ratio

You can use these to tune when (and whether) the GC switches between
different collection strategies.  If you want details on any specific
undocumented control and it isn't clear from gc.c, I can have a look
and figure out exactly what's going on -- but I don't know them off
the top of my head.
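
For experimenting, something along these lines should work.  This is
a sketch under one assumption that I have not checked against gc.c:
that each ratio control takes a single numeric argument, the way the
documented runtime options take theirs.  The wrapper name and the
value 4.0 are placeholders, not recommendations.

```shell
# Hypothetical wrapper: pass one ratio control to the runtime and
# also request the GC summary so the effect can be compared.
# ASSUMPTION: each control takes a single numeric argument; check
# processAtMLton() in runtime/gc.c to confirm.
run_with_ratio() {
    prog=$1
    name=$2
    value=$3
    shift 3
    "$prog" @MLton "$name" "$value" gc-summary -- "$@"
}

# Example (4.0 is a placeholder value):
#   run_with_ratio ./myapp live-ratio 4.0 input.dat
```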