[MLton-devel] IBM Research PL day

Wed, 8 May 2002 08:47:29 -0400 (EDT)

> >         Perry Cheng, IBM Research:
> >         A Defragmenting, Mostly Non-moving Garbage Collector

Here are the notes I took during Perry's talk yesterday.

- Goal
  + real-time GC
  + uniprocessor ==> incremental
  + small memory footprint
- Previous Work on RTGC
  + copying -- typical heap ~= 3 to 4X live data
  + mark-sweek -- fragmentation can be unbounded
- fragmentation: available but unusuable memory
  + (standard diagram)
- incremental compaction
  + redirection
    * consistency via read barriers
      # eager: variables contain pointers to replica
      # lazy: variables contain pointers to either copy
    * collector responsibility
      # eager: must update stack refs to point to replica
               incremental via stacklets
      # lazy: do nothing
    * eager "better" because of dynamic counts
    * lazy avoids potentially unnecessary dereference
    * null references: CondRedirect
    * optimizing (eager) barriers by sinking
      # combine implicit null check with conditional redirect
    * optimizing (lazy) barriers by CSE
- fragmentation
  + heap divided into 16K blocks (on linked free lists);
    each block has a slot size x and divided into 16K/x slots
    large arrays not segmented (no further discussion of arrays)
  + types of fragmentation
    * internal --> wasted slot
    * block internal --> wasted slots at end of block
    * external --> partially utilized blocks
  + controlling internal & block fragmentation
    * choose block size and slot sizes
      geometric progression for slot size based on internal 
      fragmentation rate;
      (this went by a little fast; essentially, what I think is going on
       is that as time goes on, each block becomes a container for smaller
       and smaller objects.  Note, the internal fragmentation rate is set
       from the outside, not computed dynamically.)
  + controlling external fragmentation
    * sort blocks by occupancy
    * move objects out of low occupancy blocks
    * fix references
    * return empty blocks to free list
  + measuring fragmentation
    * distinguish between recently dead and mummified
      (i.e., with an incremental GC, only fair to consider objects that
       were marked dead during last full heap trace)
- Results
  + worst case fragmenation never realized on Java SPEC benchmarks
  + memory footprint: 1.1 to 1.4X live data
  + read barriers practical with optimizations

Something Perry didn't make clear from the beginning (although it came up
briefly with the discussion of null references) is that the target
system is for (embedded) Java.  This was made more explicit when John
Reppy asked about the space cost of the redirection pointer; for ML where
most objects are cons cells of 2 words, an extra word for objects is a big
deal; for Java where most objects are 10 to 20 words, the extra word isn't
a big deal.  It was pointed out that you can play games with Java object
headers and not pay too much extra.

Some general comments:
 the uniprocessor goal makes this slightly different than the parallel,
   real-time GC
 it's really important to make allocation and redirection explicit
   early in the compilation so they are available for optimizations;
   otherwise, the read-barrier is much too expensive
 prospects for MLton: the smaller memory footprint seems really attractive
                      if we ever get around to an GTK or OpenGL interface,
                        long GC pauses might be noticable
                      redirection costs might be high, both space and time
                      Perry noted that the implementation complexity
                        wasn't much higher than other GC (although, given
                        the number of GCs he's written, that might not
                        be a fair analysis); the real issue was getting
                        the compiler to always indirect through the
                        redirection pointer

_______________________________________________________________

Have big pipes? SourceForge.net is looking for download mirrors. We supply
the hardware. You get the recognition. Email Us: bandwidth@sourceforge.net
_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel