[MLton] regions

Stephen Weeks MLton@mlton.org
Fri, 1 Oct 2004 15:58:38 -0700


I may be too biased by so much successful use of the notions of
liveness and space safety to see much else.  But here goes ...

> The whole notion of space safety is basically a specification for
> when a GC ought to collect data. All automatic storage management
> systems are conservative in some way.

Agreed.

> However, I could imagine a system which is not strictly safe for 
> space but at the same time useful from a practical standpoint.
...
> Space safety is just one of many arbitrary properties for storage
> recycling.

I agree that you have more flexibility in choosing how to do storage
management if you're designing a language and are allowed to change
it and the type system.  But once you fix a language, things are not
so arbitrary.  The key point is not just to come up with a
specification for when a GC ought to collect data.  The system must
also:

1. be practical
2. be understandable to the programmer in terms of a source program in
   the language 
3. provide the programmer with a way to limit memory usage in the 
   case that automation fails.

> I think it's fair to say that regions are often conservative in
> more annoying ways than a space safe GC.

Right.  This is one reason why I think the choice between GC and
regions is not arbitrary, if one wants a storage management system for
an SML compiler.  Regions fail my criterion (1).

Regions also fail my criterion (2).  They are difficult for a
programmer to understand in terms of the source language.  Why?
Because regions use a whole-program analysis to produce annotations to
the program.  On the other hand, space safety is defined by a local
syntactic condition.  One can give rules that define at each point in
the source program what variables are live, and that is all a
programmer needs to understand.

Regions also fail my criterion (3).  What can one do if something ends
up in the global region (either by necessity or by conservatism in the
analysis)?  With space safety, one can rewrite the code so that the
data is no longer live.
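
To make the last two points concrete, here is a minimal sketch of a
backward liveness rule for straight-line code, written in Python only
for illustration.  It is a simplification and not MLton's definition
of space safety.  The statement encoding, the function name live_sets,
and both example programs are hypothetical.  The second program is the
first one rewritten so that the last use of `table` comes before the
long-running call.

```python
# Each statement is (defined_variable, set_of_used_variables).
# The encoding and both programs below are hypothetical illustrations.


def live_sets(stmts, live_at_end=frozenset()):
    """Return the set of variables live just before each statement.

    Backward rule for straight-line code:
        live_before = uses | (live_after - {defined})
    """
    live = set(live_at_end)
    result = []
    for defined, uses in reversed(stmts):
        live = set(uses) | (live - {defined})
        result.append(frozenset(live))
    result.reverse()
    return result


# Version 1: `table` is used again after the long-running call,
# so it must be kept for the whole duration of that call.
version1 = [
    ("table", {"n"}),              # 0: val table = build n
    ("total", {"table"}),          # 1: val total = sum table
    ("unit", {"total"}),           # 2: val () = longRunning total
    ("len", {"table"}),            # 3: val len = length table
    ("result", {"total", "len"}),  # 4: (total, len)
]

# Version 2: the last use of `table` is moved before the call,
# so `table` is dead while the call runs.
version2 = [
    ("table", {"n"}),              # 0: val table = build n
    ("total", {"table"}),          # 1: val total = sum table
    ("len", {"table"}),            # 2: val len = length table
    ("unit", {"total"}),           # 3: val () = longRunning total
    ("result", {"total", "len"}),  # 4: (total, len)
]

live1 = live_sets(version1)
live2 = live_sets(version2)

# Live across the call = live just before the statement after it.
table_live_across_call_v1 = "table" in live1[3]
table_live_across_call_v2 = "table" in live2[4]
```

Under this rule, `table` is live across the call in version 1 and dead
across it in version 2.  The check needs only the text of the function
itself, with no whole-program information.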

I'm all for whole-program analysis :-).  But not when it is common to
feed the output to the programmer so he can understand the performance
of his program.