[MLton] Memory problems with latest CVS?

Matthew Fluet fluet@cs.cornell.edu
Fri, 3 Sep 2004 19:29:05 -0400 (EDT)


> I just did a bootstrap from clean sources on a 1G Cygwin machine with
> no problems.  There were definitely mark-compact GC's going on.  The
> bootstrap compiler was MLton 20040227.  When I tried to bootstrap
> using MLton 20040819, I encountered a gc performance problem similar
> to what Matthew and Brent saw.  That performance problem was in the
> second round of self compilation, so it is either due to a bug in
> 20040819 and its runtime or a space leak in the CVS head.  My money's
> on the former.

My guess is that it is a runtime/GC problem, but I'm not ruling anything
out.  I've been seeing some truly bizarre behavior.

Here are a few things I've observed, but I don't know where to look next.
For lack of a better term, I'll call it a space leak.  Essentially,
watching a self-compile with gc-messages, I'll observe a situation where
the old gen size will climb to 99%.  Certainly the pattern of memory usage
does not look anything like the gc-messages from mlton-20040227.  I've
rigged things so that the compiler is linked with mlton-gdb.a, and
observed the same behavior without seeing any assertion failures.

I've currently got a very strange situation.  I have two identical source
trees; the only difference between them is the path names.  One is at
~/mlton/mlton.cvs.HEAD and the other is at ~/mlton/mlton.cvs.HEAD.buildee.
The former compiles without problem with:
mlton @MLton gc-summary gc-messages fixed-heap 640M -- -verbose 3 mlton.cm
The latter exhausts memory with:
mlton @MLton gc-summary gc-messages fixed-heap 780M -- -verbose 3 mlton.cm
How an extra 8 bytes in the path names can require more than an additional
140M of heap is beyond me!

I don't see any way of telling the runtime to never use mark-compact,
although that is an experiment I'd want to try.  I tried bootstrapping
from mlton-20040227 with the second round compiling with -drop-pass
refFlatten -drop-pass deepFlatten, and I don't seem to see the same
problems with further self-compiles, but I haven't tested things as
extensively in this situation.  I don't know if that indicates a
flattening bug or only that it results in significantly different heap
layouts that don't trigger the GC bug.