[MLton] max-heap setting for 64-bit applications

Matthew Fluet matthew.fluet at gmail.com
Thu Dec 10 12:37:38 PST 2009


> The code below is an example that is supposed to allocate more
> and more memory until it is terminated.  While it is running,  it
> outputs roughly how much memory it has allocated so far.
>

You could run with "@MLton gc-messages --" to see the actual heap sizes as
the program runs.


> If I run the example with "@MLton max-heap 2950M --" run-time
> parameters then it keeps allocating memory,  the numbers printed
> by the example being roughly in sync with what Windows' Task
> Manager shows.   Around 3GB it is terminated with the message
> "Out of memory with max heap size 3,093,299,200",
> which seems like exactly what should happen.
>

You can use "@MLton max-heap 2.95G --" and "@MLton max-heap 3G --" as
shorter forms to express gigabyte-sized heaps.  That is, the
{max,fixed}-heap options accept a floating-point number (to express
fractional portions of a size) and the "G"/"g" size modifier.
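
A rough sketch of how such size arguments could map to bytes (this is an
illustration, not MLton's actual option parser; it assumes binary units,
M = 2^20 and G = 2^30, which matches the "max heap size 3,093,299,200"
reported for 2950M):

```python
# Hypothetical size-suffix parser; binary units are an assumption that
# matches the reported byte count for "2950M".
UNITS = {"k": 2**10, "K": 2**10, "m": 2**20, "M": 2**20, "g": 2**30, "G": 2**30}

def parse_size(s):
    # accept a floating-point number with an optional size modifier
    if s and s[-1] in UNITS:
        return int(float(s[:-1]) * UNITS[s[-1]])
    return int(s)

print(parse_size("2950M"))  # 3093299200
print(parse_size("3G"))     # 3221225472
```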


> However,  if I run the example with "@MLton max-heap 3000M --"
> then around the same point where it was stopped before
> (the program reports roughly 3.05GB of allocated memory), my
> system starts thrashing and I can see in the Task Manager
> that the program has allocated around 4.5GB of memory and is still
> trying to get more.
>
> Is this what you would expect?  I thought that the max-heap setting
> would include the amount of additional memory required for garbage
> collection.  Is that true?  Or do I always have to just allow
> half of the system memory for max-heap to make sure to not allocate
> too much memory?
>

You should run your program with "@MLton gc-messages --" to be sure, but I
believe that the behavior you are seeing is explained by the heap resizing
policy.  There is very little space overhead in the garbage collector and
runtime over that consumed by the ML heap.  So, the max-heap option only
bounds the size of the ML heap (which includes the card/cross map used for
generational garbage collection).  MLton will never attempt to create a heap
that is larger than the max-heap option.

However, MLton starts execution with a small heap and allows it to grow in
response to the demands of the live data.  After each garbage collection,
MLton decides whether to grow or shrink the heap in order to have the heap
size approximately 8x the live data at the end of the garbage collection,
further modified by the constraints of the available RAM, max-heap, etc.
MLton will never try to *allocate* a heap that is larger than the max-heap
option; however, in order to *obtain* a larger heap than the currently
allocated heap, MLton may be required to allocate the new (bigger) heap and
copy the data from the current heap over.  The idea is that the user would
like their program to continue running; suffering a little bit of paging
during the time that the two heaps are allocated is preferable to aborting
with an out-of-memory error.  After the copy, the old heap is discarded and
total memory usage is again bounded by max-heap.  (Note: This is *not* the
policy for the copying collection; that is, during a copying collection, we
do not have two heaps of max-heap size.  Instead, the semi-space size is
fixed so as to be 1/2 of max-heap; or, if the live data is more than 1/2 the
max-heap, the runtime will use the mark-compact collection.)
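
The sizing decision described above could be sketched roughly as follows
(a simplification of the real policy, assuming only the "about 8x live
data, clamped by max-heap" rule and the semi-space constraint mentioned
in the note):

```python
# Hypothetical post-GC heap-sizing decision: aim for ratio * live,
# never below the live data itself, never above max-heap.
def desired_heap(live, max_heap, ratio=8):
    return min(max(live, ratio * live), max_heap)

# Copying collection needs two semi-spaces, each at most half of
# max-heap; larger live data forces mark-compact instead.
def collector_for(live, max_heap):
    return "copying" if live <= max_heap // 2 else "mark-compact"

MB, GB = 2**20, 2**30
print(desired_heap(100 * MB, 3 * GB) // MB)   # 800
print(collector_for(2 * GB, 3 * GB))          # mark-compact
```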

Obtaining the larger heap from the old heap actually happens in two ways.
For platforms that support it, we first attempt to mremap the existing heap
to a larger size.  If we are able to expand to at least half-way between the
current size and the desired size, we take that as acceptable growth, even
if it isn't quite the full desired heap size.  If we can't expand, or the
platform doesn't support mremap, then we attempt to create a heap of the
desired size.  This, clearly, results in a second heap that is allocated
while the existing heap is still allocated.  On a 32-bit system, allocating
a second 3G+ heap while one heap of significant size is allocated will fail
due to the virtual address space limitations; in that situation, and if the
program is running with @MLton may-page-heap true --, MLton will attempt to
write out the existing heap to a file, deallocate the existing heap, and now
allocate the large heap.  We hope that by deallocating the existing heap,
there is now sufficient contiguous address space to create the heap of
sufficient size.  On a 64-bit system, there shouldn't ever be any trouble
finding another 3G+ portion of the virtual address space while the existing
heap is allocated.  I suspect that this is what you are seeing.
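
The two-step growth could be sketched like this (an illustration of the
policy as described above, not the runtime's actual C code; the
"mremap_limit" parameter is a stand-in for how far the platform can
expand the mapping in place):

```python
# Hypothetical heap growth: try in-place expansion first, accepting any
# result that reaches at least half-way to the desired size; otherwise
# fall back to allocating a second heap and copying, during which both
# heaps are live.  Returns (new heap size, peak memory during growth).
def grow_heap(current, desired, mremap_limit):
    halfway = current + (desired - current) // 2
    if mremap_limit >= halfway:
        # in-place growth: no second heap, no copy
        return min(mremap_limit, desired), current
    # allocate-and-copy: old and new heaps briefly coexist
    return desired, current + desired

print(grow_heap(16, 32, 28))  # (28, 16): partial in-place growth accepted
print(grow_heap(16, 32, 20))  # (32, 48): copy path; transient peak of 48
```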

Note, that while Windows doesn't natively support mremap, Wesley implemented
a few things that try to mimic mmap/mremap/munmap functionality under
Windows.  There seems to be some complication with the Windows memory system
that requires committing in addition to reserving memory, and some manner of
being able to extend an existing map with space before it and after it.  I
don't understand it.  However, I do note that the generic version of mremap
(<src>/runtime/platform/mremap.c), which is used by the Windows platforms,
starts off trying to allocate an entirely new region of the desired size and
copy the existing region into it.  (This actually seems to be redundant with
the behavior of the garbage collector proper.)  The comment states: "Prefer
a moving remap -> it results in less mmapped regions".  Since such an
allocation is likely to never fail on a 64-bit platform, you will (briefly)
have multiple regions allocated at the same time, likely to exceed the
max-heap setting.  Of course, that "(briefly)" is the time it takes to
allocate a 3G region and copy the existing (nearly) 3G region; if you have
<6G physical memory, there will be paging.

As for why things happen differently with the different max-heap settings, I
think that you just experience a slightly different heap growth trajectory.
This is a simple example, but suppose I compare running with fixed-heap 31M
and fixed-heap 33M.  And suppose a very simple heap-size doubling policy,
starting with a heap of size 2M.  So the first run will have heap sizes: 2M,
4M, 8M, 16M, 31M.  In order to transition from 16M to 31M, you might
require 47M of memory (the old and new heaps both allocated at the same time),
but if you have enough RAM, you probably won't notice it, because it isn't
too much worse than your desired max-heap.  The second run will have heap
sizes: 2M, 4M, 8M, 16M, 32M, 33M.  Now, to transition from 32M to 33M, you
require 65M memory, and that probably becomes noticeable, since it
significantly exceeds your desired max-heap.
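
The two trajectories above can be simulated with a toy doubling policy
(this is the simplified policy from the example, not MLton's actual
resizing heuristic):

```python
# Toy model: the heap doubles after each growth step, capped at the
# configured limit; growth by copy means the old and new heaps coexist.
def trajectory(limit, start=2):
    sizes = [start]
    while sizes[-1] < limit:
        sizes.append(min(sizes[-1] * 2, limit))
    return sizes

def peak_transient(sizes):
    # worst-case memory during any single growth-by-copy step
    return max(a + b for a, b in zip(sizes, sizes[1:]))

print(trajectory(31), peak_transient(trajectory(31)))  # [2, 4, 8, 16, 31] 47
print(trajectory(33), peak_transient(trajectory(33)))  # [2, 4, 8, 16, 32, 33] 65
```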