[MLton] max-heap setting for 64-bit applications

Henry Cejtin henry.cejtin at sbcglobal.net
Fri Dec 11 19:56:02 PST 2009


Is GC really sequential in terms of memory access?  Certainly the writes
to new space are, but not the reads from old space, nor  the  writes  to
old space (to set the forwarding pointer).
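For concreteness, a minimal Cheney-style copying loop (illustrative only,
not MLton's actual collector) makes the access pattern explicit: the
allocation and scan pointers walk to-space sequentially, but the objects
they pull in, and the forwarding pointers written back, land at scattered
addresses in from-space, in graph order:

    /* Illustrative Cheney-style copying sketch, not MLton's collector.
     * An object is a header (field count), a forwarding slot, and
     * pointer fields. */
    #include <stddef.h>
    #include <string.h>

    typedef struct obj {
        size_t nfields;           /* number of pointer fields */
        struct obj *forward;      /* forwarding pointer, NULL until copied */
        struct obj *fields[];     /* the pointer fields themselves */
    } obj;

    static char *allocPtr, *scanPtr;

    static obj *copyObj(obj *o) {
        if (o == NULL) return NULL;
        if (o->forward != NULL) return o->forward;  /* scattered read of from-space */
        size_t bytes = sizeof(obj) + o->nfields * sizeof(obj *);
        obj *copied = (obj *) allocPtr;
        memcpy(copied, o, bytes);                   /* sequential write to to-space */
        allocPtr += bytes;
        copied->forward = NULL;
        o->forward = copied;                        /* scattered write to from-space */
        return copied;
    }

    void cheneyCollect(obj **roots, size_t nroots, char *newSpace) {
        scanPtr = allocPtr = newSpace;
        for (size_t i = 0; i < nroots; i++)
            roots[i] = copyObj(roots[i]);
        while (scanPtr < allocPtr) {                /* sequential scan of to-space */
            obj *o = (obj *) scanPtr;
            for (size_t i = 0; i < o->nfields; i++)
                o->fields[i] = copyObj(o->fields[i]);
            scanPtr += sizeof(obj) + o->nfields * sizeof(obj *);
        }
    }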

I have a vague recollection of Macsyma on the VAX (before even my time),
which no doubt used a mark/sweep collector, and they used vadvise to set
random behaviour (I thought) during GC.
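The modern POSIX analogue of vadvise would be madvise.  A minimal sketch,
assuming the heap lives in a single mmap'd region:

    #include <stddef.h>
    #include <sys/mman.h>

    /* Sketch: advise the VM that heap accesses will be random for the
     * duration of a mark/sweep collection, then restore the default. */
    void gcAdviseRandom(void *heapBase, size_t heapBytes) {
        madvise(heapBase, heapBytes, MADV_RANDOM);
    }

    void gcAdviseNormal(void *heapBase, size_t heapBytes) {
        madvise(heapBase, heapBytes, MADV_NORMAL);
    }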

With  regards  to the max-heap argument, it would be really good to have
an argument which is the maximum amount of memory MLton can use,  or  as
close  as we can come to that.  If it tries to grow bigger, even if only
temporarily to grow a heap, and can't  grow  it  by  any  other  allowed
means, it should fail with out of memory.





________________________________
From: Wesley W. Terpstra <wesley at terpstra.ca>
To: Matthew Fluet <matthew.fluet at gmail.com>
Cc: mlton at mlton.org
Sent: Fri, December 11, 2009 6:17:34 PM
Subject: Re: [MLton] max-heap setting for 64-bit applications

Stepping outside of the current discussion, a matter of practicality: is the paging really that bad? AFAIK, it is a memcpy running between the two heaps, which is sequential access. It shouldn't take longer than copying a 3GB file on disk.

If it *is* taking longer, then we need to add a hint to the Windows VM before the memcpy telling it that we will be doing sequential access, and flip it back to random-access mode afterwards.
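The exact Windows call would be different, but on a POSIX system the
pattern above might look roughly like this (a sketch, assuming both heaps
are plain mmap'd regions):

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>

    /* Sketch: hint sequential access around the heap-to-heap copy, then
     * flip back to the default (effectively random-access) behaviour. */
    void copyHeap(void *newHeap, void *oldHeap, size_t bytes) {
        madvise(oldHeap, bytes, MADV_SEQUENTIAL);
        madvise(newHeap, bytes, MADV_SEQUENTIAL);
        memcpy(newHeap, oldHeap, bytes);
        madvise(newHeap, bytes, MADV_NORMAL);
    }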

On Fri, Dec 11, 2009 at 9:32 PM, Matthew Fluet <matthew.fluet at gmail.com> wrote:

> Right, and the annoying bit is that the previous heap was so close to
> the max heap setting.  Perhaps a reasonable heuristic is that if a
> desired heap is "close-to" the max-heap size, just round up.  Perhaps
> 0.75 of max heap?  In the max-heap 3G setting, this could still leave
> you in the situation where you have a 2.25G allocation and a 3G
> allocation at the same time to copy.  Or 0.55 of max heap; that could
> require 1.65G+3G at the time of the copy.

I would be against yet another special case in the sizing rules. Any cutoff we pick is going to fail for someone else in the same way, while artificially restricting memory growth for others. His problem would be (mostly) fixed if we flipped Windows mremap to move only when growing in place fails.
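A Linux-flavoured sketch of that policy (a Windows mremap emulation would
have to make the same distinction with its own primitives):

    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>

    /* Sketch: first try to grow the mapping in place; only if that fails,
     * allow the kernel to move it (which, in an emulation that cannot
     * remap pages, is where the expensive copy would happen).
     * Returns the new base address, or MAP_FAILED. */
    void *growHeap(void *base, size_t oldBytes, size_t newBytes) {
        void *p = mremap(base, oldBytes, newBytes, 0);  /* no MREMAP_MAYMOVE */
        if (p != MAP_FAILED)
            return p;                                   /* grew in place */
        return mremap(base, oldBytes, newBytes, MREMAP_MAYMOVE);
    }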

I have a higher-level proposal: MLton.GC already has hooks that are executed after a GC to implement finalizers. Expose these to the user. If a user knows his application only consumes more than X memory under an error condition, he can test for that after a GC and terminate with an "Out of Memory" error as desired.
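A rough sketch of the idea at the C runtime level; the names here are
hypothetical, not MLton's actual hook interface, and the check could just
as well be done from SML once the hook is exposed:

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical post-GC hook: if the user has declared a limit and the
     * live data after a collection exceeds it, fail cleanly as out of memory. */
    static size_t userLimitBytes = 0;     /* 0 means "no limit set" */

    void setPostGcLimit(size_t bytes) {
        userLimitBytes = bytes;
    }

    void postGcHook(size_t liveBytes) {
        if (userLimitBytes != 0 && liveBytes > userLimitBytes) {
            fprintf(stderr, "Out of memory: %zu bytes live, limit %zu\n",
                    liveBytes, userLimitBytes);
            exit(2);
        }
    }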


>> It seems like one strategy would be that keeping the "max-heap"
>> setting slightly under half the available physical memory should
>> avoid the case where we're already using about 50% and then have
>> to create another heap of the same size.
>
> True.  Although, you would really need to use about 50% of the
> physical memory that you want the MLton process to have access to,
> else you will page via competition with other processes.

I think this isn't quite right. You get paging when the working set of the other applications + the heap of the MLton program > RAM. If his heap stays small, there is no thrashing. Once it gets big, there will be thrashing iff the working set of the other apps + max-heap > RAM. Keeping a max-heap-sized heap warm wouldn't help; it would only cause the thrashing to start sooner.
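(For concreteness, with made-up numbers: on a 4G machine whose other
applications have a working set of about 1G, thrashing only starts once
the MLton heap actually exceeds roughly 3G, regardless of whether
max-heap was set to 3G from the start.)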
 

> Going to a single contiguous heap, interpreting it as two semi-spaces
> when using a major copying-collection would be nicer here, because
> fixed-heap would grab the whole 3.25G up front.

Yes.


> The mremap function is described as using the Linux page table scheme
> to efficiently change the mapping between virtual addresses and
> (physical) memory pages.  Its purpose is to be more efficient than
> allocating a new map and copying.
If I could ...

