[MLton-devel] D2 problem

Alain Deutsch deutsch@polyspace.com
Wed, 24 Apr 2002 10:41:38 +0200 (MET DST)


On Tue, 23 Apr 2002 mlton-devel-request@lists.sourceforge.net wrote:

> Alain, I just checked in a fix for the out of memory problem you were
> seeing.  Now, MLton attempts to use as much of the address space as it
> can on Cygwin.  Unfortunately, I still have a problem when running
> your test code on Cygwin.  I have included a fragment of the log
> below.  The program dies when attempting to do a GC while a
> 512,000,000 byte array is live.  The error is better :-) in that MLton
> has used as large a portion of the address space as it can.
> Unfortunately, we are limited by address space restrictions on
> Windows/Cygwin.  First, Windows only allows 2G of address space per
> process [0, 2G).  Second, before MLton starts running, various
> portions of the address space have been reserved, leaving only about
> 1.1G of contiguous space available.  Because MLton allocates 1.25
> times the amount of live data space, which comes to 641 million bytes,
> there is only about 480 million available for to space, which is not
> enough.
> I don't see any easy way to work around the address space limitation
> and fragmentation problems on windows with MLton's current approach of
> using two large semispaces.  The only way out that I see is to
> allocate semispaces as collections of smaller chunks (say 10M or
> 100M).  There isn't too much code in MLton that relies on the
> semispaces being contiguous, so this should be feasible, although it's
> not a 1-day fix.  Of course, this also only works if the data fits
> into the smaller chunks.  I'm not sure if your actual code uses such
> large arrays.  I don't see any solution for that on Windows without
> breaking up the array.
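
The chunking trade-off described above can be sketched as a small
feasibility check. This is only an illustration in Python, not MLton
runtime code: the function names are hypothetical, and apart from the
480 million byte region mentioned in the log discussion, the free
region sizes are made-up assumptions.

```python
MB = 1_000_000

def fits_contiguous(free_regions, need):
    """A contiguous semispace needs one free region of at least `need` bytes."""
    return max(free_regions, default=0) >= need

def fits_chunked(free_regions, need, chunk, largest_object):
    """A chunked semispace needs enough whole chunks across all free regions,
    and no single object may be larger than one chunk."""
    if largest_object > chunk:
        return False
    usable = sum((region // chunk) * chunk for region in free_regions)
    return usable >= need

# Assumed layout: 480 MB is left after from-space (from the thread);
# the two smaller regions are hypothetical.
free_regions = [480 * MB, 200 * MB, 150 * MB]
live_data = 512 * MB

print(fits_contiguous(free_regions, live_data))                    # one big to-space
print(fits_chunked(free_regions, live_data, 100 * MB, 50 * MB))    # small objects
print(fits_chunked(free_regions, live_data, 100 * MB, 512 * MB))   # one 512 MB array
```

Under these assumptions only the second case succeeds: chunking helps
with fragmentation, but not with a single array larger than a chunk.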

 Yes, we occasionally have large arrays (hash tables). But the
problem may well occur even without them, as we do have large
live sets: not on all runs, and typically on a transient basis.

 My understanding is that the mark-compact GC will significantly
improve the situation. Even then, though, I understand MLton will
only be able to work with a live set of slightly less than 1G
(1.1G minus some free space).
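
To make the arithmetic concrete, here is a rough back-of-the-envelope
sketch in Python. It assumes that both semispaces must come out of the
same ~1.1G contiguous region, that from-space is 1.25 times the live
data, and that to-space needs at least the live data itself. The
function names and the 100 million byte free reserve for mark-compact
are hypothetical, chosen only for illustration.

```python
CONTIGUOUS = 1_100_000_000   # ~1.1G of contiguous space, as quoted above
HEAP_FACTOR = 1.25           # from-space size relative to live data

def max_live_copying(contiguous=CONTIGUOUS, factor=HEAP_FACTOR):
    # from-space (factor * live) plus to-space (assumed: exactly live)
    # must both fit in the contiguous region.
    return int(contiguous / (factor + 1))

def max_live_mark_compact(contiguous=CONTIGUOUS, free_reserve=100_000_000):
    # A single heap: everything except some free working space can be live.
    return contiguous - free_reserve

copying = max_live_copying()
compact = max_live_mark_compact()
print(f"copying:      {copying:,} bytes")
print(f"mark-compact: {compact:,} bytes")
print(f"ratio:        {compact / copying:.2f}")
```

With these assumed numbers the copying limit falls just below the
512,000,000 byte array from the test, and the mark-compact limit is
roughly twice as large.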

If this seems correct to you then I suggest:

1) wrapping up D2 with your last fix. At this point we will attempt
to complete the port of our products, limited to smaller runs.

2) this summer, the mark-compact GC will improve the situation by
nearly doubling the live sets we can work with under Cygwin.

3) if there is still a problem then we could at that point decide
to implement the suggestion to work with fragmented spaces.

An idea: when failing to allocate toSpace, copy the live data in
fromSpace to disk, then deallocate fromSpace and read the disk file
back in. Given the sequential (two pass) nature of the accesses
to toSpace, the performance may be bad but tolerable. After all,
this amounts to paging, the only difference being that it is
under our control. In that case I expect much better behaviour
than with a paging memory manager.
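
A minimal sketch of this idea, in Python rather than in the runtime's
own language, and with several simplifying assumptions: the heap is
modelled as a bytearray, the live data is given as a precomputed list
of (offset, length) ranges, and pointer fix-up inside objects is
omitted (only a forwarding table is returned). The function name
collect_via_disk is hypothetical.

```python
import tempfile

def collect_via_disk(from_space, live_ranges):
    """Compact the live ranges of from_space through a temporary file.

    from_space  -- bytearray standing in for the heap (a simplified model)
    live_ranges -- list of (offset, length) pairs, in address order
    Returns (new_space, forwarding); forwarding maps old offset -> new offset.
    """
    forwarding = {}
    with tempfile.TemporaryFile() as spill:
        # Write the live data sequentially to disk, recording new offsets.
        new_offset = 0
        for offset, length in live_ranges:
            spill.write(from_space[offset:offset + length])
            forwarding[offset] = new_offset
            new_offset += length
        # Release from-space before the new space exists, so the two
        # never have to coexist in the address space.
        del from_space[:]
        # Read the compacted image back in, again sequentially.
        spill.seek(0)
        new_space = bytearray(spill.read())
    return new_space, forwarding

heap = bytearray(b"AAAA....BB..CCC")
new_space, forwarding = collect_via_disk(heap, [(0, 4), (8, 2), (12, 3)])
print(bytes(new_space), forwarding)
```

Both the write and the read are strictly sequential, which is what
should make this cheaper than letting the OS page at random.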

	Alain.
   
--
Alain Deutsch, CTO              tel.: +33 (0)1 49 65 32 64
PolySpace Technologies          fax.: +33 (0)1 49 65 05 77
mailto:deutsch@POLYSPACE.COM



_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel