big arrays [was Re: bug found]

Matthew Fluet <fluet@CS.Cornell.EDU>
Tue, 27 Nov 2001 16:55:49 -0500 (EST)


> I just talked to Suresh and he asked me why we couldn't grab a bit from the
> header word saying that this was an array, and then have header followed by
> length (followed by data).  It sounded like it would be good to me.

I thought of that too, switching the header and length words.  I think the
problem is that some things (ML code and some GC stuff) pass around
pointers to the object itself and some things (GC stuff) pass around
pointers to the header.  For things of the first type, you need to know
where the header is relative to the pointer; in the current world, it's
the previous word.  If you switch them, you're fine in ML, because types
tell us when a pointer is a pointer to an array, so we can fetch the
length or header.  But in the GC we don't know types, so given a
pointer to the actual object, we need to know how to find the header word.
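
To make the current layout concrete, here is a rough sketch in C.  It
assumes 32-bit words and that an array is laid out as length, header,
data; the type and function names are made up for illustration and are
not the runtime's.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t word;

/* Current layout, as described above (names are hypothetical):
 *   array:        [length][header][data...]
 *   other object: [header][data...]
 * An object pointer points at the first data word. */

/* Works for any object, with no type information needed. */
static word *current_header(word *obj)
{
    assert(obj != 0);
    return obj - 1;
}

/* Only meaningful when the caller already knows obj is an array. */
static word current_array_length(word *obj)
{
    assert(obj != 0);
    return obj[-2];
}
```

The point is that current_header needs no case analysis, which is what
the GC relies on.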

We could get that to work out.  Normal headers have a high bit of one and
array lengths must have a high bit of zero (so, arrays of <2^30 elements),
so that by looking at the previous word, we can distinguish between an
array length and a header word.  If it is an array length, we need to go
back one additional word.  Now, a normal header still needs to distinguish
between arrays and other objects (because of those things that pass around
pointers to headers), but we can do that with a bit, without impacting the
maximum number of array elements.
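
A minimal sketch of that lookup in C, assuming 32-bit words.  The names
and the choice of bit 30 as the array bit are hypothetical, picked only
for illustration.  The sketch checks only that a length has its high bit
clear; it does not enforce any tighter bound.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t word;

/* Assumed encoding, for illustration only. */
#define HEADER_TAG 0x80000000u  /* high bit: one for every header word */
#define ARRAY_BIT  0x40000000u  /* hypothetical bit marking array headers */

/* Proposed layout:
 *   array:        [header][length][data...]
 *   other object: [header][data...]
 * An object pointer points at the first data word. */

/* Given an object pointer and no type information, find the header. */
static word *header_of_object(word *obj)
{
    word prev = obj[-1];
    if (prev & HEADER_TAG)
        return obj - 1;  /* previous word is the header itself */
    return obj - 2;      /* previous word is an array length; go back one more */
}

/* Given a header pointer, find the object (first data word). */
static word *object_of_header(word *hdr)
{
    assert(*hdr & HEADER_TAG);
    return (*hdr & ARRAY_BIT) ? hdr + 2 : hdr + 1;
}

/* Write an array's header and length at `at`; return the object pointer. */
static word *init_array(word *at, word length)
{
    assert((length & HEADER_TAG) == 0);  /* lengths keep the high bit zero */
    at[0] = HEADER_TAG | ARRAY_BIT;
    at[1] = length;
    return at + 2;
}

/* Write a non-array header at `at`; return the object pointer. */
static word *init_normal(word *at)
{
    at[0] = HEADER_TAG;
    return at + 1;
}
```

Code holding an object pointer uses header_of_object, and code holding a
header pointer uses object_of_header, so both kinds of pointers still
work without knowing types.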