Converting GC

Matthew Fluet fluet@research.nj.nec.com
Thu, 13 Jul 2000 17:17:53 -0400 (EDT)


It doesn't look like the glue between the native backend and the runtime
system will be that difficult.  Right now I'm generating both an assembly
file and a small C-stub file.  The C-stub uses a "native.h" header instead
of the "machine.h" header; the Globals, IntInfs, String, Main, and
MLton_halt macros are pretty much the same.  The exceptions are that Main
requires one less argument -- no chunk label -- and that the trampolining
code is different.  I debated about whether or not to do inline assembly
in the C-stub to jump to the address in gcState.stackTop, but I thought it
safer to let it be a pure C-file.  Instead it just calls a hard-coded
assembly function which saves %esp (the "C stack") and sets up %esp and
%ebp to be the MLton stackTop and frontier, then jumps to whatever is in
the stack top.  We'll never return from this function, but that should be
o.k.

Anyways, that seems to be enough to create an executable.  There are some
hoops to jump through right now to get it all to compile (particularly for
debugging purposes), but nothing that can't be cleaned up in time.

Not surprisingly, the executable immediately seg-faults.  But, I'm happy to
say that it's failing in the GC -- the issue is that I don't set up the
frameLayouts structure.  Dumping that structure is no problem (we can
probably use the identical code from c-codegen), but the fundamental issue
is that frameLayouts[stackTop] doesn't make sense anymore (the return
label is now an address).  Unfortunately, we end up needing this
information at every GC_gc (in particular, I'm dying in the function
stackTopIsOk which is called at the beginning of every GC).

So, we've got a few choices.  I'm starting to think that it's going to be
impossible ----

I was going to say, impossible to run anything without having a GC.  But,
just to be sure, I replaced every call to GC_gc in the assembly with a
call to GC_gc_nop, and tried that.  Now this is surprising -- we don't
immediately seg-fault.  In fact, we can correctly calculate the 20th
Fibonacci number, use the printf foreign function to display the result,
and get the correct return result from printf.  We do eventually
seg-fault during GC_done.  I think this is due to "stack overflow" in the
sense that when every GC_gc is a nop, the program blindly runs beyond the
end of the stack that the GC thinks it's going to use.  By artificially
increasing the default stack allocation (by using a maxFrameSize of 10
times the real maxFrameSize), I can eliminate the seg-fault.

So, that's the good news.  I'm going to play around with this a little
bit (fact and primes aren't working yet, so obviously there are some bugs
out there), but I think we're in good shape.  As I was saying before,
we've got a few choices for converting the GC.  It would probably be nice
to keep identical runtime systems for the two codegens.  We can probably
use the same frameLayouts structure, and just change the accesses to it.
I only count two that would really need to be changed.  We'd just have to
replace frameLayouts[exp] with frameLayouts[getIndex(exp)] where getIndex
branches on some value which tells whether or not we're using the native
backend.  If not, just return the value of the expression.  If so, then
convert the value of the expression (which should be a label address) into
its corresponding index, probably via a hash table.
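
To make the getIndex idea concrete, here is a rough sketch of what I have
in mind.  None of this exists in the runtime yet: the names labelTable,
registerLabel, usingNativeBackend, TABLE_SIZE, and NO_INDEX are all made
up for illustration, and the table is a plain open-addressing hash with
linear probing that assumes fewer than TABLE_SIZE return labels.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size: a power of two, larger than the number of labels. */
#define TABLE_SIZE 64u
/* Hypothetical "no such label" result. */
#define NO_INDEX ((size_t)-1)

/* One slot: a return-label address and its frameLayouts index. */
typedef struct {
    uintptr_t label;   /* 0 marks an empty slot */
    size_t index;
} LabelEntry;

static LabelEntry labelTable[TABLE_SIZE];

/* Hypothetical flag: nonzero when the native backend produced the code. */
static int usingNativeBackend = 1;

static size_t hashLabel(uintptr_t label) {
    /* Drop the low bits (code addresses tend to be aligned), then mix. */
    return (size_t)((label >> 2) * 2654435761u) & (TABLE_SIZE - 1);
}

/* Record that the return label at address `label` has layout `index`. */
static void registerLabel(uintptr_t label, size_t index) {
    size_t i = hashLabel(label);
    /* Linear probing: walk forward until an empty or matching slot. */
    while (labelTable[i].label != 0 && labelTable[i].label != label)
        i = (i + 1) & (TABLE_SIZE - 1);
    labelTable[i].label = label;
    labelTable[i].index = index;
}

/* Map the value found in the frame to an index into frameLayouts. */
static size_t getIndex(uintptr_t exp) {
    size_t i;
    if (!usingNativeBackend)
        return (size_t)exp;          /* C backend: already an index */
    i = hashLabel(exp);
    while (labelTable[i].label != 0) {
        if (labelTable[i].label == exp)
            return labelTable[i].index;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NO_INDEX;                 /* unknown label address */
}
```

The stub would call registerLabel once per return label at startup, and
the two accesses in the GC would become frameLayouts[getIndex(exp)].  A
NO_INDEX result would presumably be treated the same as a failed
stackTopIsOk check.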

Actually, there's no reason we can't use the hash table structure for both
backends, unless you think it would be another performance hit for the
C-backend.