[MLton] implement _address and _symbol

Wesley W. Terpstra wesley@terpstra.ca
Thu, 21 Jul 2005 15:45:05 +0200


On Tue, Jul 19, 2005 at 08:37:29AM -0700, Stephen Weeks wrote:
> Pinning arbitrary heap objects isn't quite my picture of what we would
> do.  I was thinking there would be a new kind of object, a "pinned"
> value, that would be guaranteed to not move.  A pinned value points
> into the SML heap at a normal heap object, but is not in the heap
> itself.  Whenever a GC occurs, the GC updates the pinned pointers, but
> does not move the pinned objects.  Probably pinned objects would be
> malloc'd, possibly we would use our own (very simple) memory manager
> outside the GC.

I like this idea a lot.

So, to make sure I understand, a pin is a pointer into the heap from
outside of it. When a GC occurs, the external pointer gets updated.

When passing a vector to C, the function then receives a pointer to a
pointer rather than a plain pointer. That pointer points to the pin, which
lives outside the heap and in turn points into the heap. C needs to see
this level of indirection: if C held a direct pointer to the heap object on
its call stack, the GC could not move the object, which it must be free to do.
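
To make the indirection concrete, here is a rough C sketch of how I picture
it. Every name in it (pin, pin_new, fake_gc_move, sum_via_pin, pin_demo) is
made up for illustration; none of this is MLton runtime code, and the "GC"
is just a malloc/memcpy stand-in for a collector that moves the object.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pin: one malloc'd cell outside the heap that holds the
   current address of an object inside the heap. */
typedef struct pin { void *obj; } pin;

static pin *pin_new(void *heap_obj) {
    pin *p = malloc(sizeof *p);
    if (p != NULL) p->obj = heap_obj;
    return p;
}

/* Stand-in for a moving GC (an assumption, not the real collector):
   copy the object elsewhere and update the pin. The pin itself stays put. */
static int fake_gc_move(pin *p, size_t size) {
    void *to = malloc(size);
    if (to == NULL) return -1;
    memcpy(to, p->obj, size);
    free(p->obj);
    p->obj = to;
    return 0;
}

/* The C side receives the pin (in effect an int **) and goes through it on
   each call, so it never keeps a stale direct pointer across a GC. */
static int sum_via_pin(const pin *p, size_t n) {
    const int *v = p->obj;
    int s = 0;
    for (size_t i = 0; i < n; i++) s += v[i];
    return s;
}

/* Sums a 3-element vector before and after the simulated move.
   Returns the sum if both reads agree, or -1 on any failure. */
static int pin_demo(void) {
    int *vec = malloc(3 * sizeof *vec);
    if (vec == NULL) return -1;
    vec[0] = 1; vec[1] = 2; vec[2] = 3;

    pin *p = pin_new(vec);
    if (p == NULL) { free(vec); return -1; }

    int before = sum_via_pin(p, 3);
    int rc = fake_gc_move(p, 3 * sizeof(int));
    int after = sum_via_pin(p, 3);

    free(p->obj);
    free(p);
    return (rc == 0 && before == after) ? after : -1;
}
```

The address of the pin is the only thing C holds on to; the object's own
address is re-read through the pin after every point where a GC might run.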

> Then, the right way to pass heap objects between SML and C would be to
> pass a pinned value pointing to the heap object.  MLton.Pinned would
> have a signature something like
> 
> signature PINNED =
>    sig
>       type 'a t
> 
>       val free: 'a t -> unit
>       val get: 'a t -> 'a
>       val new: 'a -> 'a t
>    end
> 
> where type 'a t opaquely expands to MLton.Pointer.t and 'a is required
> to be a heap pointer.  Pinned.free is necessary because the GC does
> not know who has pointers to pinned objects, and hence does not know
> when to free them.

The way I see it, Pinned.t is (yet another) level of pointers. Here, the
Pinned.t lives on the heap, but points at the pin which lives outside the
heap, which points to the 'a which lives in the heap. This works perfectly
with what I now know of the FFI, since Pinned.t would be internally exactly
the pointer you want to pass to the C function.

I think a reference counter to the external pin might work.
I worry a bit about forming a cycle, though.

Using a datatype, it might be possible to get a pin to keep an object alive
which in turn referred to that pin. So explicit free is probably good.
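
A small C model of the cycle I'm worried about, under the assumption that
the pin carries a plain reference count and that the pinned object can hold
a counted reference back to its own pin. The types and functions (rc_pin,
obj, rc_pin_release, cycle_leftover_refs) are hypothetical, not anything in
MLton.

```c
#include <stdlib.h>

struct obj;

/* Hypothetical reference-counted pin living outside the heap. */
typedef struct rc_pin {
    int refs;
    struct obj *target;
} rc_pin;

/* Hypothetical heap object that refers back to the pin keeping it alive. */
typedef struct obj {
    rc_pin *self_pin;
} obj;

static rc_pin *rc_pin_new(obj *o) {
    rc_pin *p = malloc(sizeof *p);
    if (p != NULL) { p->refs = 1; p->target = o; }
    return p;
}

static void rc_pin_retain(rc_pin *p) { p->refs++; }

/* Drops one reference; frees the pin and returns 1 when the count hits 0. */
static int rc_pin_release(rc_pin *p) {
    if (--p->refs == 0) { free(p); return 1; }
    return 0;
}

/* Returns the count left after the only outside holder lets go,
   or -1 on allocation failure. */
static int cycle_leftover_refs(void) {
    obj *o = malloc(sizeof *o);
    if (o == NULL) return -1;

    rc_pin *p = rc_pin_new(o);            /* refs == 1: held by the caller */
    if (p == NULL) { free(o); return -1; }

    o->self_pin = p;
    rc_pin_retain(p);                     /* refs == 2: object refers back */

    int freed = rc_pin_release(p);        /* caller lets go: refs == 1 */
    int left = freed ? 0 : p->refs;

    /* The count can never reach zero by itself, so the cycle has to be
       broken explicitly; this is the role Pinned.free would play. */
    if (!freed) free(p);
    free(o);
    return left;
}
```

With counting alone, the pin and the object keep each other alive once the
outside holder is gone; only the explicit free at the end reclaims them.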

> > I'm not talking about initializing it for C, but SML.
> > Here the expectation is that the variables are always 'initialized'.
> > 
> > If the symbol came from C, then it is statically initialized before main().
> > If the symbol came from SML, the best we can do is initialize it before
> > it can be referenced inside the SML code.
> 
> I still don't get the point.  We're defining a symbol so it can be
> accessed from C.  Perhaps C wants to initialize it as well.

Well, my point of view is that the object which provides a symbol is the one
which is supposed to initialize it. If C wants to initialize a symbol, then
C should also be the one exporting that symbol.
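
In plain C terms, the convention I have in mind looks like the sketch below.
The name shared_counter is invented for the example, and both halves are
shown in one file only to keep it short; normally the definition and the
extern declaration would sit in different objects.

```c
/* Provider side: whoever defines the symbol also initializes it.
   For a symbol provided by C this happens statically, before main(). */
int shared_counter = 42;   /* hypothetical symbol name */

/* Consumer side: only declares the symbol and relies on the provider
   having initialized it already. */
extern int shared_counter;

static int read_shared_counter(void) {
    return shared_counter;
}
```

If SML were the provider instead, the corresponding step would be to
initialize the symbol before any SML code can reference it.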

BTW, I think exporting symbols makes a lot of sense if MLton is being used
to create a shared library (if that's ever merged in).

-- 
Wesley W. Terpstra