[MLton] Ongoing parallel runtime work

Eric McCorkle ericmcc at cs.brown.edu
Tue Jan 22 18:18:51 PST 2008


On Jan 22, 2008, at 5:04 PM, Vesa Karvonen wrote:
> I don't know how the double pointers would actually be used and
> whether a single object may be accessed in parallel by multiple cores

My work is based on the Cheng-Blelloch collector, but is lock-free  
(except for a barrier at the beginning and end of a collection).   
Like Cheng-Blelloch, the double pointers are necessary for garbage  
collecting some, but not all of the heap.  Stopping and switching  
over all the pointers incurs an unbounded pause.  If you collect the  
entire heap, then you can build the "new" memory graph using pointers  
to the new objects, and continue to have mutators access the old  
memory graph until collection is complete, then switch over only a  
very small number of pointers at the end of collection.  However,  
global variables don't get copied, and there can be a large number of  
them.  Also, with generations, there might be a large number of  
objects which aren't copied, and thus, still have pointers to either  
the old graph, which vanishes, or to the new one, which isn't ready  
until the end of collection.  Switching over all the pointers has to  
happen very quickly.

The solution to this is double-pointers.  Pointers consist of two  
slots.  For objects that aren't copied, like static data, a different  
generation, or an exiting thread, the collector stores the pointer to  
the new object in the unused slot.  At the end of collection, a flag  
is toggled, causing the program to switch which slot is used.  It  
should be noted that you pay this cost, as well as the cost of the  
write-barrier for the benefit of parallel collection.

> One thought that comes to mind here is that it
> might make sense to layout objects in such a way that all the pointers
> used by the collector are stored as a contiguous block (within an
> object) and likewise for the pointers used by the mutator.
>

MLton already lays things out this way.  Pointers are all in a  
cluster, which follows the non-pointer data.  However, what I was  
talking about was the object's header itself.  In the parallel  
collector, the header must contain a forwarding pointer (the pointer  
to the new object), which allows writes to be copied to the new  
object (the write barrier, which I won't explain right now, mostly  
because it's one of the things I still have to implement).  The  
collectors themselves "claim" a given object by CASing in an actual  
value to the forwarding pointer slot.  This prevents an object from  
being processed twice.  However, as the object header is now subject  
to a CAS, it need to be a cache-line in size so as to avoid data  
fighting (other unrelated data getting cache-invalidated due to CASing).

-- 
Eric McCorkle
Brown University
CS Graduate Student


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20080122/e542eda6/attachment.htm


More information about the MLton mailing list