[MLton] More on Parallel Runtime

Wed Oct 24 22:58:50 PDT 2007

This is just me complaining about the whole idea of shared memory and
threads. As you work through the details of a parallel collector, the
complexity is going to drive you nuts. In the end, the simplest thing to
do is what Erlang does, i.e. each thread/process is completely isolated
from every other thread, so each thread/process can have its own
simple private GC. As a side benefit, this makes distributing across
machines much easier.
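
To make the isolation idea concrete, here is a minimal sketch. It is in
Python purely for illustration and is not MLton or Erlang code; the names
`Worker`, `send`, and `private_heap` are hypothetical. It assumes that
every message is deep copied when it is sent, so the sender and the
receiver never share any structure.

```python
import copy
import queue
import threading


class Worker:
    """Hypothetical isolated worker: private state plus a mailbox."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.private_heap = []  # only this worker's thread touches this list

    def send(self, message):
        # Copy at the boundary so no structure is shared between threads.
        self.mailbox.put(copy.deepcopy(message))

    def run(self):
        while True:
            message = self.mailbox.get()
            if message is None:  # sentinel value: stop the worker
                break
            self.private_heap.append(message)


worker = Worker()
thread = threading.Thread(target=worker.run)
thread.start()

original = {"values": [1, 2, 3]}
worker.send(original)
original["values"].append(4)  # mutation after the send stays local to the sender
worker.send(None)
thread.join()

received = worker.private_heap[0]
```

Because the worker only ever sees its own copy, a collector for
`private_heap` would never need to coordinate with any other thread.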

It would not be hard to enforce the isolation with a simple type system
and potentially some whole-program analysis. Read-only data can be
eagerly or lazily deep copied when it crosses a threading boundary.
Optionally, if it's used linearly, you can just transfer the ownership
between threads. There should be close to zero shared 'a refs between
threads. For the few that remain you can do something clever,
like adding them to the root set of each local collector on GC and ref
counting how many private heaps are keeping each one live.
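
One possible reading of the ref-counting idea, sketched in Python purely
for illustration. `SharedRef`, `LocalHeap`, and `collect` are
hypothetical names, and the "trace" is a stand-in that only scans a flat
root list. The assumption is that each private heap reports, at the end
of its own local collection, whether it still reaches a shared ref; the
ref stays live while at least one heap does.

```python
class SharedRef:
    """Hypothetical shared mutable cell, tracked across private heaps."""

    def __init__(self, name):
        self.name = name
        self.holders = set()  # ids of private heaps that still reach this ref

    @property
    def count(self):
        return len(self.holders)

    @property
    def live(self):
        return self.count > 0


class LocalHeap:
    """Hypothetical private heap with its own local collector."""

    def __init__(self, heap_id):
        self.heap_id = heap_id
        self.roots = []  # flat list standing in for a real object graph

    def collect(self, shared_refs):
        # Stand-in for a local trace: find the shared refs reachable from
        # this heap's roots, then update each ref's cross-heap count.
        reachable = {obj for obj in self.roots if isinstance(obj, SharedRef)}
        for ref in shared_refs:
            if ref in reachable:
                ref.holders.add(self.heap_id)
            else:
                ref.holders.discard(self.heap_id)


counter = SharedRef("counter")
heap_a, heap_b = LocalHeap("a"), LocalHeap("b")
heap_a.roots.append(counter)
heap_b.roots.append(counter)

heap_a.collect([counter])
heap_b.collect([counter])
count_both = counter.count   # both heaps keep the ref live

heap_a.roots.clear()
heap_a.collect([counter])
count_one = counter.count    # only heap b keeps it live now

heap_b.roots.clear()
heap_b.collect([counter])
count_none = counter.count   # no heap reaches it, so it can be reclaimed
```

Each local collection only updates its own entry, so no heap ever has to
trace into another heap to decide whether the shared ref is still needed.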

What I'm describing above really takes advantage of the ML programming
model. ML is not Java/C#, so I'm dubious about taking parallel collectors
designed for languages with a high mutation rate and bolting them onto
ML. MLton's whole-program optimization and analysis also open up a
whole set of opportunities to simplify the parallel GC problem.