[MLton] Transactions for ML

Thu Apr 19 13:48:43 PDT 2007

> I've discussed this briefly with Philip Schatz, who is working on 
> OS-level threads, but I'm looking into the possibility of implementing a 
> transactional memory system for SML.  In particular, I'm interested in a 
> compiler-based STM implementation as opposed to the more typical 
> library-based approaches.

While it may be difficult to adopt a C library to support STM in SML, 
another option to consider is an SML library to support STM in SML.  You 
will almost certainly need some compiler support for low-level 
operations (e.g., compare-and-swap primitives), but I imagine that you 
could implement quite a lot of the STM in SML.  The more that is 
implemented in SML, the less that needs to change in the runtime.
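
To make the split concrete, here is a rough sketch of what I have in 
mind.  The name compareAndSwap is hypothetical; it stands in for the 
primitive that the compiler would have to supply.  I model it here in 
plain SML, which is only valid while there is a single thread of 
control.  Everything layered on top of it is ordinary library code.

```sml
(* Hypothetical primitive.  A real version would need compiler and
   runtime support; this plain-SML model is only correct when no other
   thread can run between the read and the write. *)
fun compareAndSwap (r : int ref, expected : int, new : int) : bool =
   if !r = expected then (r := new; true) else false

(* Library-level code written against the primitive: retry the
   increment until the compare-and-swap succeeds, then return the
   new value. *)
fun atomicIncr (r : int ref) : int =
   let
      val old = !r
   in
      if compareAndSwap (r, old, old + 1)
         then old + 1
      else atomicIncr r
   end
```

Only compareAndSwap would need to change to move this onto real 
hardware primitives; atomicIncr stays as it is.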

> The first option I discussed with Philip was to insert calls to an 
> existing library implementation, reasoning that they could be replaced 
> with internal implementations at a later time, but this loses due to the 
> fact that any TM system in existence right now uses the C-style stack.

I'm not sure I understand the issue here.  Why do TM systems need to use 
the C-style stack?  And, why does that make it a non-starter for MLton?

> However, I'd like to look into exactly what a compiler can do to 
> implement STM more efficiently anyway, so constructing a framework 
> specifically for MLton SML is something I'm considering.

I'm sure that there are ways that the compiler can help to implement STM 
more efficiently; there's probably also some literature out there that 
addresses this question.

Philip, Lukasz, and Suresh should have some opinions; their ICFP06 paper 
on Stabilizers implemented something like transactions for MLton.

> The main delicate point I see is the interplay between the scheduler, 
> the TM system, and the garbage collector.  I'd like to get some input 
> from the developers as to what the best implementation strategy would be.  
> Specifically, which components should use which others, and how?

At present, a MLton-compiled program runs in a single OS thread.  You 
can use that to your advantage (see, e.g., Ringenburg and Grossman's 
ICFP05 paper on AtomCaml).
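
As I understand the idea, with only one OS thread an atomic block can 
run optimistically while logging its writes, and be rolled back if it 
has to be abandoned.  Below is a minimal sketch of such an undo log in 
portable SML.  The structure and function names are my own invention, 
an exception stands in for whatever the scheduler would do to abort the 
block, and nested atomic blocks are not handled.

```sml
structure UndoLog =
struct
   (* Each entry restores one location to the value it held before a
      logged write.  The newest entry is at the head of the list. *)
   val log : (unit -> unit) list ref = ref []

   (* Logged write: remember the old value, then update in place. *)
   fun write (r : 'a ref, v : 'a) : unit =
      let
         val old = !r
      in
         log := (fn () => r := old) :: !log;
         r := v
      end

   (* Undo the logged writes, newest first, and empty the log. *)
   fun rollback () : unit =
      (List.app (fn undo => undo ()) (!log); log := [])

   (* Run body with a fresh log.  If it raises, undo its writes and
      re-raise; otherwise discard the log, which commits the writes. *)
   fun atomic (body : unit -> 'a) : 'a =
      let
         val () = log := []
         val result = body () handle e => (rollback (); raise e)
      in
         log := [];
         result
      end
end
```

For example, if x holds 2 and an atomic block writes 3 and then 4 to x 
through UndoLog.write before raising, x holds 2 again afterwards.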

That also means that the thread scheduler is part of the ML program; one 
can implement either cooperative threads or preemptive threads (the 
latter uses signals and an interrupt timer).  I wouldn't want to change 
that in the future; I think it makes the most sense to implement the 
scheduler in ML as much as possible.  That would suggest that a TM 
system that interacts with the scheduler would need to do so from ML, 
since the runtime system knows nothing about the scheduler.
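
To illustrate what "from ML" could look like, here is a toy sketch.  It 
models threads as plain thunks in a ready queue, which is much simpler 
than real MLton threads, and Sched and retrying are hypothetical names. 
The point is only that a transaction that cannot commit can hand itself 
back to the scheduler without involving the runtime.

```sml
structure Sched =
struct
   (* Ready queue of runnable tasks; the oldest task is at the head. *)
   val ready : (unit -> unit) list ref = ref []

   (* Add a task at the back of the queue. *)
   fun spawn (task : unit -> unit) : unit = ready := !ready @ [task]

   (* Run tasks in FIFO order until the queue is empty. *)
   fun run () : unit =
      case !ready of
         [] => ()
       | task :: rest => (ready := rest; task (); run ())
end

(* Hypothetical TM hook: attempt returns true if the transaction
   committed.  On failure the task re-enqueues itself, so other tasks
   get to run before it is tried again. *)
fun retrying (attempt : unit -> bool) : unit -> unit =
   let
      fun task () = if attempt () then () else Sched.spawn task
   in
      task
   end
```

A real version would presumably switch between MLton threads rather 
than call thunks, but the TM system would still talk to the scheduler 
through ordinary ML calls like these.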

Similarly, if the TM system allocates data in the ML heap and uses the 
standard ML object layout, then the garbage collector will handle the 
data just fine.