[MLton] Transactions for ML
ericmcc at cs.brown.edu
Thu Apr 19 22:39:43 PDT 2007
On Apr 19, 2007, at 8:31 PM, Suresh Jagannathan wrote:
> As Matthew rightly points out, the stabilizer implementation is
> easily adapted to support
> concurrent transactions. Our current implementation provides an
> STM interface that
> implements a pessimistic write/optimistic read incarnation,
> although we have
> experimented in the past with an optimistic (logging) write version
> as well.
McRT papers suggest that read versioning/undo-logging is the best
approach. These strategies are much better for compiler
optimizations as well. The biggest problem with undo logging,
however, is that you either have to do eager conflict detection on
every access, or you have to somehow deal with the possibility of
seeing inconsistent data (McRT opts for the former).
I have a list of ideas for how a compiler can do things that a
library can't possibly do. To name a few: generate undo logs as
actual code, rather than building a data structure (similar to
implementing exceptions with continuations versus unwinding the
stack), use knowledge of transaction behavior available at compile
time to do things better.
There's also several optimizations one can perform on transactions
themselves (most of these require very sophisticated analysis)
> Of course, porting this implementation to a multiprocessor/multi-
> core environment is
> significantly more challenging, requiring modifying the garbage
> collector, atomic
The most advanced schedulers, like FreeBSD's ULE scheduler maintain a
separate structure for each OS-level thread, and only communicate
between schedulers to steal work. You need a barrier to make sure no
one tries to steal work while legitimate scheduling is going on, but
aside from that, with lock-free data structures, one could should be
able to implement a scheduler with no locks at all.
I know significantly less about garbage collectors. If I recall,
cheng-blelloch (which is the only one I know anything about) claims
only to need a barrier and a room, though I'm not sure.
Modern STM systems use locks and memory allocation, and in the worst
case can invoke arbitrarily complex contention managers.
My intuition at this point would be to have the scheduler run
separate from everything, the garbage collector use locks, and the TM
system allocate from the heap and use locks. The sticking point for
me is how to implement the scheduler. If you do it in C or something
else, then communicating with SML efficiently becomes an issue.
However, if you do it in SML, you have to somehow avoid allocating
memory while in the scheduler context. How do you and philip deal
with this, or do you organize things differently than what I'm
CS Graduate Student
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the MLton