[MLton] Transactions for ML

Fri Apr 20 00:34:23 PDT 2007

On Fri, 20 Apr 2007, Eric McCorkle wrote:

>
> On Apr 19, 2007, at 8:31 PM, Suresh Jagannathan wrote:
>
>> As Matthew rightly points out, the stabilizer implementation is easily 
>> adapted to support
>> concurrent transactions.  Our current implementation provides an STM 
>> interface that
>> implements a pessimistic write/optimistic read incarnation, although we 
>> have
>> experimented in the past with an optimistic (logging) write version as 
>> well.
>
> McRT papers suggest that read versioning/undo-logging is the best approach. 
> These strategies are much better for compiler optimizations as well.  The 
> biggest problem with undo logging, however, is that you either have to do 
> eager conflict detection on every access, or you have to somehow deal with 
> the possibility of seeing inconsistent data (McRT opts for the former).
>
> I have a list of ideas for how a compiler can do things that a library can't 
> possibly do.  To name a few: generate undo logs as actual code, rather than 
> building a data structure (similar to implementing exceptions with 
> continuations versus unwinding the stack), use knowledge of transaction 
> behavior available at compile time to do things better.
>
> There's also several optimizations one can perform on transactions themselves 
> (most of these require very sophisticated analysis)
>
>> Of course, porting this implementation to a multiprocessor/multi-core 
>> environment is
>> significantly more challenging, requiring modifying the garbage collector, 
>> atomic
>> primitives
>
> The most advanced schedulers, like FreeBSD's ULE scheduler maintain a 
> separate structure for each OS-level thread, and only communicate between 
> schedulers to steal work.  You need a barrier to make sure no one tries to 
> steal work while legitimate scheduling is going on, but aside from that, with 
> lock-free data structures, one could should be able to implement a scheduler 
> with no locks at all.
>
> I know significantly less about garbage collectors.  If I recall, 
> cheng-blelloch (which is the only one I know anything about) claims only to 
> need a barrier and a room, though I'm not sure.
>
> Modern STM systems use locks and memory allocation, and in the worst case can 
> invoke arbitrarily complex contention managers.
>
> My intuition at this point would be to have the scheduler run separate from 
> everything, the garbage collector use locks, and the TM system allocate from 
> the heap and use locks.  The sticking point for me is how to implement the 
> scheduler.  If you do it in C or something else, then communicating with SML 
> efficiently becomes an issue.  However, if you do it in SML, you have to 
> somehow avoid allocating memory while in the scheduler context.  How do you 
> and philip deal with this, or do you organize things differently than what 
> I'm imagining?

If you choose to implement contention management with locks you will have 
to decide if you want strong atomicity (requires some sort of read/write 
barriers on most acceses -- redudant barriers can be eliminated) or weak 
atomicity (less "safe").  I do not believe anyone has thus far done a 
study as to how write buffer compares to updates in place (I assume you 
ment this when you stated you wanted a lock based implementaiton) that 
support strong atomicty. I dont quite follow your worry about allocating
memory in a scheduler context, if the scheduler was implemented in sml.
The scheduler should not be allocating anything that a contention manager
should worry about, all TM managed allocations should be done on behalf of 
the executing thread. Phil has some ideas on how to implement schedulers 
in SML where I do not believe you would run into any problems.

thanks,
Lukasz Ziarek

>
> -- 
> Eric McCorkle
> Brown University
> CS Graduate Student
>
>
>
>