[MLton-user] Destructive update

Wed, 8 Feb 2006 12:13:07 -0800

Hi Joe.  Here's a few thoughts on your situation.

You have, of course, profiled your program to determine that the
dictionaries are the hot code?  Assuming yes, you may want to try a
trick to get even more precise profiling information.  If you create
your dictionary via a functor, you can instantiate that functor for
each dictionary of interest, and thus get separate profiling
information for separate uses.  That may help to cut down on the
number of candidates for data structure replacement.

For the truly hot uses, you can dynamically measure whether or not a
data structure is used in a single-threaded way, and hence could be
replaced by a mutable data structure.  The idea is to add two extra
fields to (the root of) your immutable structure.  These fields keep
track of the version of the data structure as well as what the
"current" version is.  Operations that would mutate the data structure
(like insert) create a new version and change the current version,
while operations that observe the structure (like peek and foldl)
check that they are dealing with the current version.  If all the
operations deal with the current version, it can be replaced.  Of
course, this isn't a proof, but it will tell you where to look.

Here's an implementation of the idea that I hope makes it clear.

--------------------------------------------------------------------------------
signature IS_SINGLE_THREADED_ARG =
   sig
      structure Data:
         sig
            type t
         end
   end

signature IS_SINGLE_THREADED =
   sig
      include IS_SINGLE_THREADED_ARG

      type t

      val data: t -> Data.t
      val isSingleThreaded: t -> bool
      val make: Data.t -> t
      val mutate: t * Data.t -> t
   end

functor IsSingleThreaded (S: IS_SINGLE_THREADED_ARG): IS_SINGLE_THREADED =
struct

open S

datatype t = T of {current: unit ref ref,
                   data: Data.t,
                   isSingleThreaded: bool ref,
                   me: unit ref}

local
   fun make f (T r) = f r
in
   val isSingleThreaded = ! o (make #isSingleThreaded)
end

fun make (d: Data.t): t =
   let
      val me = ref ()
   in
      T {current = ref me,
         data = d,
         isSingleThreaded = ref true,
         me = me}
   end

fun ensureCurrent (T {current, isSingleThreaded, me, ...}) =
   if me = !current then
      ()
   else
      isSingleThreaded := false

fun data (i as T {data, ...}) = (ensureCurrent i; data)

fun mutate (i as T {current, isSingleThreaded, ...}, d: Data.t) =
   let
      val () = ensureCurrent i
      val me = ref ()
      val () = current := me
   in
      T {current = current,
         data = d,
         isSingleThreaded = isSingleThreaded,
         me = me}
   end

end
--------------------------------------------------------------------------------

I'll also mention that for the data structures you want to replace,
you can leave the functional interface in place so you don't have to
change any of your client code.  Just use a mutable implementation
underneath, wrapped in a (slightly) immutable veneer, and put in
dynamic checks (similar to the version tracking above) to make sure
the data structure is used in a single-threaded way.  If it isn't,
report an error.  That way you can be sure that if a run completes,
the data structure was used correctly.

A couple thoughts on foldl and traversing the elements in order.
First, you can statically check to see if a given use needs foldl,
simply by eliminating the foldl from the signature and seeing if
things type check.  Second, the MLton library has a fast quicksort
that you might want to try.  Depending on the usage, you might be able
to get away with quicksorting the HashSet elements whenever you need
to visit them in order.