[MLton] Monadic MLton.Vector.create with update

Stephen Weeks MLton@mlton.org
Wed, 29 Mar 2006 13:08:29 -0800


I like the monadic idea that Vesa suggested.  But I think it has some
problems.  First, to make comparison with the non-monadic
Vector.create easier, I reworked the monadic approach to make the
interface look as similar as possible to the non-monadic one.  Here's
what I came up with.

----------------------------------------------------------------------
infix >>=

signature MONAD =
   sig
      type ('a, 'b) t

      val >>= : ('a, 'b) t * ('b -> ('a, 'c) t) -> ('a, 'c) t
      val inject: ('a -> 'b) -> ('a, 'b) t
      val return : 'b -> ('a, 'b) t
      val run: ('a, 'b) t * 'a -> 'b
   end

structure Monad:> MONAD =
   struct
      datatype ('a, 'b) t = T of 'a -> 'b

      val inject = T

      fun run (T f, a) = f a

      fun return x = T (fn _ => x)

      fun m >>= f = T (fn a => run (f (run (m, a)), a))
   end
  
signature CREATE =
   sig
      type 'a z
      type ('a, 'b) m = ('a z, 'b) Monad.t
      val create:
         int * ({sub: int -> ('a, 'a) m, update: int * 'a -> ('a, unit) m}
                -> (int -> ('a, 'a) m) * ('a, unit) m)
         -> 'a vector
   end

functor Create
   (structure Primitive:
       sig
          val safe: bool
          structure Array:
             sig
                val array: int -> 'a array
                val sub: 'a array * int -> 'a
                val update: 'a array * int * 'a -> unit
             end
          structure Int:
             sig
                val geu: int * int -> bool
             end
          structure Vector:
             sig
                val fromArray: 'a array -> 'a vector
             end
       end
       structure Util:
          sig
             val naturalForeach: int * (int -> unit) -> unit
          end):> CREATE =
   struct
      type 'a z = {sub: int -> 'a, update: int * 'a -> unit}
      type ('a, 'b) m = ('a z, 'b) Monad.t

      fun create (n, f) =
          let
             fun make f a = Monad.inject (fn z: 'a z => f z a)
             val (tab, finish) = f {sub = make #sub, update = make #update}
             val a = Primitive.Array.array n
             val lim = ref 0
             fun check i =
                if Primitive.safe andalso Primitive.Int.geu (i, !lim) then
                   raise Subscript
                else
                   ()
             fun sub i = (check i; Primitive.Array.sub (a, i))
             fun update (i, x) = (check i; Primitive.Array.update (a, i, x))
             fun run m = Monad.run (m, {sub = sub, update = update})
             val () =
                Util.naturalForeach
                (n, fn i =>
                 (Primitive.Array.update (a, i, run (tab i));
                  lim := i + 1))
            val () = run finish
          in
             Primitive.Vector.fromArray a
          end
   end
----------------------------------------------------------------------

I abstracted out the Primitive stuff via a functor so this code can be
type checked in the standard basis.

This approach uses a generic Monad structure for representing
arbitrary computations mapping 'a to 'b.  I think it is nice to factor
this out so that one doesn't need any special syntax or structure just
to use Vector.create.  Here is the type of monadic create:

      val create:
         int * ({sub: int -> ('a, 'a) m, update: int * 'a -> ('a, unit) m}
                -> (int -> ('a, 'a) m) * ('a, unit) m)
         -> 'a vector

The idea is that ('a, 'b) m is the type of computations on vectors
with elements of type 'a that return a value of type 'b.  Type m is
just a special instance of the Monad type.

      type ('a, 'b) m = ('a z, 'b) Monad.t

Type z is an abstract type, and its opacity is the essence of why the
trick works.  Client code can only construct computations, and the
hiding of the z type prevents anyone outside the Create functor from
running one of these computations except inside of a Vector.create.

One minor difference between this code and Vesa's is that this code
returns both the tabulator computation and the "whacker" computation
to run after the tabulate is finish.  The same change could easily be
added to Vesa's code.

Here is the type of non-monadic create.

      val create:
         int * ({sub: int -> 'a, update: int * 'a -> unit}
                -> (int -> 'a) * (unit -> unit))
         -> 'a vector

Comparing this type to the monadic create above, we can see that the
only difference is that all of the tabulator functions now produce
computations instead of values.

The implementation of monadic create is very similar to non-monadic
create.  The only difference is the additional closure creation to
delay the application of sub and update until create wants to run the
computation.

A subtle difference between this monadic code and Vesa's is the bounds
checks on subscript and update operations.  This code (as well as the
non-monadic version) dynamically changes the limit as the tabulator
fills in more elements in the array.  Vesa's bakes in the limit to
each monadic operation, allowing manipulation only of lower-indexed
vector elements.  A baked-in limit makes perfect sense, if the entire
computation for the index is going to be done right then, as the
monadic approach guarantees.  However, it also hints at a weakness of the
monadic approach -- it doesn't allow subscripts and updates to occur
after the tabulator is finished.  This seems weaker than the
non-monadic approach.

As far as I can tell, the monadic approach treats subscripts and
updates the same.  That is, just as it disallows updates after the
vector has been created, it also disallows subscripts (in the
tabulator functions, not via Vector.sub, obviously).  This doesn't
work so well in the situation I mentioned earlier, where one wants to
create a vector of promises, where the promises can refer to other
elements in the vector.  For example, suppose we want to make fib
lazy.

   fun lazyfib n =
      Vector.create
      (n,
       fn {sub = fib, ...} =>
       (fn i => (lazy
                 (fn () =>
                  if i <= 1 then i else fib (i - 1) () + fib (i - 2) ())),
        ignore))

(suppose we have the usual val lazy: (unit -> 'a) -> unit -> 'a).

One can make lazyfib work with the monadic approach by extracting the
previous two promises and then building the closure.  But, that is
less natural.  And, more importantly, in general one may not want or
be able to pre-compute which other elements will be needed.

Even worse, one might want to refer to later elements, again with some
laziness involved.  Here's a toy example.

   fun later n =
      Vector.create
      (n,
       fn {sub, ...} =>
       (fn i => lazy (fn () => if i = n - 1 then 0 else 1 + sub (i + 1) ()),
        ignore))

With the monadic approach, one can't lift the "sub (i + 1)" outside
of the "lazy", since the next element isn't defined yet.  Yet one
can't leave it inside either, since the monad requires the entire
computation for index i to be run right then.

All in all, my current feeling is that the monadic approach imposes
too many restrictions on how create can be used.  I also have worries
about the syntactic concision and performance of the monadic approach,
but I'll table those until I understand the expressiveness better.