[MLton] MLton.Vector.create

Stephen Weeks MLton@mlton.org
Tue, 28 Mar 2006 12:54:41 -0800


> Re tabulate vs. a single initial value, I agree that I almost always
> tend to use the former, counting on MLton being able to optimize
> away the overhead in the constant case.

Certainly that will happen.  In fact, Array.array is implemented using
Array.tabulate.

> On the other hand, having tabulate have access to sub and
> update makes me nervous for some reason.
>
> I think that the big problem is that what is legal and what isn't
> becomes MUCH less syntactically clear (although I agree that it is
> still well defined).  Before, I just couldn't stash away the update
> function.  That is usually trivial to see syntactically.  Now I have
> to think about what indices I am allowed to use.

I agree.  But I think the extra power is useful in a number of
situations, including the dynamic programming ones you are thinking
of.  I especilally like the fact that giving the tabulate function
access to sub allows for purely functional constructions.  E.g.,
here's the dynamic-programming implementation of fibonacci numbers,
without using any (explicit) mutation.

local
   open MLton.Vector
in
   fun fib n =
      Vector.create (n, fn {sub = fib, ...} => fn i =>
                     if i <= 1 then i else fib (i - 1) + fib (i - 2))
end

> Another question: why is the tabulator curried?  I guess one answer
> would be that just like the final folder wants a hook to be called
> last, the curried tabulator gives you a hook for something to be
> called first (and establish a scope as well).

Yeah, but that argument doesn't really fly, since one can always do
what one needs before calling create.  I think I was trying to capture
the fact that the tabulator function is applied to the sub and update
function once, and the result is then applied to each index in the
array.  But I don't see how that could be useful.  I think the
currying should go.

> if I want to do what the old one did, I now have to test in the
> second argument for getting the last value as the curried int arg
> (since that is my only chance to whack away with a fully initialized
> array/vector).

Yeah, and it's even more annoying because one isn't allowed to
sub/update the last element.

> I think that even if you want to give the power to access the sub
> and update functions in the tabulator, you still want to have a
> separate functional argument, of type
>     { sub: int -> 'a, update: int * 'a -> unit } -> unit
> which gets applied just before returning the now-converted-to-vector
> result. 

I agree.  How about the following?

      val create:
         int
         * (int * {sub: int -> 'a, update: int * 'a -> unit} -> 'a)
         * ({sub: int -> 'a, update: int * 'a -> unit} -> unit)
         -> 'a vector

      fun create (n, f, g) =
         let
            val a = Primitive.Array.array n
            val subLim = ref 0
            fun sub i =
               if Primitive.safe andalso Primitive.Int.geu (i, !subLim) then
                  raise Subscript
               else
                  Primitive.Array.sub (a, i)
            val updateLim = ref 0
            fun update (i, x) =
               if Primitive.safe andalso Primitive.Int.geu (i, !updateLim) then
                  raise Subscript
               else
                  Primitive.Array.update (a, i, x)
            val su = {sub = sub, update = update}
            val () =
               Util.naturalForeach
               (n, fn i =>
                (Primitive.Array.update (a, i, f (i, su));
                 subLim := i + 1;
                 updateLim := i + 1))
            val () = g su
            val () = updateLim := 0
         in
            fromArray a
         end

Here's a different interface that only passes the sub and update
functions once, yielding the tabulator function and the finished
hook.

      val create:
         int
         * ({sub: int -> 'a, update: int * 'a -> unit}
            -> (int -> 'a) * (unit -> unit))
         -> 'a vector

But I think the other one is preferable since it makes the dependence
of the tabulator and finisher on sub and update syntactically clearer.