tabulators

Stephen Weeks MLton@sourcelight.com
Wed, 23 Jan 2002 15:20:29 -0800


A few days back, Henry wrote the following

> If I understand Stephen's mail correctly, I really REALLY don't like
> the idea of tabulator things that don't pass around a state which
> requires one to use a ref.  The general rule, in my mind, is that if
> some function doesn't get the stack, then it gets a state which is
> passed to it and which it returns (updated).

I've spent some (far too much) time over the last few days thinking
about tabulators, and here are my conclusions.  First, here are seven
different signatures that one might assign to a tabulator function.
S1 is what we currently have.

--------------------------------------------------------------------------------
signature S1 =
   sig
      type 'a t
      val tabulator: int * (('a -> unit) -> unit) -> 'a t
   end

signature S2 =
   sig
      type 'a t
      type 'a state
      val tabulator: int * ({next: 'a * 'a state -> 'a state,
			     start: 'a state} -> 'a state) -> 'a t
   end

signature S3 =
   sig
      type 'a t
      datatype 'a tab = Tab of 'a -> 'a tab
      val tabulator: int * ('a tab -> unit) -> 'a t
   end

signature S4 =
   sig
      type 'a t
      val tabulator: int -> {done: unit -> 'a t,
			     next: 'a -> unit}
   end

signature S5 =
   sig
      type 'a t
      type 'a state
      val tabulator: int -> {done: 'a state -> 'a t,
			     next: 'a * 'a state -> 'a state,
			     start: 'a state}
   end

signature S6 =
   sig
      type 'a t
      datatype 'a tab = Tab of {done: unit -> 'a t,
				next: 'a -> 'a tab}
      val tabulator: int -> 'a tab
   end

signature S7 =
   sig
      type 'a t
      datatype 'a tab = Done of 'a t | More of 'a -> 'a tab
      val tabulator: int -> 'a tab
   end
--------------------------------------------------------------------------------

Next, here are some functors that show how to convert between various
implementations of tabulators.

--------------------------------------------------------------------------------
fun tooMany () = raise Fail "too many"
fun tooFew () = raise Fail "too few"

functor C1to3 (S: S1): S3 =
   struct
      type 'a t = 'a S.t

      datatype 'a tab = Tab of 'a -> 'a tab

      fun 'a tabulator (n, f) =
	 S.tabulator
	 (n, fn next =>
	  let
	     fun t (): 'a tab =
		Tab (fn a => (next a; t ()))
	  in
	     f (t ())
	  end)
   end

functor C2to1 (S: S2): S1 =
   struct
      type 'a t = 'a S.t

      fun tabulator (n, f) =
	 S.tabulator
	 (n, fn {next, start} =>
	  let
	     val r = ref start
	     val _ = f (fn a => r := next (a, !r))
	  in
	     !r
	  end)
   end

functor C2to3 (S: S2): S3 =
   struct
      type 'a t = 'a S.t
	 
      datatype 'a tab = Tab of 'a -> 'a tab

      fun 'a tabulator (n, f) =
	 S.tabulator
	 (n, fn {next, start} =>
	  let
	     val r = ref start
	     fun conv (s: 'a S.state): 'a tab =
		(r := s
		 ; Tab (fn a => conv (next (a, s))))
	     val _ = f (conv start)
	  in
	     !r
	  end)
   end

functor C3to2 (S: S3): S2 =
   struct
      type 'a t = 'a S.t

      type 'a state = 'a S.tab

      fun tabulator (n, f) =
	 S.tabulator
	 (n, fn t =>
	  (f {next = fn (a, S.Tab g) => g a,
	      start = t}
	   ; ()))
   end

functor C4to1 (S: S4): S1 =
   struct
      type 'a t = 'a S.t

      fun tabulator (n, f) =
	 let
	    val {done, next} = S.tabulator n
	    val _ = f next
	 in
	    done ()
	 end
   end

functor C4to5 (S: S4): S5 =
   struct
      type 'a t = 'a S.t

      type 'a state = unit
	 
      fun tabulator n =
	 let
	    val {done, next} = S.tabulator n
	 in
	    {done = done,
	     next = fn (a, ()) => next a,
	     start = ()}
	 end
   end

functor C5to2 (S: S5): S2 =
   struct
      type 'a t = 'a S.t

      type 'a state = 'a S.state

      fun tabulator (n, f) =
	 let
	    val {done, next, start} = S.tabulator n
	 in
	    done (f {next = next, start = start})
	 end
   end

functor C5to4 (S: S5): S4 =
   struct
      type 'a t = 'a S.t

      fun tabulator n =
	 let
	    val {done, next, start} = S.tabulator n
	    val r = ref start
	 in
	    {done = fn () => done (!r),
	     next = fn a => r := next (a, !r)}
	 end
   end

functor C5to6 (S: S5): S6 =
   struct
      type 'a t = 'a S.t

      datatype 'a tab = Tab of {done: unit -> 'a t,
				next: 'a -> 'a tab}

      fun tabulator n =
	 let
	    val {done, next, start} = S.tabulator n
	    fun loop s =
	       Tab {done = fn () => done s,
		    next = fn a => loop (next (a, s))}
	 in
	    loop start
	 end
   end

functor C6to3 (S: S6): S3 =
   struct
      type 'a t = 'a S.t

      datatype 'a tab = Tab of 'a -> 'a tab

      fun tabulator (n, f) =
	 let
	    val r = ref (fn _ => raise Fail "C6to3 bug")
	    fun convert (S.Tab {done, next}) =
	       (r := done
		; Tab (convert o next))
	    val _ = f (convert (S.tabulator n))
	 in
	    !r ()
	 end
   end

functor C6to5 (S: S6): S5 =
   struct
      type 'a t = 'a S.t
      type 'a state = 'a S.tab

      fun tabulator n =
	 {done = fn S.Tab {done, ...} => done (),
	  next = fn (a, S.Tab {next, ...}) => next a,
	  start = S.tabulator n}
   end

functor C7to6 (S: S7): S6 =
   struct
      type 'a t = 'a S.t
      datatype 'a tab = Tab of {done: unit -> 'a t,
				next: 'a -> 'a tab}

      fun tabulator n =
	 let
	    fun conv t =
	       case t of
		  S.Done v => Tab {done = fn () => v,
				   next = fn _ => tooMany ()}
		| S.More f => Tab {done = fn () => tooFew (),
				   next = fn a => conv (f a)}
	 in
	    conv (S.tabulator n)
	 end
   end
--------------------------------------------------------------------------------

Look at the relative power of each of the signatures, where Si is more
powerful than Sj if there is a sequence of functor applications that
transforms Si to Sj.  From this, we see that

S7 is more powerful than
S4, S5, S6, which are equivalent, and are more powerful than
S1, S2, S3, which are equivalent.

I can also argue that the relationships above are strict.  Notice that
S7 can not raise any exception, while S1-S6 can, hence it is more
powerful.  Also notice that S1, S2, and S3 retain some control of the
stack, enough to build the final vector, while S4, S5, and S6 do not
and are hence more powerful.

Because S7 can not raise any exceptions and is more powerful than all
the others, it is, I believe the right interface to use for
Vector.tabulator.  It also has the nice property that it can be
implemented without any side-effects, while hiding the type of its
internal state inside the closure.

Once we have S7, all the rest are easily implemented using the above
functors.  The question is, which others should be exported.  Among
S4, S5, and S6, both S5 and S6 can be implemented without state.  As
the functor conversions show, S5 and S6 are essentially the same, with
the only difference being that S6 hides the state type inside the
closure.  Because of that, I prefer S6.

Among S1, S2, and S3, which are respectively analogues of S4, S5, and
S6, only S2 can be implemented without state.  Unfortunately, the
combination of hiding the state type and wrapping the vector return
requires S3 to use state.  To me, that makes S3 aesthetically
unappealing.  I like S2 because it can be implemented without state,
and I like S1 because it is convenient.

So, I propose to provide S7, S6, S2, and S1.  Now, what names shall I
give them?  Here are my thoughts.

S7	tabulator  
S6	tabulatorDN
S2	tabulatorFold
S1	tabulatorSetEach

Comments on any of the above appreciated.