tabulators

Matthew Fluet Matthew Fluet <fluet@cs.cornell.edu>
Tue, 5 Feb 2002 16:14:51 -0500 (EST)


> > As a side comment, while we could certainly add mapTo2 to the Vector
> > structure as things exist now, I actually think that from a design
> > stand-point it doesn't belong there.  A quick look at the signature
> > suggests that just about everything there can be done with exactly one
> > pass over the input vector and directly accumulate the result (as in a
> > fold) or simultaneously produce the result vector (using unfoldi).  So,
> > adding something like mapTo2 gives the "illusion" that it is more
> > efficient than it really is.
> 
> I don't understand this reasoning.  What is map2 other than a mild
> generalization of unzip, which is in the signature already?
> 
> fun unzip v = map2 (v, fn z => z)

Agreed that mapTo2 isn't much different than unzip; we currently have a
map2 with signature 'a t * 'b t * ('a * 'b -> 'c) -> 'c t.

And, also agreed, that in my cursory look over the VECTOR signature, unzip
didn't stand out to me, but it does require two passes over the input
vector.

> This would be better than what we currently have, since it would make
> only one pass over the input vector.  map2 also makes one pass over
> each output vector, which doesn't seem illusory to me.

Just to be a little more clear, I'm talking about:

val map : 'a t * ('a -> 'b) -> 'b t
val map2 : 'a t * 'b t * ('a * 'b -> 'c) -> 'c t
val mapTo2 : 'a t * ('a -> 'b * 'c) -> 'b t * 'c t
val map2To2 : 'a t * 'b t * ('a * 'b -> 'c * 'd) -> 'c t * 'd t

which are just instances of the general:
  val map{N}To{M} : ('a1 t * ... * 'a{N} t * 'a -> ('b1 * ... * 'b{M})) ->
                    ('bt t * ... * 'b{M} t)

My comment is only that there is an "asymmetry" in the running times of
these functions when implemented using unfoldi but not when using
construct.  An unfoldi version of map{N}To{M} with M > 1 on
vectors of length n requires one extra pass over a vector of length n
because of the intermediate vector.  (Also, there is an asymmetry in their
implementations.  map2To2 implemented with iterate and construct looks
almost like map, map2, and mapTo2, whereas the diffs between these
functions when implemented with unfoldi is a little larger.)

The other comment is that we have expectations of running times that are
"influenced" by library design.

For example, I was previously being dense about implementing Vector.rev
and bouncing through a list:

> > I don't think so; there is no Vector.rev in the basis library, so you'll
> > end up needing to do a fold over the vector into a list and then
> > Vector.fromList that.  (It's a different story with access to
> > MLton.Vector.unfoldi.)
>
> fun rev (v: 'a vector): 'a vector =
>    let
>       open Vector
>       val n = length v
>    in
>       tabulate (n, fn i => sub (v, n - 1 - i))
>    end

And, Steve is right that this is more efficient, but it relies on his
knowledge of how the MLton instantiation of the Basis Library implements
Vector.tabulate.  Nothing prevents any SML implementation from only
providing compiler support for Vector.fromList and every other vector
construction needs to go through that.  But, the fact that tabulate is in
the Basis Library, we're sort of led to believe that it's a primitive and
therefore has an efficient implementation.

All of the above IMHO.