unfold and flattening

Stephen Weeks MLton@sourcelight.com
Wed, 18 Jul 2001 13:36:59 -0700


I pushed unfold into some of the basis library and my library.  Unfortunately,
for Vector.concat, it is currently 3X slower than the old implementation.  Here
is the old one from the basis library.

      fun concat sequences =
	 case sequences of
	    [s] => s
	  | _ => 
	       let
		  val size = List.foldl (fn (s, n) => n +? length s) 0 sequences
		  val a = array size
		  val _ =
		     List.foldl
		     (fn (s, n) =>
		      let
			 val imax = length s
			 fun loop i =
			    if i >= imax
			       then ()
			    else (Array.update (a, n + i, S.sub (s, i))
				  ; loop (i + 1))
			 val _ = loop 0
		      in
			 n + imax
		      end)
		     0 sequences
	       in
		  fromArray a
	       end

Here is the version using unfold.

      fun 'a concat (vs: 'a sequence list): 'a sequence =
	 case vs of
	    [] => fromArray (Primitive.Array.array0 ())
	  | v :: vs' => 
	       let
		  val n = List.foldl (fn (v, s) => s + length v) 0 vs
	       in
		  unfold (n, (0, v, vs'),
			  let
			     fun loop (i, v, vs) =
				if i < length v
				   then (sub (v, i), (i + 1, v, vs))
				else
				   case vs of
				      [] => raise Fail "concat bug"
				    | v :: vs => loop (0, v, vs)
			  in loop
			  end)
	       end

I like the unfold version better.  The problem is caused by a missed flattening
optimization in MLton that causes this version to allocate 5X as much as the old
one.  MLton is unable to flatten the state triple (i, v, s) because it flows
around two loops (the loop above and the one in the definition of unfold) and
thus the tuple construction is not syntactically apparent at the time of
destruction/selection.  We've seen this missed optimization before, and it's
really annoying.  I think I need to build an intraprocedural flattener that is
more aggressive than our interprocedural one.