[MLton] Vector.fromList[] performance

Stephen Weeks sweeks at sweeks.com
Mon Oct 30 15:37:05 PST 2006


> 1. Part of the deal is that mlton thinks it needs to heap-allocate the
> list and vector.  Ideally, mlton would not: after inlining
> Vector.fromList, it can discover from the SSA (or, really, any basic
> block form) that there's no pointer to the vector or list that is ever
> leaked from the "vectors" function
...
> Given the SSA representation already computed, it seems that this
> should be a fairly self-contained project: just adding another
> compiler pass that comes after all other analysis-heavy passes.

Yes.  That's the conclusion we came to.  It's not a trivial pass, and
it's not quite like anything that MLton currently does.  But it should
be doable.

> It's too bad I already did my compilers class, but I might be able
> to pitch it to some of the younger PL students next semester.

Please do.  We're open to help.

> 2. It'd be great if mlton could entirely eliminate the construction of
> the vector and list here.  In my actual code, where I create
> vectors/lists is often when two parts of the code that store
> everything as tuples communicate via a signature that specifies the
> use of a vector or list.

This code might benefit from the sum-type approach that I proposed
then, since that will turn small vectors into tuples, which MLton will
optimize away when it can.

>   fun inSimplex (points, p) =
>     case length points of
>       3 => (* break up the list into 6 reals and call
> ExactPredicates.orient2 thrice *)
>     | _ => raise WrongDimension
...
> Theoretically it should be possible to inline Geo2d.inSimplex into
> Triangulation.inSimplex and thus eliminate the conversion to a list,
> along with the case statement. 

MLton won't get this code as written because of the call to the
recursive "length" function, which ends up as a loop.   MLton's
optimizer doesn't have the smarts to completely unroll terminating
loops.  On the other hand, if you write inSimplex as

  fun inSimplex (points, p) =
     case points of
        [p1, p2, p3] => ...
      | _ => raise WrongDimension

then it's quite possible that MLton will get it, because the code
generated by the match-compiler for "[p1, p2, p3]" will get simplified
away when supplied with a constant list.

And, if you use the sum-type approach, you could have special
destructors for small vectors, so you could write

  fun inSimplex (points, p) =
     case Vector.dest3 points of
        (p1, p2, p3) => ...
      | _ => raise WrongDimension

MLton has a good chance at getting this too.

> On the other hand, this is evidence of this hard project being
> worthwhile.

Agreed.



More information about the MLton mailing list