[MLton-devel] common argument optimization

Mon, 31 Mar 2003 16:50:54 -0500 (EST)

> > I'm now having second thoughts about placing commonArg before
> > flatten and localFlatten3.  I previously didn't put much thought
> > into it, but it seems to me that we could have cases where some
> > components of a tuple used as an argument are common, but the whole
> > tuple isn't.
>
> I'm confused.  Why would commonArg hurt in this case?

It's not that commonArg would hurt; it's just that flatten and
localFlatten3 might expose some common arguments.  E.g.,

L1 () = [... Z1 = ?  A = (Y,Z1) ...] L3(A)
L2 () = [... Z2 = ?  B = (Y,Z2) ...] L3(B)
L3 (X) = ...

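Running localFlatten will replace the tuple argument X with its two
components, which exposes the fact that Y is a common argument to L3
(both L1 and L2 pass Y as the first component).  So, putting commonArg
after the flatten optimizations catches more instances of the
optimization.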
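
To make the example concrete, here is a small Python sketch of the
idea.  It is only a toy model under my own assumptions: every name in
it (common_positions, flatten, the dictionaries) is made up for
illustration and is not MLton code, and it only compares arguments
position by position, which is simpler than what the real pass does.

```python
# Toy model of the L1/L2/L3 example: each jump to L3 is a list of
# argument variables.  All names are illustrative, not MLton internals.

def common_positions(calls):
    """Return {position: variable} for the argument positions where
    every call site passes the same variable."""
    if not calls:
        return {}
    common = {}
    for i, var in enumerate(calls[0]):
        if all(len(c) > i and c[i] == var for c in calls):
            common[i] = var
    return common

def flatten(calls, tuples):
    """Replace each tuple-valued argument by its components."""
    flat = []
    for call in calls:
        args = []
        for var in call:
            # Non-tuple variables are passed through unchanged.
            args.extend(tuples.get(var, [var]))
        flat.append(args)
    return flat

# Tuple definitions from the example: A = (Y, Z1), B = (Y, Z2).
tuples = {"A": ["Y", "Z1"], "B": ["Y", "Z2"]}
calls_to_L3 = [["A"], ["B"]]  # L1 jumps with A, L2 jumps with B

before = common_positions(calls_to_L3)
after = common_positions(flatten(calls_to_L3, tuples))
print(before)  # {}
print(after)   # {0: 'Y'}
```

Before flattening, A and B are different variables, so nothing looks
common; after flattening, position 0 is Y at both call sites.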

> > Here are the benchmark results.  I believe that the horrid compile
> > time for hamlet is due to the dominator computation on the graph G.
>
> Wow.  I'm pretty surprised even with the big graph.  After all,
> dominators is basically linear.  Maybe the -diag hurt.

Maybe, but I'm pretty sure it was the size of the graph.  Remember, the
naive version was creating a graph with a node for each variable in the
SSA function; that's much bigger than the dominator graph of the basic
blocks.  Also, I wasn't thinning repeated edges.  Anyway, the pass still
takes close to 5 seconds on a self-compile, which is relatively high
compared to many of the other passes.
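
By "thinning repeated edges" I just mean dropping duplicate (source,
target) pairs before handing the graph to the dominator computation.
A minimal Python sketch, assuming the graph is kept as a plain edge
list; the edge list and the thin_edges helper below are hypothetical
and do not reflect how the pass actually represents G:

```python
def thin_edges(edges):
    """Drop repeated (src, dst) pairs, keeping first-seen order."""
    # dict.fromkeys keeps one entry per distinct key, in insertion order.
    return list(dict.fromkeys(edges))

# Hypothetical edges between variable nodes; repeats show up when the
# same variable is passed to the same formal from several call sites.
edges = [("Y", "X"), ("Y", "X"), ("Z1", "X"), ("Y", "X")]
thinned = thin_edges(edges)
print(len(edges), len(thinned))  # 4 2
```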

> > ** Dominator **
> >
> > MLton0 -- mlton -drop-pass commonArg1 -drop-pass commonArg2
> > MLton1 -- mlton -drop-pass commonArg1
> > MLton2 -- mlton -drop-pass commonArg2
> ...
> > run time ratio
> > benchmark         MLton1 MLton2
> ...
> > simple              1.00   1.00
> ...
> > wc-input1           1.00   1.00
>
> I am confused.  Where did the speedup in simple and wc-input1 go?
> Shouldn't I see ratios <1 for both MLton1 and MLton2 here?

The speedup in wc-input1 went away because I wanted to verify that
commonArg wasn't helping the old IO, so I had switched back to the other
implementation for these benchmarks.

I don't know what happened to simple.  I hadn't noticed that
inconsistency.  There was the C calling SML checkin between the two
benchmarks.  That shouldn't have had much of an effect.
