[MLton-devel] common argument optimization

Stephen Weeks MLton@mlton.org
Mon, 31 Mar 2003 12:35:47 -0800


Your description of the three common-argument optimzations and the
analogy with contification makes sense to me.

> I must admit, I was rather suprised at this progression and final
> result.  At the outset, I never would have thought of a connection
> between contification and common argument optimizations.  They would
> seem to be two completely different optimizations.  Although, this may
> not really be the case.  As one of the reviewers of the ICFP paper
> said:
...
>   I believe that making all the continuation arguments explicit will
>   show that the optimization can be generalized to eliminating
>   constant arguments, whether continuations or not.
> 
> What I think the common argument optimization shows is that the
> dominator analysis does slightly better than the reviewer puts it: we
> find more than just constant continuations, we find common
> continuations.

Right.  I remember thinking about this when we got the reviews.  But I
don't think we ended up putting anything in the paper about it.  I
think of the lattice (cont) based approach as more akin to a
traditional analysis to detect constants.

> I'm suspecting that the "real story" is that the dominator analysis
> is a solution to the common argument optimization, and that the
> contification optimization is specializing common argument to the
> case of continuation arguments (with a different transformation at
> the end).

I agree.  These look like basically the same problem and solution to
me.

> (Note, a whole-program, inter-procedural common argument analysis
> doesn't really make sense (in our SSA IL), because the only way of
> passing values between SSA functions is as arguments.

Minor quibble.  The analysis makes sense.  It's just not clear of a
useful transformation based on the analysis.

> Anyways, it's still unclear to me whether or not the dominator based
> approach solves other kinds of problems.  Since Tom Murphy mentioned
> PRE last week and I thought again about my previous investigations
> encountering the problems I mentioned in my reply and I'm struck by a
> potential similarity
...

I'm afraid I don't have any useful thoughts here.  I agree that the
dominator stuff feels like it might be useful in other situations
where an analysis is too syntactic.  I don't know enough about PRE
opts to know if it could help.

> I'm now having second thoughts about placing commonArg before
> flatten and localFlatten3.  I previously didn't put much thought
> into it, but it seems to me that we could have cases where some
> components of a tuple used as an argument are common, but the whole
> tuple isn't.

I'm confused.  Why would commonArg hurt in this case?
 
> Here are the benchmark results.  I believe that the horrid compile
> time for hamlet is due to the dominator computation on the graph G.

Wow.  I'm pretty surprised even with the big graph.  After all,
dominators is basically linear.  Maybe the -diag hurt?

> Finally, notice that wc-input1 got a decent improvement, which is
> encouraging for me because the compiler here is using the new IO,
> while the old IO's hot loop didn't exhibit opportunities for common
> argument elimination.

Cool.  I also see the decent improvement in simple.  Overall, not a
stunning optimization, but certainly worthwhile.

> The benchmarks with the (improved) dominator analysis look like the
> following.  Note that the hamlet benchmark has no slowdown in compile time
> (in fact, simplifications opened up by commonArg resulted in a faster
> (backend and codegen presumably) compile time).  In most cases, using
> commonArg1 or commonArg2 result in identical code, but occassionally
> (notably hamlet, knuth-bendix, and raytrace), using commonArg2 is better
> than commonArg1.
...
> ** Dominator **
> 
> MLton0 -- mlton -drop-pass commonArg1 -drop-pass commonArg2
> MLton1 -- mlton -drop-pass commonArg1
> MLton2 -- mlton -drop-pass commonArg2
...
> run time ratio
> benchmark         MLton1 MLton2
...
> simple              1.00   1.00
...
> wc-input1           1.00   1.00

I am confused.  Where did the speedup in simple and wc-input1 go?
Shouldn't I see ratios <1 for both MLton1 and MLton2 here?


-------------------------------------------------------
This SF.net email is sponsored by: ValueWeb: 
Dedicated Hosting for just $79/mo with 500 GB of bandwidth! 
No other company gives more support or power for your dedicated server
http://click.atdmt.com/AFF/go/sdnxxaff00300020aff/direct/01/
_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel