[MLton] POPL review

Matthew Fluet fluet@cs.cornell.edu
Thu, 19 Jan 2006 10:03:54 -0500 (EST)


I thought I would pass along a few impressions from POPL that would be of 
interest to the MLton list.

* Invited Talk: The Scala Experiment -- Can We Provide Better Language 
Support for Component Systems? (M. Odersky)
http://www.cs.princeton.edu/~dpw/popl/06/odersky.pdf

The Scala language (http://scala.epfl.ch) is one of these multi-paradigm 
languages that attempts to combine OO with functional, with a dose of 
native XML support.  But, Martin's real focus in the invited talk was 
dealing with component systems -- how do you build complex software 
systems that are robust in the face of changing out one componet for 
another.  While there is a fairly expressive type system living in the 
language, I'm not entirely sold on whether or not this leads to better 
software engineering.


* Formal Certification of a Compiler Back-end or: Programming a Compiler 
with a Proof Assistant (X. Leroy)

Xavier's herculean accomplishment of getting a formal certification for 
the compilation of Cminor (a stripped down C) to PowerPC assembly.  What I 
find most compelling about this work is that it _doesn't_ follow the 
"proof by construction" paradigm that lies at the heart of constructive 
type theories, such as supports the Coq proof assistant.  Rather than 
proving \forall\exists properties and extracting functions, Xavier simply 
wrote down the functions he was interested in and proved they had the 
required properties.  And, where that proved (pardon the pun) too 
difficult, he resorted to proving, for example, that a _validation 
function_ could assert that a given register allocation did indeed color 
the interference graph correctly, without proving that the George-Appel 
register allocator itself was correct.  This gives you more conditional 
theorems (if the compiler produces assembly, then it is semantically 
equivalent to the source --- but no guarantees that the compiler actualy 
succeeds in producing assembly).


* Environment Analysis via \Delta-CFA (M. Might, O. Shivers)

As I said in my message to Ben Chambers, this was one of the smoothest 
talks I saw.  The Super-Beta inlining optimization requires that the 
call-site have the same dynamic environment as the closure to be inlined. 
This environment analysis is fundamentally different than the 
control-flow analysis of k-CFA.  An open question for me is whether this 
kind of analysis might be useful for guiding better closure 
representations in MLton.


* Invited Talk: The Next Mainstream Programming Language: A Game 
Developer's Perspective (T. Sweeney)
http://www.cs.princeton.edu/~dpw/popl/06/Tim-POPL.ppt

I think the big take away message is that academia doesn't give industry 
enough credit.  In particular, programmers are not blind to the 
complexities and predisposition to bugs inherent in C++ programming. 
Also, programmers may be a lot more willing to swallow a radical change in 
programming language if it really improves productivity.


* A Verifiable SSA Program Representation for Aggressive Compiler 
Optimization (V.S. Menon, N. Glew, B.R. Murphy, A. McCeight, T. Shpeisman, 
A.-R. Adl-Tabatabai, L. Petersen)

This is in the context of the STARJit compiler.  The big goal is to 
optimize the redundant checks that are left by naively translating Java 
bytecode into a more explicit representation on the way to native code. 
They adopt an SSA representation that is very much like MLton's IL (except 
they continue to use \phi-functions and all the complexities that 
entails); they add on proof terms and proof types that validate particular 
unsafe acceses (like array dereference).  The big insight is that by 
reifying proof terms as SSA variables, then data-dependence ensures that 
all the supporting checks remain in one form or another.  The other 
advantage is that most optimization passes need not know anything specific 
about proof terms -- they treat them just like SSA variables.


* Staged Alocation: A Compositional Technique for Specifying and 
Implementing Procedure Calling Conventions (R. Olinsky, C. Lindig, N. 
Ramsey)

Implementing calling conventions is hard and error-prone.  This work gives 
a simple declarative description of calling conventions that can be broken 
down in to simple primitives, easily implemented and composed in a 
compiler.  An interesting point is that rather than finding the 
most-expressive language for describing calling conventions, they focus on 
primitives that give rise to sensible calling conventions.

===========================================================================
===========================================================================

That was POPL formally, but, of course, there were a lot of interesting 
discussions outside the technical programme.

Martin Elsman told me that the MLKit is gearing up for a new release with 
both full Standard Basis Library support and ML Basis file support.  He 
also told me that he's been able to bootstrap the MLKit with MLton, seeing 
a decent speed up of the resulting compiler.  Unfortunately, the MLton 
compiled version of the MLKit weighs in at 100MB (75MB stripped), which is 
quite a blowup.  I'm hoping we can investigate.

John Reppy and Aaron Turon told me a bit about what is going on at 
UChicago.  They are preparing to release an ML-Lex replacement, based on 
Brzozowski's notion of regular expression derivatives.  They hope to go on 
to develop an ML-Yacc replacement (probably along the lines of ANTLR). 
They have also been working on SML support in Eclipse.  This is part of an 
SML infrastructure grant, so it ought to yield some nice artifacts for the 
community at large.

Umut Acar and Matthew Hammer from TTI-Chicago gave a 5-min madness talk at 
SPACE, where they wondered how to get better performance out of their 
self-adjusting computation infrastructure 
(http://ttic.uchicago.edu/~umut/sting/).  Their fundamental problem is 
that their library allocates data contrary to the generational hypothesis.
I suggested that it might be possible to expose hooks like:

   val MLton.GC.allocInOldGen : (unit -> 'a) -> 'a
   val MLton.GC.allocInNursery : (unit -> 'a) -> 'a

that would cause allocations during the run of the thunk to happen in the 
appropriate area of the heap.