[MLton] Possible contributions to optimization

Wed Mar 5 21:28:38 PST 2008

On Wed, Mar 5, 2008 at 4:52 AM, Matthew Fluet <fluet at tti-c.org> wrote:
> On Tue, 4 Mar 2008, Kristopher Micinski wrote:
>  > Yes, I intend to buy Appel's book, and have already read the Dragon
>  > book and "Modern Compiler Design" (which also includes functional
>  > language compilation, though only a small section). I plan to read the
>  > code in MLton, but want to know if it would be worth implementing a
>  > backend for LLVM (which can then optimize even more with generated
>  > SSA), or if this is just an off the wall idea.
>
>  There was one previous proposal to use LLVM as a backend:
>    http://mlton.org/pipermail/mlton/2005-November/028263.html
>    http://mlton.org/pipermail/mlton/2005-November/028281.html
>  AFAIK, nothing ever came from the proposal.
>
>  The second link above includes some other links to archived messages:
>    (A C-- backend experiment):
>    http://mlton.org/pipermail/mlton/2005-March/026850.html
>    http://mlton.org/pipermail/mlton/2005-March/026884.html
>    (Native vs. C codegens):
>    http://mlton.org/pipermail/mlton/2005-June/027143.html
>
>  I suggest reading through those various threads to get a sense of what
>  issues arise when working at the codegen level.
>
>  Writing a backend isn't trivial: you need to combine a lot of knowledge
>  about the low-level ILs used by the MLton compiler, the invariants
>  expected to be maintained by the MLton runtime and garbage collector, etc.
>  And, one inevitably runs into mismatches between the semantics of MLton's
>  low-level ILs and the target language.  For example, SML demands raising
>  the Overflow exception when integer computations overflow; but, LLVM
>  doesn't provide instructions (or intrinsics) that detect overflow.
>  Similarly, from the above threads, you'll see that MLton allocates the ML
>  stack in the ML heap, doing almost nothing on the C stack.  This means
>  that the LLVM notion of a procedure/function, which expects to manipulate
>  the C stack, doesn't match MLton's (low-level) notion of a
>  procedure/function.
>
>  I don't mean to discourage, but it is very easy to propose large projects
>  which ultimately make little to no progress.  For anyone and any project
>  (i.e., not just Kristopher and not just MLton), it seems that a new
>  developer should tackle a relatively small contribution -- one with a high
>  chance of success.  That is a way to develop familiarity with a project,
>  gain some "ownership" (seeing the fruits of one contribution encourages
>  future contributions), and set the stage for tackling larger
>  contributions.
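
As an aside on the overflow point in the quoted message: when a target
offers no overflow-detecting instruction, a backend would presumably have
to emit an explicit range check around each integer operation. Below is a
minimal sketch in C of what such a check could look like for 32-bit
addition. The helper name checked_add_i32 is hypothetical (it is not part
of MLton or LLVM), and a false return stands in for raising SML's Overflow
exception.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Stores a + b in *out and returns true when the sum fits in int32_t.
 * Returns false, leaving *out untouched, when the sum would overflow;
 * generated code would raise Overflow at that point. */
static bool checked_add_i32(int32_t a, int32_t b, int32_t *out)
{
    /* Test the operands before adding, because signed overflow is
     * undefined behavior in C. */
    if ((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

The check costs a couple of comparisons per addition, which is the kind of
overhead a codegen without native overflow detection would have to accept
or optimize away.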

Indeed, I plan just to become more acquainted with the language and
the compilation of functional languages over the next few months before
attempting anything major. My first priority is getting access to a
machine with 1 GB of RAM, as my machine (a pretty reliable IBM
ThinkPad) shut down after reportedly reaching 100 degrees C and
swapping a few hundred MB. I also know of C--, but I don't find it as
complete as LLVM (although I may be mistaken about this). As I know
little formal functional language compilation theory (or functional
language theory), I don't expect to become a major developer on MLton
unless I stay on the project for a few years. I mostly want to
understand more about compilers and functional languages. If anyone
has other reading suggestions, or code worth reading (I am aware
I should sign up for the commit list if I am seriously
interested in reviewing every change), please mention it. In any
case, I certainly know what I'm getting myself into; I don't have
unrealistic expectations.

Thanks all,

    -- Kristopher Micinski