[MLton-user] option ref optimized to a null pointer?

Thu Nov 29 16:09:56 PST 2007

On Sat, 24 Nov 2007, Vesa Karvonen wrote:
> On Nov 24, 2007 5:15 PM, Wesley W. Terpstra
> <terpstra at dvs1.informatik.tu-darmstadt.de> wrote:
>> It seems MLton represented the type '(int * int ref) option array' as
>> I would have done it in C. [...]
>>
>> What surprised me is that it seems MLton can turn the NONE case of
>> option + pointer into a "null pointer". [...]
>
> That is what I would expect.  IOW, I believe that it is a very basic
> optimization.  Not all bit patterns correspond to valid pointers, so a
> compiler is free to use invalid patterns to represent other values.
> In particular, when pointers are guaranteed to be aligned by some
> power of two (2^2 or 2^3 in MLton), there are some bits in pointer
> values that can be used to store tag values (and the rest of the bits
> can be used to store other information).

Correct.  In MLton, any datatype that has both value-carrying variants and 
non-value-carrying variants will be represented as a pointer-sized word.  
If the low bits are 0, then the object is one of the value-carrying 
variants; if there is more than one value-carrying variant, then tags in 
the header of the carried object distinguish the variant.  If the low bit 
is 1, then the object is one of the non-value-carrying variants, and the 
higher bits are used to distinguish the variant.
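
To make the scheme concrete, here is a minimal C sketch of such an 
encoding.  The names (value_t, pair_t, make_const, and so on) are made up 
for illustration, and the layout is an assumption that simply follows the 
description above; it is not taken from MLton's runtime or code generator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding for illustration only; not MLton's actual layout. */
typedef uintptr_t value_t;            /* one pointer-sized word */

/* Stands in for the payload of SOME (i, r) : (int * int ref) option. */
typedef struct {
    int  fst;
    int *snd;
} pair_t;

/* Non-value-carrying variant number k: low bit set, k in the higher bits. */
static inline value_t make_const(unsigned k) {
    return ((value_t)k << 1) | (value_t)1;
}

/* Low bit 1 means "one of the non-value-carrying variants". */
static inline int is_const(value_t v) {
    return (v & (value_t)1) != 0;
}

/* Recover which non-value-carrying variant the word encodes. */
static inline unsigned const_index(value_t v) {
    assert(is_const(v));
    return (unsigned)(v >> 1);
}

/* Value-carrying variant: the word is the (aligned) pointer itself. */
static inline value_t make_boxed(pair_t *p) {
    /* Assumes at least 4-byte alignment, so the low two bits are 0. */
    assert(((uintptr_t)p & (uintptr_t)3) == 0);
    return (value_t)(uintptr_t)p;
}

/* Recover the pointer; only valid when the low bit is clear. */
static inline pair_t *boxed_ptr(value_t v) {
    assert(!is_const(v));
    return (pair_t *)v;
}
```

Under this sketch, NONE would be make_const(0) and SOME p would be 
make_boxed(p), so testing for NONE is a single bit test and no separate 
tag word is allocated.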

As Vesa notes, this is a very basic representation used by any reasonable 
compiler for a language with datatypes.

> Some time ago, while grepping through MLton's commit logs for
> information on some thing, I noticed the following log message, that
> relates to your question:
>
>  http://mlton.org/cgi-bin/viewsvn.cgi?rev=3005&view=rev
>
> It would seem to me that MLton has an expressive framework for
> representation optimizations.

Yes, the low-level compiler ILs offer lots of opportunities to customize 
the representation.

> So, I think that it would probably be
> more productive to approach the problem from the other direction.  If
> you run into cases where MLton is unable to optimize the
> representation as you'd expect, you should report it and maybe MLton
> can be improved to handle the case better.

I'm also wary of making "promises" about how MLton will represent objects. 
A lot can happen to a program during compilation, so it is hard to 
guarantee that types won't be split/flattened/etc. in ways that 
invalidate the promised representation.  This applies especially to 
flattening ref cells into data structures: even if the source program 
doesn't explicitly manipulate the ref, some intervening pass may transform 
the program so that it does, and the ref can then no longer be flattened.