[MLton-user] option ref optimized to a null pointer?

Matthew Fluet fluet at tti-c.org
Thu Nov 29 16:09:56 PST 2007


On Sat, 24 Nov 2007, Vesa Karvonen wrote:
> On Nov 24, 2007 5:15 PM, Wesley W. Terpstra
> <terpstra at dvs1.informatik.tu-darmstadt.de> wrote:
>> It seems MLton represented the type '(int * int ref) option array' as
>> I would have done it in C. [...]
>>
>> What surprised me is that it seems MLton can turn the NONE case of
>> option + pointer into a "null pointer". [...]
>
> That is what I would expect.  IOW, I believe that it is a very basic
> optimization.  Not all bit patterns correspond to valid pointers, so a
> compiler is free to use invalid patterns to represent other values.
> In particular, when pointers are guaranteed to be aligned by some
> power of two (2^2 or 2^3 in MLton), there are some bits in pointer
> values that can be used to store tag values (and the rest of the bits
> can be used to store other information).

Correct.  In MLton, any datatype that has both value-carrying variants and
non-value-carrying variants will be represented as a pointer-sized word.
If the low bits are 0, then the word is an (aligned) pointer to one of the
value-carrying variants; if there is more than one value-carrying variant,
then tags in the header of the carried object distinguish the variant.  If
the low bit is 1, then the word is one of the non-value-carrying variants,
and the higher bits are used to distinguish the variant.
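
As a rough illustration only (this is not MLton's runtime code, and the
constants MLton actually picks may differ), the scheme can be sketched in C
for something like '(int * int ref) option'.  The names opt_word, pair,
nullary, opt_some, etc. are all made up for the example, and the ref is
assumed to have been flattened into the pair:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one pointer-sized word holds the whole datatype. */
typedef uintptr_t opt_word;

/* Assumed layout of the carried object; the ref cell is stored inline. */
typedef struct {
    int fst;
    int snd;
} pair;

/* Non-value-carrying variant number k: low bit 1, index in the higher bits. */
static opt_word nullary(unsigned k) {
    return ((opt_word)k << 1) | 1u;
}

/* Recover the variant index from a non-value-carrying word. */
static unsigned nullary_index(opt_word w) {
    return (unsigned)(w >> 1);
}

/* NONE is the only non-value-carrying variant here, so it gets index 0. */
#define OPT_NONE ((opt_word)1)

/* Value-carrying variant: the word is the pointer itself.  Alignment
   guarantees that its low bit is 0, so it cannot collide with OPT_NONE. */
static opt_word opt_some(pair *p) {
    assert(((uintptr_t)p & 1u) == 0);
    return (opt_word)p;
}

/* Testing the low bit is enough to tell the two kinds of variant apart. */
static int opt_is_none(opt_word w) {
    return (w & 1u) != 0;
}

/* Only valid on a value-carrying word. */
static pair *opt_get(opt_word w) {
    assert(!opt_is_none(w));
    return (pair *)w;
}
```

An array of such words then needs no extra tag field per element: each
slot is either an aligned pointer or a small odd constant.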

As Vesa notes, this is a very basic representation used by any reasonable 
compiler for a language with datatypes.

> Some time ago, while grepping through MLton's commit logs for
> information on something, I noticed the following log message, which
> relates to your question:
>
>  http://mlton.org/cgi-bin/viewsvn.cgi?rev=3005&view=rev
>
> It would seem to me that MLton has an expressive framework for
> representation optimizations.

Yes, the low-level compiler ILs offer lots of opportunities to customize 
the representation.

> So, I think that it would probably be
> more productive to approach the problem from the other direction.  If
> you run into cases where MLton is unable to optimize the
> representation as you'd expect, you should report it and maybe MLton
> can be improved to handle the case better.

I'm also wary of making "promises" about how MLton will represent objects. 
A lot can happen to a program during compilation, so it is hard to 
guarantee that types won't be split/flattened/etc. in ways that 
invalidate the promised representation.  This is especially true of 
flattening ref cells into data structures: even if the source program 
doesn't explicitly manipulate the ref, some intervening pass may transform 
the program so that it does, and then the ref can no longer be flattened.
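
To make the flattening concern concrete, here is a small C sketch (again
purely illustrative; pair_boxed, pair_flat and the two functions are
invented names, not anything from MLton).  Once the ref is used as a value
in its own right, for example shared between two tuples, storing it inline
would change the meaning of the program:

```c
#include <assert.h>

/* Unflattened: the ref is a separate cell that the pair points to. */
typedef struct { int fst; int *snd; } pair_boxed;

/* Flattened: the mutable cell is stored inline in the pair. */
typedef struct { int fst; int snd; } pair_flat;

/* Two boxed pairs share one cell, so a write through one is seen
   through the other.  Returns 1 when the write is visible. */
static int boxed_write_is_shared(void) {
    int cell = 0;
    pair_boxed a = {1, &cell};
    pair_boxed b = {2, &cell};
    *a.snd = 42;
    return *b.snd == 42;
}

/* With inline storage each pair has its own copy, so the sharing is
   lost.  Returns 1 when the write is visible, 0 otherwise. */
static int flat_write_is_shared(void) {
    pair_flat a = {1, 0};
    pair_flat b = {2, 0};
    a.snd = 42;
    return b.snd == 42;
}
```

So the flat layout is only a faithful representation when the compiler can
show that the ref is never observed apart from its enclosing object.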


