[MLton-user] FFI, MacOS PPC, and 16-bit ints

Vesa Karvonen vesa.a.j.k at gmail.com
Tue Nov 27 13:37:55 PST 2007


On Nov 27, 2007 10:02 PM, Matthew Fluet <fluet at tti-c.org> wrote:
[...]
> It is a bug in MLton, failing to account for a big-endian target.  All
> objects in the heap are padded for alignment purposes, so an Int16.int ref
> is represented as a pointer to a 32-bit heap object, 16 bits of which
> correspond to the Int16.int and 16 bits of which are (ignored) junk.
> MLton does this packing/padding in a manner that is independent of the
> endianness of the target platform.  This has the unfortunate consequence
> that on a big endian platform, the int16_t* that C sees points to the
> 16-bits of junk, not the 16-bits of Int16.int.
[...]
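
For concreteness, here is a minimal C sketch of what that word-level
packing means across the FFI boundary.  This is not MLton's runtime
code; the slot contents and the packing into the low-order bits are
assumptions purely for illustration:

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
    uint32_t slot = 0;             /* the 32-bit heap slot */
    int16_t payload = 0x1234;

    /* Pack the payload into the low-order 16 bits of the word; the
       other 16 bits play the role of the (ignored) padding.  This
       word-level packing looks the same on both kinds of target. */
    slot = (uint16_t)payload;

    /* This mimics the pointer that gets handed to C as int16_t*. */
    int16_t *p = (int16_t *)&slot;

    /* On a little-endian target *p is 0x1234 (the payload); on a
       big-endian target *p is 0x0000 (the padding half). */
    printf("*p = 0x%04x\n", (uint16_t)*p);
    return 0;
  }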

Browsing through the code, it seems that the layout is decided in the
make function starting at line 979 of packed-representation.fun.  It
also seems (and is plausible to me) that a ref is compiled to a tuple
object with a single mutable element (unless the ref gets flattened
into a larger tuple).  Are these observations correct?

Thinking about this (I haven't yet read the whole make function), I
don't quite understand why endianness should matter.  I would assume
that the pointer passed to C is the same pointer value used in the ML
world to point to the ref cell.  It seems to me that MLton currently
creates a different layout depending on endianness: on a little-endian
arch the padding ends up after the used bits, while on a big-endian
arch it ends up before them.  Is this correct?  Is there a reason for
it?  (I don't see one.)
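
In byte-level terms, the picture I have in mind is something like the
following sketch (again assuming, purely for illustration, that the
payload sits in the low-order bits of a 32-bit slot):

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  int main(void) {
    /* low-order 16 bits = payload (0xABCD), high-order 16 = padding */
    uint32_t slot = 0x0000ABCDu;
    unsigned char bytes[4];
    memcpy(bytes, &slot, sizeof bytes);

    /* little-endian: CD AB 00 00  -- payload bytes first, padding after
       big-endian:    00 00 AB CD  -- padding bytes first, payload after */
    for (int i = 0; i < 4; i++) printf("%02X ", bytes[i]);
    printf("\n");
    return 0;
  }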

I noted earlier that the computed layout, as described in the comment
before the function, is troublesome on some architectures (not
currently supported by MLton).  Perhaps now would be a good time to
change it to compute a smarter layout.

-Vesa Karvonen


