[MLton] Re: [MLton-commit] r5678

Vesa Karvonen vesa.a.j.k at gmail.com
Thu Sep 20 02:48:04 PDT 2007


[I noticed the comment in this commit log message a long time ago, but
didn't get around to commenting on it until now.]

> --- mlton/trunk/mlton/backend/packed-representation.fun 2007-06-26 06:15:05 UTC (rev 5677)
> +++ mlton/trunk/mlton/backend/packed-representation.fun 2007-06-27 00:11:16 UTC (rev 5678)
> @@ -968,7 +968,7 @@
>        (* TupleRep.make decides how to layout a sequence of types in an object,
>         * or in the case of a vector, in a vector element.
>         * Vectors are treated slightly specially because we don't require element
> -       * widths to be a multiple of the word size.
> +       * widths to be a multiple of the word32 size.
>         * At the front of the object, we place all the word64s, followed by
>         * all the word32s.  Then, we pack in all the types that are smaller than a
>         * word32.  This is done by packing in a sequence of words, greedily,

I'd just like to note that on some CPUs the above scheme results in
rather poor record layouts.  More specifically, on some CPUs (IIRC,
e.g. the Hitachi SH-4) the register+immediate addressing mode is
limited to small (e.g. 4-bit) immediate offsets, (possibly) scaled by
the operand size.  With a 4-bit offset scaled by the operand size, a
byte load reaches only offsets 0..15, while a 32-bit load reaches
offsets up to 60.  On such a CPU, packing the widest fields at the
front means that only the first few fields can be accessed with a
single instruction; narrow fields at the end will be beyond the reach
of the small immediate offset.  I think that a better scheme would be
to attempt to coalesce narrow fields into wider (aligned) units (e.g.
1+1+2 -> 4 bytes) and put the coalesced units at the front, as long as
alignment restrictions are respected and the aligned size of the whole
record does not increase.  The point is to maximize the number of
fields that can be accessed with a single instruction.
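
To make the trade-off concrete, here is a rough sketch in Python.  It
is a toy model, not MLton's TupleRep.make: it assumes every field is
loaded directly with a 4-bit immediate offset scaled by the operand
size, and all names in it (layout, reachable, choose_layout, IMM_BITS)
are made up for illustration.  Plain ascending order stands in for the
coalescing step, which happens to work for this example but can add
padding in general, hence the size check.

```python
IMM_BITS = 4  # assumed width of the immediate offset field


def layout(sizes):
    """Place fields in the given order, aligning each to its own size.

    Returns a list of (offset, size) pairs, in bytes.
    """
    fields, offset = [], 0
    for size in sizes:
        offset = -(-offset // size) * size  # round up to a multiple of size
        fields.append((offset, size))
        offset += size
    return fields


def total_size(fields):
    """Record size, rounded up to the alignment of the widest field."""
    align = max(size for _, size in fields)
    end = max(offset + size for offset, size in fields)
    return -(-end // align) * align


def reachable(fields):
    """Count fields whose offset fits in the scaled immediate."""
    max_imm = (1 << IMM_BITS) - 1
    return sum(1 for offset, size in fields if offset // size <= max_imm)


def choose_layout(sizes):
    """Prefer narrow-first order unless it makes the record larger."""
    wide_first = layout(sorted(sizes, reverse=True))
    narrow_first = layout(sorted(sizes))
    if total_size(narrow_first) <= total_size(wide_first):
        return narrow_first
    return wide_first


# Example record: 4 word64s, 10 word32s, 2 word16s, 4 word8s.
SIZES = [8] * 4 + [4] * 10 + [2] * 2 + [1] * 4
WIDE_FIRST = layout(sorted(SIZES, reverse=True))
NARROW_FIRST = choose_layout(SIZES)
```

For this example record both orders take 80 bytes, but under the
assumed addressing limit the widest-first order leaves only 12 of the
20 fields reachable with a single instruction, while the narrow-first
order reaches all 20.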

-Vesa Karvonen


