[MLton] Constant folding vector expressions

Vesa Karvonen vesa.a.j.k at gmail.com
Sun Sep 16 23:08:44 PDT 2007


Is there some reason why vector expressions are not subject to
constant folding? I think that it would be nice to be able to, for
example, hash strings at compile time. I wrote the patch below (probably
slightly incorrect) to enable constant folding of vector
expressions:

Index: mlton/atoms/prim.fun
===================================================================
--- mlton/atoms/prim.fun	(revision 6026)
+++ mlton/atoms/prim.fun	(working copy)
@@ -1303,6 +1303,14 @@
                     then null
                  else ApplyResult.Unknown
            | (CPointer_toWord, [Null]) => word (WordX.zero (WordSize.cpointer ()))
+           | (Vector_length, [WordVector v]) =>
+             ApplyResult.Const
+                (Const.word
+                    (WordX.fromIntInf
+                        (IntInf.fromInt (WordXVector.length v),
+                         WordSize.cint ())))
+           | (Vector_sub, [WordVector v, Word i]) =>
+             word (WordXVector.sub (v, WordX.toInt i))
            | (Word_add _, [Word w1, Word w2]) => word (WordX.add (w1, w2))
            | (Word_addCheck s, [Word w1, Word w2]) => wcheck (op +, s, w1, w2)
            | (Word_andb _, [Word w1, Word w2]) => word (WordX.andb (w1, w2))
Index: mlton/atoms/const.sig
===================================================================
--- mlton/atoms/const.sig	(revision 6026)
+++ mlton/atoms/const.sig	(working copy)
@@ -13,6 +13,7 @@
       structure RealX: REAL_X
       structure WordX: WORD_X
       structure WordXVector: WORD_X_VECTOR
+      sharing WordX = WordXVector.WordX
    end

 signature CONST =

The probably incorrect part is "WordSize.cint ()". The code shouldn't
use the size of cint (C integers) but the size of SML integers
(Int.int); I couldn't find where to get the correct size.

Anyway, after recompiling the compiler, I experimented with a
non-recursive string hash function (well, non-recursive for strings of
at most 16 characters) and, voilà, with a sufficiently large -inline
setting, MLton was able to constant fold the hash (verified by
inspecting the generated code). In fact, MLton constant folded not only
the vector expressions but also several other hash operations that
followed the hashing of the string. The string, actually two
4-character strings, was only a part of the data to be hashed. The
final hash value simply appeared as an immediate operand in the code.
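
A minimal sketch of what such a non-recursive hash could look like,
limited to 4 characters for brevity. The names step and hash4, the
multiplier, and the seed are illustrative assumptions; this is not the
code used in the experiment described above, and whether MLton folds it
completely depends on the patch and on the -inline setting.

```sml
(* Hypothetical example: names and constants are made up for
 * illustration. *)

(* One hashing step: mix a single character into the accumulator. *)
fun step (h : word, c : char) : word =
   h * 0w31 + Word.fromInt (Char.ord c)

(* Hash for strings of at most 4 characters, written without a loop or
 * recursion.  Characters beyond index 3 are ignored.  Every String.sub
 * is guarded by the length, so with a literal argument and enough
 * inlining each subexpression becomes a length or subscript operation
 * on a constant vector. *)
fun hash4 (s : string) : word =
   let
      val n = String.size s
      val h = 0w5381
      val h = if 0 < n then step (h, String.sub (s, 0)) else h
      val h = if 1 < n then step (h, String.sub (s, 1)) else h
      val h = if 2 < n then step (h, String.sub (s, 2)) else h
      val h = if 3 < n then step (h, String.sub (s, 3)) else h
   in
      h
   end

(* A call on a literal is the candidate for compile-time evaluation. *)
val h = hash4 "abcd"
```

The function is unrolled by hand because a loop over the string would
leave the compiler with a recursive function that inlining alone cannot
reduce to a constant.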

So, I wonder whether there is some reason why constant folding of
vector expressions wasn't already enabled. I can't see how it could
cause any serious adverse effects. Certainly, it doesn't apply in
a great many cases, but it can be used to advantage if you know how (by
avoiding loops).

-Vesa Karvonen

