[MLton] What are the semantics for Word_addCheck?

Matthew Fluet fluet@cs.cornell.edu
Wed, 7 Jun 2006 23:18:14 -0400 (EDT)


> What are the semantics for Word_addCheck?

              | Word_addCheck of WordSize.t * {signed: bool} (* codegen *)

In the (S)XML intermediate language, it performs a twos compliment 
addition of two values of the appropriate WordSize, implicitly raising the 
Overflow exception.  The {signed: bool} component determines whether the 
addition is signed or unsigned for the purposes of overflow checking.

In the later intermediate languages, the Word_addCheck primitive may only 
appear in an Arith transfer, where it explicitly names the success and 
overflow labels.

> Also, what other primitives are we likely to come across in a simple program?

If you are pulling in the Basis Library, you are likely to see most of 
them.  The notable exceptions are that you won't encounter:
   MLton_deserialize
   MLton_serialize
   MLton_share
   MLton_size
   MLton_touch
   Thread_*
   Weak_*
   World_save

Also, assuming that you are performing the CPS transform on the SXML 
intermediate language but after the SXML simplification (which would 
probably be the best place to do so), then you also won't see:
   Exn_*
   TopLevel_*

> Basically, we've finished a CPS transform, and we're now trying to interpret
> them in order to ensure that the transform was correct (before we start running
> things like Delta CFA and Gamma CFA on them).  Most of the interpreter is
> written except for the Prim Ops, and we're trying to figure out what those do
> and how they work.

Could you explain a little more what you are doing?  You are writing an 
interpreter for the SXML intermediate language?  That seems like quite a 
bit of work.  Can't you just continue compiling the CPS transformed 
program?

> Obviously, if we don't need to do all of them, that would be favorable. 
> If not, an explanation of those primitives we need to do would be cool.

Like I said, you're going to need almost all of them.  In particular, much 
of the Basis Library relies on the FFI primitives to make C-calls.

In any case, here's a brief description of the primitives, excluding those 
mentioned above:

   Array_array : Int32.int -> 'a array
   Array_array n
     allocate an array of length n; in SXML, the array is monomorphic;
     the Basis Library ensures that n >= 0
   Array_array0Const : unit -> 'a array
   Array_array u
     allocate an array of length 0; in SXML, the array is monomorphic
   Array_length : 'a array -> Int32.int
   Array_length a
     length of an array; in SXML, the array is monomorphic
   Array_sub : 'a array * Int32.int -> 'a
   Array_sub (a, i)
     unchecked array subscript; in SXML, the array is monomorphic;
     the Basis Library ensures that 0 <= i < Array_length a
   Array_toVector : 'a array -> 'a vector
   Array_toVector a
     cast an array to a vector; in SXML, the array and vector are
     monomorphic; the Basis Library ensures that the array is not mutated
     after the cast
   Array_update : 'a array * Int32.int * 'a -> unit
   Array_sub (a, i, x)
     unchecked array update; in SXML, the array is monomorphic;
     the Basis Library ensures that 0 <= i < Array_length a

   FFI : ???
     call a C function
   FFI_symbol : string -> Word32.word
   FFI_symbol "name"
     return the address of the symbol "name"; like '&name' in C

   GC_collect: unit -> unit
     perform a garbage collection

   IntInf_{binop} : IntInf.int * IntInf.int * Int32.int -> IntInf.int
   IntInf_{unop} : IntInf.int * Int32.int -> IntInf.int
   IntInf_toString : IntInf.int * Int32.int -> string
     perform infinite precision integer operations; note that the Int32.int
     argument is the maximum size of the result object; the limit check
     insertion pass uses this argument to ensure that sufficient heap space
     is available.
   IntInf_compare : IntInf.int * IntInf.int -> Int32.int
     compare infinite precision integers; result compares to 0 in the same
     manner as the first argument compares to the second.

   IntInf_toVector : IntInf.int -> Word32.word vector
   IntInf_toWord : IntInf.int -> Word32.word
   Word_toIntInf : Word32.word -> IntInf.int
   WordVector_toIntInf : Word32.word vector -> IntInf.int
     cast between representations; the invariant is that an IntInf.int
     denoting a value in [-2^31, 2^31 - 1] is represented by a
     Word32.word, where the upper 31 bits are the twos complement
     representation and the low bit is 0, while an IntInf.int denoting a
     value in (-inf, -2^31-1] U [2^31, inf) is represented by a (pointer
     to a) Word32.word vector in which the 0th element is the sign (0 for
     positive, 1 for negative) and the remaining elements are the twos
     complement representation of the absolute value of the integer.  The
     Basis Library and runtime maintain this invariant.  Note that the
     Basis Library performs IntInf_toWord on integers represented by a
     (pointer to a) Word32.word vector, checking the low bit to determine
     whether or not the integer is 'small'.

   MLton_bogus : unit -> 'a
     create a bogus value of any type; should never actually be evaluated
     in a good program; in SXML, result will be monomorphic.

   MLton_bug : string -> unit
     terminate the program by calling the MLton_bug C function with the
     string argument.

   MLton_eq: 'a * 'a -> bool
     physical equality of arguments; in SXML, arguments will be
     monomorphic.
   MLton_equal: 'a * 'a -> bool
     SML polymorphic equality of arguments; in SXML, arguments will be
     monomorphic.  Semantics are that of SML: arrays and refs are equal
     only when they are the same object, all other types (including
     vectors) are (recursively) compared for structural equality.

   MLton_halt : Word8.word -> 'a
   MLton_halt w
     terminate the program with status w.

   Real_Math_*
   Real_*
     IEEE floating point operations

   Real_toWord : Real<N>.real -> Word<N>.word/Int<N>.int
   Real_toWord r
     coerce r to an integer, like  (int32_t)r  in C; follows current
     rounding mode; Basis Library ensures that the value may be represented
     by an (unsigned) integer

   Real_toReal : Real<N>.real -> Real<M>.real
   Real_toReal r
     coerce r to a real of a different size, like  (double)r  in C; follows
     current rounding mode

   Ref_assign : 'a ref * 'a -> unit
   Ref_assign (r, x)
     update reference cell; in SXML, reference is monomorphic
   Ref_deref : 'a ref -> 'a
   Ref_deref r
     dereference reference cell; in SXML, reference is monomorphic
   Ref_ref : 'a -> 'a ref
   Ref_ref x
     allocate reference cell; in SXML, reference is monomorphic

   String_toWord8Vector : string -> Word8.word vector
   Word8Vector_toString : Word8.word vector -> string
     identity coercion

   Thread_*
     complicated; probably has non-trivial interaction with CPS conversion

   Word_*Check
     integer operations; implicitly raise Overflow exception
   Word_*
     integer operations

   Word_toReal : Word<N>.word/Int<N>.int -> Real<M>.real
   Word_toReal w
     coerce w to a floating-point, like  (double)w  in C; follows current
     rounding mode

   Word_toWord : Word<N>.word/Int<N>.int -> Word<M>.word
   Word_toWord w
     coerce w to a word of a different size, like (int32_t)w  in C; if
     {signed = true}, then perform sign extension.

   Word8Array_subWord : Word8.word array * Int32.int -> Word32.int
   Word8Array_updateWord : Word8.word array * Int32.int * Word32.word -> unit
   Word8Vector_subWord : Word8.word vector * Int32.int -> Word32.int
     unchecked access to the i-th Word32.word in the sequence; the Basis
     Library ensures that 0 <= 4 * i < length(seq).