[MLton] Performance of Real.toInt

Mon Oct 27 08:48:02 PST 2008

On Sun, 26 Oct 2008, Vesa Karvonen wrote:
> On Fri, Oct 24, 2008 at 10:50 PM, Ryan Newton <rrnewton at gmail.com> wrote:
>> Under MLton I generate code like this:
>>
>>  (Real64.toInt IEEEReal.TO_ZERO (var_tmpsmp_77))
>>
>> But it performs very poorly.  I haven't researched this, but if I had
>> to guess, I'd bet this is because mlton is implementing some more
>> semantically meaningful notion than C casts.
>
> An excellent guess!
>
>> Nevertheless, is there
>> any inexpensive way to ape the behavior one gets from (int)x in C?
>
> Have you peeked into the real/real.sml source file in MLton's basis
> library implementation?  The implementation of Real.toInt uses a
> family of toInt<N>Unsafe functions, that do not set the rounding mode
> or check that the floating point number is in the range of the integer
> type.  One could perhaps extend the MLton.Real structure
> (http://mlton.org/MLtonReal) to expose those functions.  You could
> then implement the conversion in terms of the unsafe functions.

As Vesa noted, SML's Real.toInt function does a lot more range checking 
than C's (int)d cast.  In SML, there are at least two floating-point 
comparisons (performing the range check), a rounding mode set, a 
floating-point round, a rounding mode (re)set, and a floating-point to int 
coercion (the toInt<N>Unsafe).

If you are using the C codegen, then toInt<N>Unsafe is implemented by a C 
cast; the semantics of a C cast is to convert with truncation (TO_ZERO) 
semantics.  If you are using the x86 codegen, then toInt<N>Unsafe is 
implemented by the 'fist' instruction; the semantics of the 'fist' 
instruction is to convert with the current rounding mode.  If you are 
using the amd64 codegen, then toInt<N>Unsafe is implemented by the 
'cvt{s,d}2si{l,q}' instruction; the semantics of the 'cvt{s,d}2si{l,q}' 
instruction is to convert with truncation (TO_ZERO) semantics.  Since the 
implementations of toInt<N>Unsafe do not always obey the current rounding 
mode, the SML implementation first does a floating-point round (under an 
appropriate rounding mode); thus, all of the toInt<N>Unsafe 
implementations behave the same.  But, it also means that the 
toInt<N>Unsafe primitives are only well defined when the floating-point 
value is an integer; on non-integral floating-point values, the different 
codegens could return different results.

Note: on x86 with the C-codegen, the C cast actually generates another 
set/reset of the rounding mode, because gcc wants to use the 'fist' 
instruction, but with truncation (TO_ZERO) semantics (rather than the 
current rounding mode).  This may also be the case on other architectures.

If you are exclusively using the C-codegen, then exposing the 
toInt<N>Unsafe functions in the MLton.Real structure would have the 
behavior of a C-cast.  (It will still be a little slower, because the cast 
will occur in a non-inlined function; we don't inline some of the 
floating-point operations, because gcc will constant fold without obeying 
possible changes in the rounding mode.  Though, given the explanation 
above, since C's cast always ignores the current rounding mode and uses 
truncation semantics, then it may be acceptable to inline.)

If you wanted something a little more well-defined, you could expose in 
MLton.Real the composition of Primitive.Real<N>.round with 
Primitive.Real<N>.toInt<M>Unsafe.  That would first do a floating-point 
round to integer (under the current rounding mode), followed by a coercion 
to int (which, because the input will be an integral floating-point, will 
be well-defined for all implementations).  However, this would be slightly 
different from a C-cast, since the default floating-point rounding mode is 
TO_NEAREST (at least on x86 and amd64, and possibly specified by C99 
and/or IEEE754), not TO_ZERO.

So, lots of choices, but nothing jumps out as a clear winner.