bug in IntInf.scan StringCvt.HEX

Stephen Weeks sweeks@intertrust.com
Mon, 10 May 1999 21:44:19 -0700 (PDT)


Here is my assessment of the consequences of the NJ bug on MLton.

G0 will not process IntInf constants correctly.  Thus the G1 C code
will contain the SmallIntInf 0 hardwired in whenever a hex IntInf
constant appears in the source code (or basis library).  Since there
are hex IntInf constants in the basis library in int-inf.sml it is
possible (but not necessary) that the G1 C code is wrong.  This will
very possibly cause G1 to mess up when producing the G2 C code.  Now,
we know from the self compile tests that G2 and G3 C code are the
same.  However, I don't think from this that we can conclude that they
both don't have bugs involving IntInf constants.

My guess is that the constants used in
basis-library/integer/int-inf.sml weren't used in generating G2 and
hence G2 and G3 are OK.  But I don't think it's worth studying in
detail to figure out if that is indeed true.  So, here is my patch to
atoms/small-int-inf.fun to work around the NJ bug.  I believe that
with this patch, GX works correctly for all X.

--------------------------------------------------------------------------------

(* Copyright (C) 1997-1999 NEC Research Institute.
 * Please see the file LICENSE for license information.
 *)
functor SmallIntInf(S: SMALL_INT_INF_STRUCTS): SMALL_INT_INF = 
struct

open S

val hash = Word.fromLargeWord

structure Word = Word32
   
type t = Word.word
   
val equals = op =

fun toCstring w = "0x" ^ Word.toString w

val layout = Layout.str o toCstring

fun toMLstring w = Int.toString(Word.toIntX(Word.~>>(w, 0w1)))

(*
 * The IntInf.fromInt calls are just because SML/NJ doesn't
 * overload integer constants for IntInf.int's.
 * This code relies on the language that MLton is implemented in using at
 * least 31 bits for integers.
 *)
val minSmall: IntInf.int = IntInf.fromInt ~0x40000000
val maxSmall: IntInf.int = IntInf.fromInt 0x3FFFFFFF

(* Warning: fromString will raise the Subscript exception on inputs "" and "~".
 * This code is messier than it needs to be because it works around a bug in the 
 * SML/NJ implementation of IntInf.scan StringCvt.HEX.
 *)
fun fromString (str: string): t option =
       let val size = String.size str
	   fun reader offset =
		  if offset = size
		     then NONE
		     else SOME (String.sub (str, offset), offset + 1)
	   val start = if String.sub (str, 0) = #"~"
			  then 1
			  else 0
	   (* We carefully avoid calling IntInf.scan on the 0x in a hex constant
	    * because of the SML/NJ bug.
	    *)
	   val (base, start) =
	      if (String.sub (str, start) = #"0"
		  andalso (case reader (start + 1) of
			      SOME (#"x", next) => true
			    | _ => false))
		 then (StringCvt.HEX, start + 2)
	      else (StringCvt.DEC, 0)
	   val (v, _) = valOf (IntInf.scan base reader start)
	   (* Do the negation ourselves if we had the scan start inside the
	    * string.
	    *)
	   val v =
	      case (base, String.sub(str, 0)) of
		 (StringCvt.HEX, #"~") => IntInf.~ v
	       | _ => v
       in if IntInf.<= (minSmall, v) andalso IntInf.<= (v, maxSmall)
	     then let val w = Word.fromInt (IntInf.toInt v)
		      val res = Word.orb (0w1, Word.<< (w, 0w1))
		  in SOME res
		  end
	     else NONE
       end

(*
 * If you want to compile MLton using an ML implementation which does not
 * have IntInf, then use the following instead.  Note, in this case Overflow
 * MUST be raised.
 *
 * fun fromString (str: string): t option =
 *        let val size = String.size str
 *            fun reader offset =
 *                   if offset = size
 *                      then NONE
 *                      else SOME (String.sub (str, offset), offset + 1)
 *            val start = if String.sub (str, 0) = #"~"
 *                           then 1
 *                           else 0
 *            val base = if String.sub (str, start) = #"0"
 *                           then case reader (start + 1) of
 *                                SOME (#"x", next) => StringCvt.HEX
 *                                | _ => StringCvt.DEC
 *                           else StringCvt.DEC
 *        in (case Pervasive.Int.scan base reader 0 of
 *            SOME (resv, _) =>
 *               let val resw = Word.fromInt resv
 *                   val res = Word.orb (0w1, Word.<< (resw, 0w1))
 *               in if Word.toLargeIntX (Word.xorb (resw, res)) < 0
 *                  then NONE
 *                  else SOME res
 *               end
 *              | _ => Error.bug "SmallIntInf.fromString")
 *           handle Overflow => NONE
 *        end
 *)

end