fix for wordfreq for MLton

Henry Cejtin henry@sourcelight.com
Tue, 12 Jun 2001 23:12:23 -0500


I already e-mailed Stephen, but he hasn't answered yet: wordfreq is so
slow in MLton because of a bug in the hash function it uses. I sent him
the fix, but if you just change the hash function so that it actually
loops over the string, it will run MUCH better (currently it returns
only 26 different hash values, so the buckets are really big).
Here is the fixed hash:


(* This hash function is taken from pages 56-57 of
 * The Practice of Programming by Kernighan and Pike.
 *)
fun hash (s: string): word =
   let
      val n = String.size s
      fun loop (i, w) =
         if i = n
            then w
         else loop (i + 1,
                    Word.fromInt (Char.ord (String.sub (s, i)))
                       + Word.* (w, 0w31))
   in
      loop (0, 0w0)
   end
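
To see why the number of distinct hash values matters, here is a rough
sketch that counts them for a small word list. The broken hash is not
shown in this message, so firstCharHash below is only a hypothetical
stand-in that looks at the first character and never loops; distinct
and words are likewise names made up for this illustration.

```sml
(* Hypothetical stand-in for the broken hash: it only looks at the first
 * character, so lowercase words give at most 26 distinct values.
 * This is an assumption; the actual buggy code is not shown here.
 *)
fun firstCharHash (s: string): word =
   if String.size s = 0
      then 0w0
   else Word.fromInt (Char.ord (String.sub (s, 0)))

(* The looping hash from above, repeated so the sketch stands alone. *)
fun hash (s: string): word =
   let
      val n = String.size s
      fun loop (i, w) =
         if i = n
            then w
         else loop (i + 1,
                    Word.fromInt (Char.ord (String.sub (s, i)))
                       + Word.* (w, 0w31))
   in
      loop (0, 0w0)
   end

(* Count the distinct hash values that h produces over a word list. *)
fun distinct (h: string -> word) (ws: string list): int =
   let
      fun add (w, seen) =
         let
            val v = h w
         in
            if List.exists (fn x => x = v) seen
               then seen
            else v :: seen
         end
   in
      length (foldl add [] ws)
   end

val words = ["apple", "apply", "angle", "bat", "bath", "bad"]
val nFirst = distinct firstCharHash words   (* 2: only #"a" and #"b" *)
val nLoop = distinct hash words             (* 6: every word differs *)
```

With the stand-in, all six words land in just two buckets, whereas the
looping hash gives each its own value. On a real word list the same
effect caps the table at 26 buckets no matter how large it is.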

After all, we on the MLton team have to show how fine it is.