A Simple Analysis to Eliminate Overflow Checking on Increments

Henry Cejtin henry@sourcelight.com
Fri, 13 Jul 2001 18:20:24 -0500


Yes, the fact that if I use `=' and start in the wrong place my loop will
continue until overflow is good: it means the program will die right there.
If I use `<=', it will stop right away, but clearly I was confused and this
may result in other errors.  The notion is to die as quickly as possible near
any error.
As to bounds checking, my notion was that the important thing is to realise
that if i starts out in [0, Array.length a], then
	fun loop i =
	       if i = Array.length a
		  then ()
		  else (Array.update (a, i, ???);
			loop (i + 1))
is ok (by induction).  Of course you need the `eureka' step of producing that
inductive assumption, but given the Array.update, it isn't bad.

On  an  very  weakly  related  note, we really need something better than the
basis library as a way of creating arrays, and, more importantly, strings and
vectors.   (These  are  more  important  because  being  immutable, it really
requires unsafe casts or another copy.)

The point is that tabulate is a convenient special case, but not the  general
case  that  you  want.   The  general  case  you  want is unfold.  It takes 3
arguments: the size (length of the result), the state, and a function of type
where  'elem  is  the  type  of the elements.  In addition to this, you would
unfoldL, which produces the elements in the reverse order.  (I have no notion
of the correct capitalization for this.)

Using these, tabulate becomes
    fun tabulate (size: int, proc: int -> 'elem) =
           unfold (size,
                   0,
                   fn i => (proc i, i + 1))

With  these,  including  unfoldL, lots of things are easy and efficient.  One
typical example is where you have a list of characters and you want to  build
the  string  which  is  those  characters  but  in reverse order (because you
accumulated the characters in a list, so that reversed it).   The  fact  that
currently you have to reverse the list is silly.

Other  cases  where  you  have  to carry along some state as you build up the
elements are very common and currently you have to pick one of the following:

    Produce an array instead of a vector and then do a copy at the end.

    Use a ref cell to hold the state and mutate as you walk along.

I  hate  both of these.  Besides, the unfold and unfoldL really are the right
idiom and they should be supported by the basis library.