lists,vectors,arrays in SML/NJ and MLton

Stephen Weeks MLton@sourcelight.com
Tue, 8 Aug 2000 10:29:05 -0700 (PDT)


> So, allocate-registers is by far the most expensive pass.  I thought that
> one major source of inefficiency was the fact that I used a vector of
> lists to represent the register allocation; each element in the vector
> corresponds to an register (eg. eax) and the elements in a list correspond
> to the "parts" of the register (eg., eax, ax, ah, al).  

I glanced at what you have and I think the best representation for
RegisterAllocation.t would be a record with mutable fields 
  type t = {eax: entry, ebx: entry, ..., esp: entry, 
            reserved: Register.t list}
(where entry is modified to be mutable)

Records have a better representation than arrays since the size is known at
compile time and is not stored in the object.  They are also faster, since
selects are at offsets known at compile time and not computed as array
subscripts are.

> So, on
> modification of the register allocation, I had lots of Vector.maps and
> List.maps, which meant lots of copying (I would think).  But, the
> register allocation is being used linearly, so there was no real reason
> not to make it an array of lists or even an array of arrays.  I tried
> both of them, but got the following strange result:
> 
> allocate_registers (array of arrays) 32.186
> allocate_registers (array of lists)  11.913
> allocate_registers (vector of lists)  8.391
> 
> That doesn't make any sense to me.  Does anyone know of any peculiarities
> to SML/NJ's implementation of arrays?

The only thing I can think of is the fact that they have a generational
collector and so assignments cost them.

> Would
> this same behavior appear using the MLton compiled version of MLton? 

I would be surprised.  In fact, there will be no difference between arrays and
vectors, and I would bet that mutable arrays are faster than copied lists.  But
again, I would rather see record types wherever possible.