lists,vectors,arrays in SML/NJ and MLton

Matthew Fluet fluet@research.nj.nec.com
Tue, 8 Aug 2000 12:51:05 -0400 (EDT)


Yesterday I added traceCall's to the various passes in the x86-codegen
(translate, simplify, allocate-registers, validate) and started looking at
their times during compilations using the sml/nj compiled version of
MLton.

For a self-compile, they were:
translate            183.701
simplify             127.799
allocate-registers  1403.254
validate               4.096

So, allocate-registers is by far the most expensive pass.  I thought that
one major source of inefficiency was the fact that I used a vector of
lists to represent the register allocation; each element in the vector
corresponds to an register (eg. eax) and the elements in a list correspond
to the "parts" of the register (eg., eax, ax, ah, al).  So, on
modification of the register allocation, I had lots of Vector.maps and
List.maps, which meant lots of copying (I would think).  But, the
register allocation is being used linearly, so there was no real reason
not to make it an array of lists or even an array of arrays.  I tried
both of them, but got the following strange result:

allocate_registers (array of arrays) 32.186
allocate_registers (array of lists)  11.913
allocate_registers (vector of lists)  8.391

That doesn't make any sense to me.  Does anyone know of any peculiarities
to SML/NJ's implementation of arrays?  I'm using mostly Array.sub and
Array.update with a couple of Array.fold and Array.foreachi (more folds
in the array of arrays and more foreachis in the array of lists).  Would
this same behavior appear using the MLton compiled version of MLton?  Any
other explaination for the slowdown?