slow matrix multiply

Stephen Weeks MLton@sourcelight.com
Tue, 10 Jul 2001 18:22:47 -0700


I rewrote matrix.mlton to be more like the OCaml version, using 'a array array
to represent the 2D array and manually hoisting the constant array subscript.
This sped things up: the MLton time is now 2.3 (versus the old 4.8), which is
much closer to OCaml's 1.4.

Following is the source and annotated assembly.

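MLton's code is now pretty close to OCaml's, except that the loop variables
live in stack slots (the offsets from %edi below) and are loaded and stored on
every iteration instead of being kept in registers.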

   fun loop (k, sum) =
      if k < 0
         then sum
      else loop (k - 1, sum + m1i k * sub (m2, k, j))

loop_54:
	movl (204*1)(%edi),%esp		# %esp = k
	cmpl $0,%esp			# if k < 0
	jl L_235
	movl %esp,%ebp			# %ebp = k
	decl %ebp			# %ebp = k - 1
	movl (196*1)(%edi),%edx		# %edx = m1i
	movl %esp,%ecx			# %ecx = k
	movl (%edx,%ecx,4),%esp		# %esp = m1i k
	movl (160*1)(%edi),%edx		# %edx = m2
	movl (%edx,%ecx,4),%ebx		# %ebx = sub (m2, k)
	movl %ebp,(204*1)(%edi)		# store k
	movl %esp,%eax			# %eax = m1i k
	movl (192*1)(%edi),%ebp		# %ebp = j
	cltd
	imull (%ebx,%ebp,4)		# %eax = m1i k * sub (m2, k, j)
	addl %eax,(200*1)(%edi)		# store  sum + ...
	jmp loop_54
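
For readers without the benchmark at hand, here is a minimal sketch of the
kind of hoisting described at the top of this message. It is not the actual
matrix.mlton source: the name mmult, its argument order, and the use of
Array.tabulate are assumptions made for illustration. The point it shows is
that the row m1i = sub (m1, i) is fetched once per i, so the inner loop only
indexes into it.

```sml
(* Illustrative sketch only; not the actual benchmark source.
   mmult is a hypothetical name. Matrices are int array array,
   rows = number of rows of m1, cols = number of columns of m2. *)
fun mmult (rows, cols, m1 : int array array, m2 : int array array) =
   Array.tabulate (rows, fn i =>
      let
         (* Hoisted: this subscript is constant for all j and k. *)
         val m1i = Array.sub (m1, i)
      in
         Array.tabulate (cols, fn j =>
            let
               (* Same shape as the loop above: count k down to 0. *)
               fun loop (k, sum) =
                  if k < 0
                     then sum
                  else loop (k - 1,
                             sum + Array.sub (m1i, k)
                                   * Array.sub (Array.sub (m2, k), j))
            in
               loop (Array.length m1i - 1, 0)
            end)
      end)
```

The second subscript, Array.sub (m2, k), still depends on k and so cannot be
hoisted the same way; that matches the two dependent loads of m2 in the
assembly above.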