[MLton-user] more optimization questions

brian denheyer briand@aracnet.com
Tue, 20 Dec 2005 22:19:49 -0800


> by a C call to a function that just calls fabs.  The win in going from
> (2) to (3) is in eliminating the C wrapper around fabs.  If anyone
> wants to repeat my experiment, I did (3) by adding a line to
> lib/mlton/include/c-chunk.h:
>
>   #define Real64_abs fabs
>
Stephen,

Thanks very much for spending the time to look at this.

Just FYI for the list.  The above change to the c-chunk.h file also
requires the following line to be added to the .sml file:

val abs = _import "fabs": real -> real;

I verified that the proper code was generated by compiling with
-keep g and inspecting the output.
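
For anyone repeating this, here is a minimal C sketch of what the
#define changes.  It assumes the generated C code refers to the
primitive by the name Real64_abs, as the quoted line implies.  The
names wrapped_abs, sum_abs_direct and sum_abs_wrapped are hypothetical,
made up for illustration, and are not part of MLton.

```c
#include <math.h>

/* The one-line change quoted above: Real64_abs becomes another name
   for the C library's fabs, so no extra function sits in between. */
#define Real64_abs fabs

/* Hypothetical stand-in for the extra layer being removed:
   a function that does nothing but call fabs. */
static double wrapped_abs(double x) {
    return fabs(x);
}

/* Sums |a[i]| through the macro; each use expands to fabs(a[i]). */
static double sum_abs_direct(const double *a, int n) {
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += Real64_abs(a[i]);
    }
    return total;
}

/* Same sum, but every element goes through the wrapper first. */
static double sum_abs_wrapped(const double *a, int n) {
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += wrapped_abs(a[i]);
    }
    return total;
}
```

Both loops return the same value; the only difference is the extra
call layer in the second.  An optimizing C compiler may inline such a
wrapper when it can see its definition, so this shows the structure of
the change rather than reproducing the timings.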

My results are:

C            3.9s
sml + fabs   6.2s
orig         13s

So the sml + fabs version is now about 60% slower than C.

Much better than the starting point of 240%!
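
The percentages can be checked from the timings above with a small C
helper.  This is only a sketch: percent_slower is a hypothetical name,
and the inputs are the rounded figures from the table.

```c
/* Percentage by which time t exceeds the baseline time base. */
static double percent_slower(double t, double base) {
    return (t - base) / base * 100.0;
}
```

percent_slower(6.2, 3.9) gives roughly 59, matching the "about 60%"
figure.  percent_slower(13.0, 3.9) gives roughly 233, close to the
240% quoted; the gap may simply reflect rounding in the 13s figure.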

Now for the gotcha.

I constructed the 2D example so that I could examine the array
indexing more easily.  I brought up the 2D program as an example of
the slowdown because its performance relative to C seemed to track
that of the 3D program, which led me to believe the problem was in
the indexing.

However, my 3D program is a different routine and does NOT use fabs,
so it's back to the drawing board.  At least I now have some good
tools for investigating the performance.

Also, after 2 years there seems to be a relatively good (?) chance
that the fabs behavior has changed, so maybe the "proper" abs code is
no longer required in the compiler.

Brian