[MLton-devel] Fwd: Re: pretty damn good

Brad Lucier lucier@math.purdue.edu
Mon, 4 Nov 2002 21:29:39 -0500 (EST)


<Discussion of Gambit vs MLton elided.>

> Someone should really check that they are computing the same thing
> before we conclude too much. :-)

That is really no joke.  The Scheme code has a correctness check
in it that the ML code doesn't have.  I'm not suggesting the ML code
is incorrect, but I did not intend to send those benchmark results
to this mailing list until I had verified things.  Perhaps you want to
add the same correctness check to the ML code just to make us more
comfortable.


> This comes to 1.046654837699614, so I don't quite understand where the
> 6% comes from.

From my fantasy; my followup noted the correct ratio.

> Although in the case of nucleic the C and native backends are fairly
> close, in many other cases they are not.  The last time I posted about
> this was over a year ago on comp.lang.ml
> 
> http://groups.google.com/groups?q=insubject:sml+insubject:to+insubject:c&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&selm=9lb1oi%24cao%241%40cantaloupe.srv.cs.cmu.edu&rnum=3
> 
> I suspect that the runtime ratios (C / native) have gotten larger
> since then, since we have continued to improve the native codegen and
> have left the C codegen untouched.

Yeah, well on the Alpha, Gambit kicks MLton's butt ;-).  Also, gcc
optimizations haven't been standing still for the last two years, either.

> > You may (or may not) get a bit more performance by using gcc's
> > computed goto's for returns rather than going through the dispatch
> > table on the chunk switch.
> 
> I second what Henry said.  This was way too buggy when we tried it.

It's not buggy now.  There was an error in the LCM part of the GCSE
pass that was fixed several releases ago.  It doesn't generate great code,
but it doesn't generate buggy code either. Look at the gcc mail archives.
The gcc (old egcs) development team has been fairly responsive to my
bug reports.

> As a simple example, you can compare the C and native backends with
> 
> benchmark -mlton "mlton -native {false,true}"

Actually, the following seems to work when run from the tests directory:

../benchmark -mlton "mlton -native {false,true}" barnes-hut boyer checksum count-graphs DLXSimulator fft fib hamlet imp-for knuth-bendix lexgen life logic mandelbrot matrix-multiply md5 merge mlyacc mpuz nucleic peek psdes-random ratio-regions ray raytrace simple smith-normal-form tailfib tak tensor tsp tyan vector-concat vector-rev vliw wc-input1 wc-scanStream zebra zern

I'd like to know how to play with the options passed to gcc.  For example,
you get the following results:

C back end, default gcc options:

[lucier@dsl-207-066 mlton-20020923]$ mlton -v1 -native false nucleic.batch.sml
MLton starting
   Compile SML starting
      pre codegen starting
      pre codegen finished in 2.34 + 0.72 (24% GC)
      C code gen starting
      C code gen finished in 0.02 + 0.0 (0.0% GC)
   Compile SML finished in 2.36 + 0.72 (23% GC)
   Compile C starting
      gcc -S -I/usr/lib/mlton/self/include -O1 -w -fomit-frame-pointer \
          -fno-strength-reduce -mcpu=pentiumpro -malign-loops=2 \
          -malign-jumps=2 -malign-functions=5 -fschedule-insns \
          -fschedule-insns2 -o /tmp/fileJu3OZc.s /tmp/file7Wr0VU.c
   Compile C finished in 35.92 + 0.0 (0.0% GC)
   Assemble starting
      gcc -c -o /tmp/fileBgLOJ2.o /tmp/fileJu3OZc.s
   Assemble finished in 0.23 + 0.0 (0.0% GC)
   Link starting
      gcc -o nucleic.batch /tmp/fileBgLOJ2.o -L/usr/lib/mlton/self -lmlton \
          -lm /usr/lib/libgmp.a
   Link finished in 0.21 + 0.0 (0.0% GC)
MLton finished in 38.72 + 0.73 (2% GC)

[lucier@dsl-207-066 mlton-20020923]$ time ./nucleic.batch
19.935u 2.435s 0:22.95 97.4%    0+0k 0+0io 108pf+0w
[lucier@dsl-207-066 mlton-20020923]$ time ./nucleic.batch
19.949u 2.433s 0:22.51 99.3%    0+0k 0+0io 108pf+0w

C back end, with these gcc options:

[lucier@dsl-207-066 mlton-20020923]$ gcc -I/usr/lib/mlton/self/include -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-math-errno nucleic.batch.c -o nucleic.batch.2 -L/usr/lib/mlton/self -lmlton -lm /usr/lib/libgmp.a

[lucier@dsl-207-066 mlton-20020923]$ time ./nucleic.batch.2
17.816u 2.273s 0:20.21 99.3%    0+0k 0+0io 108pf+0w
[lucier@dsl-207-066 mlton-20020923]$ time ./nucleic.batch.2
17.707u 2.369s 0:22.78 88.0%    0+0k 0+0io 108pf+0w

Native back end:

[lucier@dsl-207-066 mlton-20020923]$ mlton nucleic.batch.sml
[lucier@dsl-207-066 mlton-20020923]$ time ./nucleic.batch
16.923u 2.240s 0:19.36 98.9%    0+0k 0+0io 106pf+0w
[lucier@dsl-207-066 mlton-20020923]$ time ./nucleic.batch
16.845u 2.343s 0:19.29 99.4%    0+0k 0+0io 106pf+0w

Ratio of C back end with default gcc options/native back end:

(19.935+2.435+19.949+2.433)/(16.923+2.240+16.845+2.343)=1.1669056869

Ratio of C back end with new gcc options/native back end:

(17.816+2.273+17.707+2.369)/(16.923+2.240+16.845+2.343)=1.0472999400

So you're losing about 12% (measured against the native back end's time)
just by having the wrong gcc options.  If you use computed gotos, then the
best gcc options might change again; but I'd give up 10-20% performance to
get MLton to run on more machines.
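
The two ratios above can be rechecked with a few lines of Python.  This is
only a sketch: the timings are copied from the `time` output in this message,
and the names c_default, c_tuned, native, cpu_total and ratio are made up
for illustration; they are not part of MLton or its benchmark script.

```python
# (user, system) CPU seconds for each run, copied from the `time` output above.
c_default = [(19.935, 2.435), (19.949, 2.433)]  # C back end, default gcc options
c_tuned = [(17.816, 2.273), (17.707, 2.369)]    # C back end, -O2 and friends
native = [(16.923, 2.240), (16.845, 2.343)]     # native back end


def cpu_total(runs):
    """Sum user + system CPU seconds over all runs."""
    return sum(user + system for user, system in runs)


def ratio(runs, baseline):
    """Total CPU time of `runs` divided by total CPU time of `baseline`."""
    return cpu_total(runs) / cpu_total(baseline)


print(f"C default / native:  {ratio(c_default, native):.4f}")
print(f"C tuned   / native:  {ratio(c_tuned, native):.4f}")
print(f"C default / C tuned: {ratio(c_default, c_tuned):.4f}")
```

The first two printed values should agree with the ratios computed by hand
above; the third compares the two C builds directly, which is another way
of sizing the cost of the default gcc options.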

Brad

