[MLton] power pc "port"

Stephen Weeks MLton@mlton.org
Mon, 6 Sep 2004 10:26:22 -0700


> The only downside is that in addition to every arithmetic operation
> on 8-bit and 16-bit words requiring a function call,

No.  Most of the arithops are defined in c-chunk.h as static inlines
that will be optimized away.  Only a few (like quot, rem) require
function calls, and that is necessary to keep gcc's optimizer from
doing things that the SML spec doesn't allow.
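
As a rough illustration of the static-inline style (a sketch only: the
typedefs and function names below are hypothetical and are not copied
from c-chunk.h), an 8-bit add and a signed compare might look like
this.  After inlining, gcc can usually reduce each one to an
instruction or two.

```c
#include <stdint.h>

/* Hypothetical typedefs for this sketch; not the actual MLton ones. */
typedef uint8_t Word8;
typedef int8_t  Int8;

/* Wrapping 8-bit add: the operands are promoted to int, added, and the
 * result is truncated back to 8 bits by the cast. */
static inline Word8 Word8_add(Word8 a, Word8 b) {
    return (Word8)(a + b);
}

/* Signed less-than on 8-bit words: reinterpret the bits as signed
 * before comparing.  Converting a value above 127 to int8_t is
 * implementation-defined in C99/C11; gcc wraps modulo 2^8. */
static inline int WordS8_lt(Word8 a, Word8 b) {
    return (Int8)a < (Int8)b;
}
```

For example, Word8_add(200, 100) yields 44 (300 mod 256), and
WordS8_lt(0xFF, 0x01) is true because 0xFF is -1 when read as signed.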

> each one of those operations will now require one or more sign
> extension and/or masking operations.  Of course, the cost of these
> operations may be minuscule in comparison to the cost of the
> function calls. :-)

You only have to do the sign extension if the value isn't already sign
extended.  I would think in most cases with a sequence of (signed)
arithops, gcc's optimizer would simply keep the value sign extended in
a register, but maybe gcc doesn't do as good a job as I would expect.

> The thing that I worry about is the native codegen for PowerPC.
> Will it have to deal with the same IL as the C codegen?  

Yes, but you could have the optimizer that runs before the Machine IL
do different things when compiling natively.

> In that case, having the IL only know about unsigned words may
> result in a significant performance hit, in addition to unnecessary
> code bloat.

Maybe, but I'm not convinced.  The codegen or some earlier pass will
certainly need to decide at various points how 8-bit and 16-bit values
are to be represented in a 32-bit register.  It can choose to keep them
sign extended or masked, depending on what's appropriate.  A naive
approach would be to keep all words masked, and then to sign extend
before each arithop and mask after.  But there are better approaches.
Have a look at "Widening Integer Arithmetic" for a framework to think
about this issue.

	http://www.eecs.harvard.edu/~nr/pubs/widen-abstract.html

My guess is that their dynamic programming approach may not even be
necessary (although I'd like to see it) and that some simple
heuristics would do an acceptable job.  Kevin Redwine reads this list
-- perhaps he is interested in helping out with this issue?  In any
case, these heuristics could be expressed in the codegen or as an
optimization on the Ssa IL that eliminates all occurrences of Word8
and Word16.  My preference would be the Ssa IL, since then the
optimization could be used on other platforms.
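
To make the naive scheme and one simple heuristic concrete, here is a
small C sketch that models an 8-bit value held in a 32-bit register.
All the helper names are made up for illustration, and the "lazy"
variant is just one possible heuristic, not the algorithm from the
paper.  It relies on the fact that the low 8 bits of a sum depend only
on the low 8 bits of the operands, so normalization can wait until an
operation that looks at the upper bits (here, a signed compare).

```c
#include <stdint.h>

/* Clear the upper 24 bits: the "masked" representation. */
static inline uint32_t mask8(uint32_t r) { return r & 0xFFu; }

/* Copy bit 7 into the upper 24 bits: the "sign extended" representation. */
static inline uint32_t sext8(uint32_t r) {
    return ((r & 0xFFu) ^ 0x80u) - 0x80u;
}

/* Signed 32-bit less-than, written with unsigned operations only. */
static inline int slt32(uint32_t x, uint32_t y) {
    return (x ^ 0x80000000u) < (y ^ 0x80000000u);
}

/* Naive: values live masked; sign extend before each arithop and mask
 * after it.  Computes (a + b + c) < limit on signed 8-bit values. */
static int naive_sum_lt(uint32_t a, uint32_t b, uint32_t c, uint32_t limit) {
    uint32_t t = mask8(sext8(a) + sext8(b));
    t = mask8(sext8(t) + sext8(c));
    return slt32(sext8(t), sext8(limit));
}

/* Lazy: add the raw registers and normalize only at the compare. */
static int lazy_sum_lt(uint32_t a, uint32_t b, uint32_t c, uint32_t limit) {
    uint32_t t = a + b + c;
    return slt32(sext8(t), sext8(limit));
}
```

Both functions return the same answer, but the naive one performs six
sign extensions and two masks where the lazy one performs two sign
extensions and no masks.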

> I'll finish hacking on IEEEReal.c today, so we should have a working
> port real soon.

Great.  BTW, I tweaked the runtime yesterday to compile all the
Real/*.c files with -O1 instead of -O2, since -O2 was causing problems
on Cygwin.  This may help you as well.