[MLton] power pc "port"

Filip Pizlo pizlo@purdue.edu
Sun, 5 Sep 2004 00:08:14 -0500 (EST)


> BTW, feel free to add comments to the code in places like this where
> you learned stuff that wasn't clear initially.  I have no problem with
> checking in patches containing only comments, and it'll make it easier
> for our rapidly growing developer base.

I added a comment explaining how fmt works.

By the way, I found the problem.  The problem is that the negation
function (Word8_neg, Word16_neg) is declared to operate over unsigned
words.  This makes GCC decide to compile Word8_neg as follows:

	neg r3,r3		; negate the argument
	rlwinm r3,r3,0,0xff	; keep only the low 8 bits, clear the rest

Likewise, Word16_neg is compiled as follows:

	neg r3,r3		; negate the argument
	rlwinm r3,r3,0,0xffff	; keep only the low 16 bits, clear the rest

What does this mean?  My PowerPC has only 32-bit registers (other
PowerPCs have 64-bit registers, which leads to even more fun), so signed
8-bit and 16-bit words get stuffed into 32-bit registers with sign
extension.  Ordinarily, the compiler picks instructions for these 8-bit
and 16-bit words that preserve the sign extension.  In the above case,
however, the compiler has quite blatantly blown the sign extension away.
Why?  Because the declared return type, as well as the argument type, is
unsigned.  By my reading of section 6.5.3.3 of the C standard, an operand
of an unsigned type narrower than int is first 'promoted' to int, and the
negation is done on the promoted value.  But then, because the return
type is unsigned, the result of the negation must be converted back to
unsigned.  It is this conversion back to unsigned that gets compiled as
a masking operation, and the mask is what blows away the sign extension.
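
To make the three steps concrete, here is a minimal sketch in C.  The
names neg_unsigned8 and check_neg_unsigned8 are made up for illustration
and are not MLton's actual declarations; the sketch only assumes a
function whose argument and return types are both unsigned 8-bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned 8-bit negation (not MLton's code). */
static uint8_t neg_unsigned8(uint8_t w) {
    int promoted = w;         /* integer promotion: 0..255 always fits in int */
    int negated = -promoted;  /* the negation itself happens at int width */
    return (uint8_t)negated;  /* conversion to unsigned reduces modulo 256 */
}

/* Widening the result to int shows that the sign extension is gone. */
static void check_neg_unsigned8(void) {
    int wide = neg_unsigned8(1);
    assert(wide == 255);      /* not -1: the upper bits were cleared */
}
```

The final cast is the step that corresponds to the rlwinm mask above: the
value that comes back is always in the range 0..255, so a caller that
expected a sign-extended -1 sees 255 instead.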

In any case, the fix seems to me to be to simply redeclare the _neg
functions to take and return signed types.  I'll try this out next.  But
it would be good to know if anyone can think of any complications with
doing this.
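
As a rough sketch of what a signed declaration would look like (again
with made-up names, neg_signed8 and check_neg_signed8, not the actual
MLton source), the version below keeps the sign when the result is
widened.  One possible complication that I am sure of from the standard:
negating the most negative value (-128 for 8 bits) gives 128 at int
width, and converting that back to a signed 8-bit type is
implementation-defined, so the sketch does not test that case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical signed variant of the 8-bit negation (not MLton's code). */
static int8_t neg_signed8(int8_t w) {
    /* w is promoted to int, negated, then converted back to int8_t. */
    return (int8_t)-w;
}

/* Widening the result to int now preserves the sign. */
static void check_neg_signed8(void) {
    int wide = neg_signed8(1);
    assert(wide == -1);
    assert(neg_signed8(-5) == 5);
    assert(neg_signed8(0) == 0);
}
```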

Overall, I think it would be a good idea to make the code that MLton
generates mark things signed and unsigned more carefully.  Right now, the
C variables that represent 8-bit words are always unsigned, even though
the value they represent may be signed.  Same with 16-bit words.  I fear
that this may cause other problems (either on PowerPC or on other
platforms), since the C compiler does not have accurate information about
whether or not these values should have a sign extension.
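
Here is a small sketch of the kind of problem I have in mind.  The
function names are hypothetical and the example is not taken from the
generated code; it only assumes a 16-bit value that is meant to be
signed but is held in an unsigned C variable.

```c
#include <assert.h>
#include <stdint.h>

/* The variable is declared unsigned, so C never sees a negative value. */
static int is_negative_as_declared(uint16_t w) {
    return w < 0;  /* w promotes to a non-negative value: always 0 */
}

/* Recover the intended signed value without relying on
   implementation-defined conversions. */
static int32_t sign_extend16(uint16_t w) {
    return (w & 0x8000u) ? (int32_t)w - 0x10000 : (int32_t)w;
}

static int is_negative_as_intended(uint16_t w) {
    return sign_extend16(w) < 0;
}

static void check_sign16(void) {
    uint16_t minus_one = 0xFFFF;  /* bit pattern of 16-bit -1 */
    assert(is_negative_as_declared(minus_one) == 0);
    assert(is_negative_as_intended(minus_one) == 1);
    assert(sign_extend16(0x7FFF) == 32767);
}
```

If the variable were declared with a signed type in the first place, the
C compiler could do the sign extension itself and no helper would be
needed.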

--
Filip Pizlo
http://bocks.psych.purdue.edu/
pizlo@purdue.edu