[MLton] power pc "port"

Sun, 5 Sep 2004 18:47:59 -0500 (EST)

> Your explanation of what the C standard says makes sense.  What I
> don't understand is why it would lead to an unexpected result.
> Suppose that w1 is of type Word8, i.e. unsigned char.  Your
> explanation says that Word8_neg (w) returns
> 
> 	(Word8)(0xFF & (- ((int)w)))
> 
> That seems correct to me.

It leads to an unexpected result when you then perform a signed
operation.  For example, suppose an ML program takes the value 1 of type
Int8, negates it, and then divides it by 10 using the quot function
(this is more or less where the fmt function was failing).  Let's look
at the steps that take place in the C code when run on a PowerPC:

1) The value 1 gets loaded into a register.  The register contains, in
binary, the bits 00000000000000000000000000000001.

2) The value 1 gets negated.  The register contains, in binary, the bits
11111111111111111111111111111111.  This is -1, which is what you
wanted.

3) Then the signed->unsigned conversion happens, and the upper 24 bits get
zeroed.  Now the register contains 00000000000000000000000011111111.  Note
that when you look at this as a 32-bit register (which is how the PowerPC
chip looks at it), you will see the value 255, regardless of whether you
interpret it as a signed or unsigned value.  This is not what you want.

4) Now the quot function gets called, and we do a divw (32-bit signed
division) with the denominator being 10.  The divw instruction sees 255/10
and generates 25 as the answer.  This is of course wrong, because what you
wanted was -1/10, which is 0.
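
The four steps above can be modelled in portable C by treating an int32_t
as the 32-bit register.  This is only a sketch of the register contents,
not the code MLton generates; the function names are made up for
illustration, and it assumes a two's complement machine.

```c
#include <stdint.h>

/* Hypothetical helper: models the signed->unsigned conversion of step 3,
   which keeps the low 8 bits and zeroes the upper 24 bits. */
int32_t truncate_to_word8(int32_t reg) {
    return reg & 0xFF;
}

/* Hypothetical helper: walks through steps 1-4 for a given value. */
int32_t quot_without_sign_extension(int32_t value, int32_t divisor) {
    int32_t reg = value;           /* step 1: load the value            */
    reg = -reg;                    /* step 2: negate, all 32 bits       */
    reg = truncate_to_word8(reg);  /* step 3: upper 24 bits get zeroed  */
    return reg / divisor;          /* step 4: 32-bit signed division    */
}
```

Calling quot_without_sign_extension (1, 10) divides 255 by 10 and returns
25, rather than the 0 that -1/10 should give.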

Hence my fix was to make functions that rely on sign (like quot) first
perform a sign extension.  This makes the above example work because divw
now sees -1/10, which is what you wanted.
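
Here is a rough sketch of the idea, again modelling the register as an
int32_t.  The names sign_extend8 and quot_with_sign_extension are
hypothetical and are not the names used in Word.c; the xor/subtract trick
is just one portable way to sign-extend the low 8 bits.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 8 bits of a 32-bit value.
   Flipping bit 7 and then subtracting 0x80 maps 0..127 to themselves
   and 128..255 to -128..-1, without relying on implementation-defined
   narrowing conversions. */
int32_t sign_extend8(int32_t reg) {
    return ((reg & 0xFF) ^ 0x80) - 0x80;
}

/* Hypothetical helper: a signed 8-bit quot that does not trust the
   upper 24 bits of its argument. */
int32_t quot_with_sign_extension(int32_t reg, int32_t divisor) {
    return sign_extend8(reg) / divisor;  /* C99 '/' truncates toward zero */
}
```

With the register holding 255 (the zero-extended form of -1),
quot_with_sign_extension (255, 10) sees -1/10 and returns 0.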

> Consider the following C program.
> 
> ----------------------------------------------------------------------
> #include <stdio.h>
> 
> typedef unsigned char Word8;
> 
> Word8 Word8_neg (Word8 w) {
> 	return -w;
> }
> 
> int main () {
> 	int i;
> 	Word8 w1, w2, w3;
> 
> 	for (i = 0; i <= 255; ++i) {
> 		w1 = i;
> 		w2 = (Word8)(0xFF & (- ((int)w1)));
> 		w3 = Word8_neg (w1);
> 		fprintf (stderr, "%d  %d  %d\n", (int)w1, (int)w2, (int)w3);
> 	}
> 	return 0;
> }
> ---------------------------------------------------------------------- 
>
> This program produces identical output on my x86, Sparc, and G5
> machines.  And the output is exactly what I would expect.  What do you
> see on your G4 machine?

I've attached the output.

There are two problems with the test:

1) In this test, the compiler has accurate type information.  In this
sense it is not consistent with what is happening in the MLton backend,
where a variable of type Word8 may be passed to WordS8_neg, the
implementation of which (see Word.c) takes a signed char, while its
declaration as generated by MLton takes an unsigned char.

2) This test performs no operations where the sign would actually matter.
Let's assume that w1 is 1, and the following line of code from your test 
gets executed:

>               w2 = (Word8)(0xFF & (- ((int)w1)));

w2 will now contain 255.  But now imagine that the value in w2 is
reinterpreted (by way of a function prototype that does not match its
definition) as a signed char, AFTER being passed as an argument through a
32-bit register.  Then imagine that this function performs division.  The
division will see 255 instead of -1, and you're stuck with an incorrect
result.

--
Filip Pizlo
http://bocks.psych.purdue.edu/
pizlo@purdue.edu