[MLton] power pc "port"

Stephen Weeks MLton@mlton.org
Sun, 5 Sep 2004 22:11:02 -0700


Thanks for the detailed explanation, Filip.  Now I understand.  The
problem was not with Word8_neg, but with WordS8_quot (and others).
Your key observation was that the prototype that MLton generates for
WordS8_quot did not match the prototype used in its definition in
Word.c.  Here is the prototype MLton generates:

	Word8 WordS8_quot (Word8 x1, Word8 x0);

and here is the prototype used in the definition:

	WordS8 WordS8_quot (WordS8 x1, WordS8 x0);

The difference is that the arguments and result are unsigned in the
generated prototype but signed in the definition.

Here is a simple C program that demonstrates the problem.  It requires
two files: a.c and b.c.

----------------------------------------------------------------------
// a.c
#include <stdio.h>

typedef unsigned char Word8;
typedef Word8 WordU8;
typedef signed char WordS8;

Word8 Word8_neg (Word8 w) {
	return -w;
}

// Same shape as the prototype MLton generates; b.c defines it with WordS8.
Word8 WordS8_quot (Word8 w1, Word8 w2);

int main () {
	// Word8_neg (1) is 255, which is -1 when read as a signed 8-bit value.
	fprintf (stderr, "%d\n", (int)(WordS8_quot (Word8_neg (1), 10)));
	return 0;
}
----------------------------------------------------------------------
// b.c
typedef unsigned char Word8;
typedef Word8 WordU8;
typedef signed char WordS8;

WordS8 WordS8_quot (WordS8 w1, WordS8 w2) {
	return w1 / w2;
}
----------------------------------------------------------------------

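Compiling these with gcc -O1 yields a program that prints "0" on my
x86 machine, "25" on my G5, and "0" on my Sparc.  The intended answer
is "0", since -1 / 10 = 0; "25" is 255 / 10, which is what you get if
the first argument is read as the unsigned value 255.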

Correcting the prototype in a.c to

	WordS8 WordS8_quot (WordS8 w1, WordS8 w2);

corrects the problem, causing the G5 executable to print "0".

My conclusion is that gcc uses different calling conventions for
signed and unsigned chars on G5, but not on x86 or Sparc.

The only fix that I see is to change MLton's FFI so that it keeps
track of the difference between signed and unsigned words of various
sizes so that we generate the appropriate prototypes in the generated
C code.  This does not require the compiler to distinguish between
signed and unsigned words in its ILs, merely to keep enough
information so that it can follow the conventions used by the outside
world (i.e., C) when it needs to communicate with it.

Does this make sense?  Any other ideas?

If we're in agreement on the solution, I'm happy to make the changes
tomorrow (Monday).  It shouldn't be too bad: it means modifying
atoms/c-type.{fun,sig} to add the new types and making small changes
in elaborate/elaborate-core.fun to draw the distinctions.