[MLton] MLton broken FFI on AMD64???

Thu Feb 17 10:43:16 PST 2011

On Thu, Feb 17, 2011 at 10:56 AM, Wesley W. Terpstra <wesley at terpstra.ca> wrote:
> On Mon, Feb 14, 2011 at 7:07 PM, Matthew Fluet <matthew.fluet at gmail.com>
> wrote:
>>
>> BTW, it occurs to me that there isn't a good solution to this problem
>> with the C codegen.  With the C codegen, MLton emits prototypes for
>> _import-ed functions that are derived from the type of the imported
>> function.  The prototype assumes that the function is a non-varargs.
>
> That's a good point. Yuck.
>
>> [There is a second issue with Henry's particular example, where the
>> MLton emitted prototype for printf disagrees with the prototype
>> exported by stdio.h, so one actually gets a compile error there.]
>
> It's not just printf. I know of several platforms where system calls in C go
> through some header magic-fu. There are #define's that rename things
> depending on the _XOPEN_SOURCE / etc selection. Importing *any* symbols from
> system libraries is rather error-prone.

Sure, there could be symbols that work in a C file because they are
cpp-ed down to the symbol that actually appears in the C library.
Although, POSIX is pretty clear on what things might be implemented by
macros and what things must be true symbols.  But, of course, that
doesn't mean everyone follows it.
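
A minimal sketch of the macro-renaming hazard, in C. All names here
(lib_length, lib_length_v2, call_through_header) are hypothetical and only
illustrate the pattern; no real system header is being quoted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical library: the only symbol the linker ever sees is
 * lib_length_v2. */
size_t lib_length_v2(const char *s)
{
    return strlen(s);
}

/* Hypothetical header line: C callers write lib_length(...), and cpp
 * rewrites it to lib_length_v2(...) before the compiler proper runs. */
#define lib_length lib_length_v2

/* Works in C because the header was included; the object file contains
 * no symbol named "lib_length" at all. */
size_t call_through_header(const char *s)
{
    return lib_length(s); /* expands to lib_length_v2(s) */
}
```

An _import of "lib_length" by name never sees the #define, so it would
look for a symbol that does not exist in the library.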

> Really, the best solution is to do like the MLton runtime: write your own
> functions that do the C calls and import them instead. That works with the
> system header defines/etc (and also dodges the varargs problem). The
> overhead is quite low; gcc optimizes those proxy methods down to a single
> branch instruction if the prototypes match.

It only dodges the varargs problem to the degree that one only writes
non-varargs functions.  But that seems to be a reasonable restriction.
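
A minimal sketch of such a proxy function, in C. The name my_print_int
and its signature are hypothetical, chosen only for illustration; the
point is that the wrapper has a fixed arity and is compiled against
stdio.h, so the C compiler sees the real printf prototype.

```c
#include <stdio.h>

/* Hypothetical fixed-arity wrapper around printf.  The varargs call
 * happens here, inside C, where the real prototype from stdio.h is in
 * scope.  The caller on the ML side would _import "my_print_int" at a
 * fixed type instead of importing printf itself. */
int my_print_int(const char *label, int value)
{
    /* Returns the number of characters written, as printf does. */
    return printf("%s: %d\n", label, value);
}
```

Each distinct argument shape needs its own wrapper, which is the
restriction to non-varargs functions mentioned above.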

>> I suppose we could also be conservative here and always emit a varargs
>> prototype.
>
> What do you propose? Append a ", ..." to the end of the prototype?

That's what I meant by being conservative.

> I don't know of any architecture where it is a problem, but might passing a
> variable argument and a normal argument differ on an input-by-input basis?
> Imagine "void foo(int x, ...);" where passing the first integer goes as
> 32-bit on the stack but the second gets sign-extended to 64-bit. I don't
> think anything forbids such a hypothetically evil ABI.

Well, I always thought that there wouldn't be an ABI where the varargs
and non-varargs calling conventions would be different.  What we
see with amd64 is that the varargs convention is compatible with the
non-varargs, but not the other way around.  As you say, there seems to
be little that would prevent an ABI from having incompatible varargs
and non-varargs conventions.
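
For reference, a minimal C sketch of the two kinds of function being
compared.  The names add_fixed and sum_varargs are hypothetical.  In the
System V AMD64 ABI, the caller of a variadic function additionally sets
%al to an upper bound on the number of vector registers used for
arguments; a call compiled against a non-variadic prototype has no reason
to do so, which is one way the asymmetry above can arise.

```c
#include <stdarg.h>

/* Fixed arity: the prototype tells the compiler the type of every
 * argument, so it can use the ordinary calling convention. */
double add_fixed(int n, double x)
{
    return n + x;
}

/* Variadic: only `count` is named.  The remaining arguments are fetched
 * with va_arg, so the callee depends on the caller having followed the
 * varargs convention.  Every extra argument must be a double. */
double sum_varargs(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);

    return total;
}
```

A C caller gets the right convention automatically because the "..." is
visible in the prototype; a prototype derived only from the _import type
carries no such information.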

>> The other alternative is to simply not support _import-ing of varargs
>> functions.
>
> Whatever we do, we should do it consistently across the codegens.

Agreed.  Though, for many of the same reasons, there doesn't seem
to be any way to detect/prevent the _import of a varargs function (at
a fixed number of arguments); it just means that on some platforms and
some codegens it would work and on others it would not.

> If we can't support C properly, then maybe varargs is a losing battle.
> On the other hand, people rather expect 'printf' to work.

I've certainly used the _import "printf" trick when needing a
meaningful, but small program with which to debug one of the compiler
passes.  But, apparently, never before on amd64.

After this discussion, I'm leaning towards not supporting varargs functions.