[MLton] C codegen and world* regressions

Matthew Fluet fluet@cs.cornell.edu
Sat, 17 Jun 2006 16:49:38 -0400 (EDT)


>>    val f = _import "f": unit -> bool;
> ...
>> I point this out because if the C-function f returns a value which
>> is neither zero nor one, then the program compiled with two
>> different codegens might exhibit different behavior.
>>
>> I think this is o.k.; all it means is that an ML 'bool' is not the
>> same as a C conditional expression; rather it is closer to the C99
>> _Bool type. Using a value not equal to either zero or one for an ML
>> 'bool' leads to undefined behavior.
>
> That seems OK to me too.  The old runtime/basis was sloppy about this,
> confusing bools and ints.  Hopefully the new runtime will completely
> clarify things.  Also once this is all done, the representation pass
> can be tweaked so it packs booleans as single bits -- right now it
> keeps them as 32 bits, at least partially because of confusion between
> C booleans and ints.

I guess I don't quite follow.  As of right now, while we try to be a 
little better about not confusing ints and bools, I don't think we're 
getting any more guarantees.  The Bool_t typedef in ml-types.h (which is 
what the C-side uses for an ML bool) simply has

   typedef int32_t Int32_t;
   typedef Int32_t Bool_t;

So, even if we promise to return an ML bool by using Bool_t, C doesn't 
help us much, since it will happily allow

   Bool_t foo() {
     return 42;
   }

and foo will pass 42 back as the return value.

So, I don't know how things are clarified.  We're trying to document our 
intentions better, but I don't think we have any more guarantees.

Maybe you are suggesting that we have

    typedef bool Bool_t;
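
To make the difference between the two typedefs concrete, here is a minimal 
sketch, assuming a C99 compiler with <stdbool.h>.  The names Bool32_t, 
BoolC_t, foo32, fooC and check_foo are hypothetical and do not come from the 
runtime.  With the integer typedef, 42 is passed back as is; with C99 bool, 
the conversion to _Bool turns any non-zero value into 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef int32_t Bool32_t; /* mirrors the Int32_t-based Bool_t above */
typedef bool BoolC_t;     /* the C99 alternative */

/* With an integer typedef, 42 is returned unchanged. */
Bool32_t foo32(void) { return 42; }

/* With C99 bool, the return value is converted to _Bool, so any
   non-zero value becomes exactly 1. */
BoolC_t fooC(void) { return 42; }

/* Small self-check of the two behaviours. */
void check_foo(void) {
  assert(foo32() == 42);
  assert(fooC() == 1);
}
```

Note that this only covers values the C compiler itself converts to bool; it 
says nothing about how wide the value is or how an array of them is laid out.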


> I also see that
>
>  http://mlton.org/ForeignFunctionInterfaceTypes
>
> says that the SML bool type is equivalent to the C Int32 type.  We'll
> need to change this to C "bool" type and to warn people about the new,
> more precise type (and the fact that the value must be 0 or 1).

I don't know if that quite works, because I'm not sure that the natural C 
representation of bool[] is necessarily equal to MLton's representation of 
bool array.  In particular, we could turn it into a bit-array, while I 
think C will make it an 8-bit char array.  So you still have to pin things 
down to what the C-side does.
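
A minimal sketch of the layout mismatch, assuming C99.  The packed layout 
here (one bit per element, least significant bit first) is only one 
hypothetical choice a compiler could make; it is not MLton's actual 
representation, and the names N, packed_get, packed_set and check_layout are 
made up for illustration.  sizeof(bool) is implementation-defined, though it 
is commonly 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { N = 64 }; /* number of boolean elements in both layouts */

/* Hypothetical bit-packed layout: element i lives in bit (i % 8)
   of byte (i / 8). */
bool packed_get(const uint8_t *bits, size_t i) {
  return (bits[i / 8] >> (i % 8)) & 1u;
}

void packed_set(uint8_t *bits, size_t i, bool v) {
  uint8_t mask = (uint8_t)(1u << (i % 8));
  if (v)
    bits[i / 8] |= mask;
  else
    bits[i / 8] &= (uint8_t)~mask;
}

/* The two layouts have different sizes, so a C function expecting
   bool[N] cannot be handed the packed buffer directly. */
void check_layout(void) {
  bool natural[N] = {false};
  uint8_t packed[N / 8] = {0};
  assert(sizeof natural >= (size_t)N); /* at least one byte per element */
  assert(sizeof packed == N / 8);      /* exactly one bit per element */
  packed_set(packed, 10, true);
  assert(packed_get(packed, 10));
  assert(!packed_get(packed, 9));
}
```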

Would removing bool from the FFI types be a horrible idea?

> I looked at all the uses of Bool_t in basis-ffi.h to see if the
> functions exported by the runtime were confusing bools and ints.  I
> found a few that I'm not sure about.

Yeah, I added looking for uses of Bool_t to my todo.

>  1. The functions
>
>    PosixFileSys_ST_is{Blk,Chr,Dir,FIFO,Link,Reg,Sock}
>
>  all return Bool_t by calling the corresponding macro
>
>    S_IS{BLK,CHR,DIR,FIFO,LNK,REG,SOCK}
>
>  On my Linux machine these all expand to relational expressions (==),
>  but I don't know if the standard (or standard practice) guarantees
>  these to be booleans.

http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html

The following macros shall be provided to test whether a file is of the 
specified type. The value m supplied to the macros is the value of st_mode 
from a stat structure. The macro shall evaluate to a non-zero value if the 
test is true; 0 if the test is false.
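
Since the text above promises only a non-zero value for true, a wrapper that 
must return exactly 0 or 1 can compare the macro result against zero.  A 
minimal sketch, assuming a POSIX system; st_is_dir and check_st_is_dir are 
hypothetical names, not the runtime's functions, and Bool_t is redeclared 
locally to match the typedef quoted earlier.

```c
#define _XOPEN_SOURCE 700 /* for S_IFDIR / S_IFREG in the self-check */
#include <assert.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

typedef int32_t Bool_t; /* same shape as the typedef quoted earlier */

/* Hypothetical wrapper: the != 0 comparison yields exactly 0 or 1,
   whatever non-zero value S_ISDIR happens to produce. */
Bool_t st_is_dir(mode_t m) {
  return S_ISDIR(m) != 0;
}

/* Self-check using hand-built mode values. */
void check_st_is_dir(void) {
  assert(st_is_dir(S_IFDIR | 0755) == 1);
  assert(st_is_dir(S_IFREG | 0644) == 0);
}
```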

> 2. The function Posix_ProcEnv_isatty returns a Bool_t by calling
>   isatty, which on my machine is specified to return an int.

That's probably right.

> 3. The functions
>
>     Posix_Process_if{Exited,Signaled,Stopped}
>
>   all return Bool_t by calling the corresponding macro
>
>     WIF{EXIT,SIGNAL,STOPP}ED
>
>   Again, on my machine these all expand to relational expressions,
>   but I don't know if the standard guarantees these to be booleans.

http://www.opengroup.org/onlinepubs/009695399/functions/wait.html

WIFEXITED(stat_val)
     Evaluates to a non-zero value if status was returned for a child
     process that terminated normally.
WIFSIGNALED(stat_val)
     Evaluates to a non-zero value if status was returned for a child
     process that terminated due to the receipt of a signal that was
     not caught (see <signal.h>).
WIFSTOPPED(stat_val)
     Evaluates to a non-zero value if status was returned for a child
     process that is currently stopped.
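
The same "non-zero" wording applies here, so the same normalization works 
for the wait-status macros.  A minimal sketch, assuming a POSIX system; 
if_exited, if_signaled, if_stopped and wait_for_exiting_child are 
hypothetical names and not the runtime's actual functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int32_t Bool_t; /* same shape as the typedef quoted earlier */

/* Hypothetical wrappers: force each macro result to exactly 0 or 1. */
Bool_t if_exited(int status)   { return WIFEXITED(status) != 0; }
Bool_t if_signaled(int status) { return WIFSIGNALED(status) != 0; }
Bool_t if_stopped(int status)  { return WIFSTOPPED(status) != 0; }

/* Helper for trying the wrappers: fork a child that exits at once
   with `code` and store its wait status in *status.
   Returns 0 on success, -1 if fork or waitpid fails. */
int wait_for_exiting_child(int code, int *status) {
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit(code); /* child: terminate normally */
  if (waitpid(pid, status, 0) != pid)
    return -1;
  return 0;
}
```

For a child that terminated normally, if_exited returns 1 and the other two 
return 0, regardless of which non-zero value the underlying macro produces.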

> I wonder if the right thing to do for situations where we are unsure
> or have C functions that return int instead of bool, is to expose them
> as int in the FFI and convert the int type to bool in SML code with a
> comparison to zero.

If we disallowed bool in indirect FFI types (i.e., bool ref, bool array, 
bool vector), I could imagine implementing this as a type directed 
elaboration of FFI functions.