[MLton] bool and MLton FFI

Matthew Fluet fluet@cs.cornell.edu
Fri, 23 Jun 2006 07:09:11 -0400 (EDT)


> Exposing bool in the FFI means that either SML bool must have the same
> representation as C bool or there must be a conversion between the
> representations.

Correct.

> Making the representations identical is a bad idea
> because SML bool is always one of two values, whereas C bool could
> take on more values depending on the size of the type.

This statement isn't quite right.  I believe that a C bool should only take 
on the values 0 or 1.  For example, converting some scalar to C bool should 
result in 0 or 1:

[fluet@localhost temp]$ cat z.c
#include <stdbool.h>

bool f (int i) {
   return (bool)i;
}
[fluet@localhost temp]$ gcc -S z.c
[fluet@localhost temp]$ cat z.s
 	.file	"z.c"
 	.text
.globl f
 	.type	f, @function
f:
 	pushl	%ebp
 	movl	%esp, %ebp
 	cmpl	$0, 8(%ebp)
 	setne	%al
 	movzbl	%al, %eax
 	leave
 	ret
 	.size	f, .-f
 	.ident	"GCC: (GNU) 4.0.2 20051125 (Red Hat 4.0.2-8)"
 	.section	.note.GNU-stack,"",@progbits
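
The same point can be checked at run time.  The following is a minimal 
sketch, not part of MLton; the names bool_as_int and check_bool_conversion 
are made up for illustration.  It relies on the C99 rule (6.3.1.2) that 
converting a scalar to _Bool yields 0 if the value compares equal to 0 and 
1 otherwise:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: convert an int to C bool, then widen the result
   back to int so the stored value can be inspected. */
int bool_as_int(int i) {
    bool b = (bool)i;   /* conversion compares i against zero */
    return (int)b;      /* always 0 or 1 */
}

/* Hypothetical self-check over a few representative inputs. */
void check_bool_conversion(void) {
    assert(bool_as_int(0) == 0);
    assert(bool_as_int(1) == 1);
    assert(bool_as_int(2) == 1);
    assert(bool_as_int(-1) == 1);
    assert(bool_as_int(256) == 1);  /* low byte is zero, value is not */
}
```

Calling check_bool_conversion() from a main function should pass silently 
on a conforming compiler.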

Assuming the user is using C correctly, then we ought to be able to assume 
that a C bool is either 0 or 1.  All bets are off if the user violates the 
C spec:

[fluet@localhost temp]$ cat z.c
#include <stdbool.h>

bool g (int *ip) {
   bool *bp = (bool*)ip;
   return *bp;
}
[fluet@localhost temp]$ gcc -S z.c
[fluet@localhost temp]$ cat z.s
 	.file	"z.c"
 	.text
.globl g
 	.type	g, @function
g:
 	pushl	%ebp
 	movl	%esp, %ebp
 	subl	$16, %esp
 	movl	8(%ebp), %eax
 	movl	%eax, -4(%ebp)
 	movl	-4(%ebp), %eax
 	movzbl	(%eax), %eax
 	movzbl	%al, %eax
 	leave
 	ret
 	.size	g, .-g
 	.ident	"GCC: (GNU) 4.0.2 20051125 (Red Hat 4.0.2-8)"
 	.section	.note.GNU-stack,"",@progbits
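
For contrast, here is a sketch of a well-defined variant (g_checked is a 
hypothetical name, not something from the code above).  It converts the 
pointed-to int instead of reinterpreting its storage, so a value such as 
256, whose low byte is zero on little-endian x86, still comes out true:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical well-defined counterpart to g: read the int as an int and
   convert its value to bool, rather than reading its first byte through
   a bool pointer. */
bool g_checked(const int *ip) {
    return *ip != 0;    /* comparison result is exactly 0 or 1 */
}
```

The design choice is simply to keep the conversion at the value level, 
where the C rules guarantee a 0 or 1 result.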

> Automatic conversion requires getting information about
> the size of C bool into the compiler.  It seems easiest to me to get
> this information to the elaborator, and have it insert the coercions,
> so the rest of the compiler doesn't have to worry about it.

Agreed that the elaborator is the easiest place to insert coercions. 
Note, we do a coercion like this in the elaboration of _export.
(Mental note: another place that needs to handle the transition to a 
platform-dependent C 'bool' is basis-library/mlton/ffi.sml.  This 
particular instance is probably best handled by a C_Bool structure and the 
corresponding C_Bool_ChooseWordN functor.)
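
To make the size dependence concrete, here is a rough sketch of the C-side 
view of such a coercion.  Everything in it is an assumption for 
illustration: the sml_word32 typedef stands in for whatever fixed-width 
word the SML side would use, and none of these functions exist in MLton.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width word standing in for the SML-side value. */
typedef int32_t sml_word32;

/* Width of C bool on this platform, in bits.  A ChooseWordN-style
   selection would need this number; it is platform dependent. */
size_t c_bool_bits(void) {
    return sizeof(bool) * CHAR_BIT;
}

/* Coerce a C bool to the fixed-width word: always 0 or 1. */
sml_word32 word_of_c_bool(bool b) {
    return b ? 1 : 0;
}

/* Coerce the fixed-width word back to a C bool: any nonzero word is true. */
bool c_bool_of_word(sml_word32 w) {
    return w != 0;
}
```

The coercions themselves do not depend on the width; only the choice of 
word type on the SML side does, which is the piece of information that 
has to reach the compiler.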

> I continue to think that the bool vector issue is a red herring.  If
> bool is in the FFI but bool vector isn't, then the only place the
> conversion has to happen (either manually or automatically) is on the
> bool type.  Once the C bool is converted to an SML bool, the
> right stuff will just happen if that bool is put in a bool vector.

I agree that it is a red herring, inasmuch as a 'bool' in a bit-vector 
representation of 'bool vector' is not an SML bool -- a coercion must be 
happening at every sub and update.

But, there is the incongruity of allowing both 'Int32.int' and 'bool' in 
the FFI, but not both 'Int32.int vector' and 'bool vector'.

> I dislike (1) because it forces programmers to write C wrappers for
> functions that deal with C bools.  It's always good to avoid C.

I don't think C bool is used that extensively.  All of the places I used 
Bool.t in basis-ffi.def, the C side really was an 'int' expression.

> I think (3) is more intuitive, although I agree that the conversion is 
> special and not done anywhere else.  I view the conversion as making the 
> outside world look nice to SML (converting everything so we can deal 
> with the nice bool datatype) rather than exposing SML bool to the FFI.

That's a decent argument.