[MLton] C codegen and world* regressions

Matthew Fluet fluet@cs.cornell.edu
Sat, 17 Jun 2006 13:57:57 -0400 (EDT)


> The problem is due to changing the primtive from
>  val save = _prim "World_save": C.Fd.t -> unit
> to
>  val save = _prim "World_save": NullString8_t -> bool C_Error.t
>
> What happens is that the late stages of the compiler turn this primitive into 
> a C call.  When the program is saving a world, the ML code calls 
> GC_saveWorld, and, in the absence of errors, control returns to the ML code 
> _with a true return value_.  When the program loads the saved world, the 
> runtime initialization prepares the ML heap and then control returns to the 
> ML code _without a return value_.  Luck of the draw, it appears that the main 
> function in x86-main.h happens to leave a non-zero value in %eax.

It occurs to me that this isn't quite a satisfactory explaination as to 
why things worked with the native codegen.  In particular, Wesley's 
original change returned 'false' when the save succeeded and 'true' when 
the save failed, and my subsequent change reversed the meaning to return 
'true' when the save succeeded and 'false' when the save failed.  Neither 
change modified x86-main.h, so I would expect the jumpToML code to be 
compiled the same, meaning that the (garbage) value in %eax should be the 
same in both versions.  But, that value can't be both 'true' and 'false'. 
Or can it? ;-)

In Wesley's version, he had

             val () =
                SysCall.simple'
                ({errVal = true},
                 fn () => Prim.save (NullString.nullTerm file))

which induces a comparison between the return value and 'true'.  The 
representation pass represents 'true' as 1, so the native-codegen emits:

         cmpl $0x1,%eax
         je L_true_branch
   L_false_branch:

In my version, I had

             val () =
                SysCall.simple'
                ({errVal = false},
                 fn () => Prim.save (NullString.nullTerm file))

which induces a comparison between the return value and 'false'.  The 
representation pass represents 'false' as 0, so the native-codegen emits:

         testl %eax,%eax
         jz L_true_branch
   L_false_branch:

So, an %eax value not-equal to either 0 or 1 always follows the 
L_false_branch (that is, the non-error branch).  Now, it seems pretty 
likely that garbage in %eax is some value that is neither 0 nor 1.

I note that this is a general 'issue' with MLton's FFI; a program like 
this:

   val f = _import "f": unit -> bool;
   val () = if f () = true
               then ...
               else ...

yields Machine IL code like:

     CCall {args = (),
 	   frameInfo = Some {frameLayoutsIndex = 53},
 	   func = {args = (),
 		   bytesNeeded = None,
 		   convention = cdecl,
 		   ensuresBytesFree = false,
 		   mayGC = true,
 		   maySwitchThreads = false,
 		   modifiesFrontier = true,
 		   prototype = {args = (), res = Some Int32},
 		   readsStackTop = true,
 		   return = Word32,
 		   target = f,
 		   writesStackTop = true},
 	   return = Some L_877}
L_877: {kind = CReturn {dst = Some RW32(0): Word32,
 			frameInfo = Some {frameLayoutsIndex = 53},
 			func = {args = (),
 				bytesNeeded = None,
 				convention = cdecl,
 				ensuresBytesFree = false,
 				mayGC = true,
 				maySwitchThreads = false,
 				modifiesFrontier = true,
 				prototype = {args = (), res = Some Int32},
 				readsStackTop = true,
 				return = Word32,
 				target = f,
 				writesStackTop = true}},
 	live = (SP(16): Pointers (pt_7),
 		SP(44): Pointers (pt_23),
 		SP(40): Pointers (pt_22)),
 	raises = None,
 	returns = Some ()}
     RW32(1): Word32  = Word32_equal (RW32(0): Word32, 0x1)
     switch {test = RW32(1): Word32,
 	    default = None,
 	    cases = ((0x0, L_876), (0x1, L_864))}

So, every codegen will compare the result value to 0x1.  Similarly, a 
program like:

   val f = _import "f": unit -> bool;
   val () = if f () = true
               then ...
               else ...

will yield a Machine IL proram that compares the result value to 0x0.

Meanwhile, a program like:

   val f = _import "f": unit -> bool;
   val () = if f ()
               then ...
               else ...

yields a Machile IL program with:

     CCall {args = (),
 	   frameInfo = Some {frameLayoutsIndex = 53},
 	   func = {args = (),
 		   bytesNeeded = None,
 		   convention = cdecl,
 		   ensuresBytesFree = false,
 		   mayGC = true,
 		   maySwitchThreads = false,
 		   modifiesFrontier = true,
 		   prototype = {args = (), res = Some Int32},
 		   readsStackTop = true,
 		   return = Word32,
 		   target = f,
 		   writesStackTop = true},
 	   return = Some L_877}
L_877: {kind = CReturn {dst = Some RW32(0): Word32,
 			frameInfo = Some {frameLayoutsIndex = 53},
 			func = {args = (),
 				bytesNeeded = None,
 				convention = cdecl,
 				ensuresBytesFree = false,
 				mayGC = true,
 				maySwitchThreads = false,
 				modifiesFrontier = true,
 				prototype = {args = (), res = Some Int32},
 				readsStackTop = true,
 				return = Word32,
 				target = f,
 				writesStackTop = true}},
 	live = (SP(16): Pointers (pt_7),
 		SP(44): Pointers (pt_23),
 		SP(40): Pointers (pt_22)),
 	raises = None,
 	returns = Some ()}
     switch {test = RW32(0): Word32,
 	    default = None,
 	    cases = ((0x0, L_876), (0x1, L_864))}

The semantics of a Machine IL 'switch' transfer is that the 'cases' (+ 
'default') exhaustively cover the 'test' value.  So, the codegens are free 
to optimize this to a comparison with either 0x0 or 0x1, assuming that an 
unequal comparison implies that the value is the other value. I point this 
out because if the C-function f returns a value which is neither zero nor 
one, then the program compiled with two different codegens might exhibit 
different behavior.

I think this is o.k.; all it means is that an ML 'bool' is not the same as 
a C conditional expression; rather it is closer to the C99 bool_t type. 
Using a value not equal to either zero or one for an ML 'bool' leads to 
undefined behavior.

> 2) return the status of the save out-of-band; this is essentially the
>    solution used by the thread primitives; we could add a
>      val getSaveStatus =
>         _import "GC_saveWorldSatus" : GCState.t -> bool C_Errno.t
>    and check the status after the save.  I think this is the way to go.

I've checked in a fix along these lines.