[MLton] interrupted system call

Stephen Weeks MLton@mlton.org
Fri, 26 Mar 2004 17:45:06 -0800


> It is an optimization, but critical sections are doing more work as
> a consequence of this discussion.  In particular, if atomicEnd
> necessitates a GC_gc in the CFG, then we'll be forcing more SSA
> variables onto the stack.  That's an expense.

OK.  That could be worth it.

> >       fun call (err: int -> 'a): 'a =
> >          let
> >             val () = atomicBegin ()
> >             val (n, post) = f ()
> >          in
> >             if n = ~1
> >                then (atomicEnd (); err (getErrno ()))
> >             else (post () before atomicEnd ())
> >          end
> 
> atomicEnd will invoke the signal handler; so, errno could be bogus when we
> continue executing here.  So, we need to getErrno before atomicEnd.

Agreed.
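
For concreteness, here is a minimal sketch of call with errno read
before atomicEnd.  The definitions of atomicBegin, atomicEnd and
getErrno below are hypothetical stand-ins (atomicEnd simply clobbers
errno to model a handler running there), and passing f as an argument
is my assumption to keep the example self-contained.

```sml
(* Hypothetical stand-ins, not the MLton runtime primitives. *)
val depth = ref 0
val errno = ref 0
fun atomicBegin () = depth := !depth + 1
(* Model a signal handler that runs at atomicEnd and overwrites errno. *)
fun atomicEnd () = (depth := !depth - 1; errno := 0)
fun getErrno () = !errno

fun call (f: unit -> int * (unit -> 'a), err: int -> 'a): 'a =
   let
      val () = atomicBegin ()
      val (n, post) = f ()
   in
      if n = ~1
         then
            let
               (* Read errno while still inside the critical section. *)
               val e = getErrno ()
            in
               atomicEnd (); err e
            end
      else post () before atomicEnd ()
   end
```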

> Note also that this is a situation where atomicBegin/End is simpler (and
> obviously correct) than MLton.Thread.atomically.  You can do it with
> atomically, but it's just more CPS thunking.

I agree in general, but this one isn't too bad.

         atomically
         (fn () =>
          let
             val (n, post) = f ()
          in
             if n = ~1
                then let val e = getErrno () in fn () => err e end
             else post
          end) ()

It does make it clear that the atomicBegins and atomicEnds line up.
Suppose for example that f raises an exception :-).
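
To illustrate the exception point, here is a rough sketch of an
exception-safe atomically.  This is a hypothetical definition over
stand-in atomicBegin/atomicEnd counters, not a claim about how
MLton.Thread.atomically is actually implemented; getErrno is passed in
as an argument only to keep the example self-contained.

```sml
(* Hypothetical stand-ins for the runtime primitives. *)
val depth = ref 0
fun atomicBegin () = depth := !depth + 1
fun atomicEnd () = depth := !depth - 1

(* Assumed wrapper: atomicEnd runs whether f returns or raises. *)
fun atomically (f: unit -> 'a): 'a =
   let
      val () = atomicBegin ()
      val r = f () handle e => (atomicEnd (); raise e)
   in
      atomicEnd (); r
   end

fun call (f: unit -> int * (unit -> 'a),
          getErrno: unit -> int,
          err: int -> 'a): 'a =
   atomically
   (fn () =>
    let
       val (n, post) = f ()
    in
       if n = ~1
          then let val e = getErrno () in fn () => err e end
       else post
    end) ()
```

One thing the sketch makes visible: the thunk returned from the atomic
section is applied after atomicEnd, so post runs outside the critical
section here, whereas "post () before atomicEnd ()" runs it inside.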

> Don't forget !MLton.Signal.restart.

I wasn't sure about the interaction of that with the new restart
argument to syscall.  I thought that since you now explicitly passed
restart that maybe you didn't want the bool ref any more.  If you
still have the bool ref, then maybe it makes sense to get rid of the
restart argument, and use fluidLet when you need it?  I don't know.
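
By fluidLet I mean something along the following lines.  This is a
sketch only; the restart ref here is a local stand-in for
MLton.Signal.restart, and the helper's exact signature is my
assumption.

```sml
(* Temporarily set r to v while running th, restoring the old value
   whether th returns normally or raises. *)
fun fluidLet (r: 'a ref, v: 'a, th: unit -> 'b): 'b =
   let
      val old = !r
      val () = r := v
      val res = th () handle e => (r := old; raise e)
   in
      r := old; res
   end

(* Stand-in for the global flag discussed above. *)
val restart = ref true

(* Run one computation with restarting disabled. *)
val seen = fluidLet (restart, false, fn () => !restart)
```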

> val syscall: (unit -> int * (unit -> 'a)) * {restart: bool} -> 'a
...
> I understand that this is equivalent to my signature.  

I don't think they are equivalent.  I think mine is more powerful.
Abstracting a little bit, I don't see how

    signature S =
       sig
          val s: 'a -> 'b * (unit -> 'c)
       end

is equivalent to

    signature M =
       sig
          val m: ('a -> 'b) * ('b -> 'c)
       end

I can see how to go from M to S.

    functor MToS (M: M): S =
       struct
          fun s a =
             let
                val b = #1 M.m a
             in
                (b, fn () => #2 M.m b)
             end
       end

But I don't see how to go from S to M.

In practical terms, your type forces me to manually closure convert
when I want to communicate information from pre or call to post.  That
is, where I would write
  
   syscall (fn () =>
            let
               val x = ...
            in
               (_, fn () => ... x ...)
            end,
            {restart = true})

your type will force me to partially closure convert the second thunk
and put the closure record in the result of call.  That is painful.
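
A small sketch of the difference, with error handling and restart
omitted.  The names syscallS, syscallM and buf are hypothetical, and
both functions are toy stand-ins rather than the real syscall.

```sml
(* Staged type: the call itself builds the post closure. *)
fun syscallS (f: unit -> int * (unit -> 'a)): 'a =
   let val (_, post) = f () in post () end

(* Split type: post sees only what call returns. *)
fun syscallM {call: unit -> 'b, return: 'b -> int, post: 'b -> 'a}: 'a =
   let
      val b = call ()
      val _ = return b
   in
      post b
   end

val buf = "hello"   (* hypothetical value used by the call *)

(* With the staged type, post simply closes over x. *)
val a =
   syscallS (fn () =>
             let
                val x = String.size buf
             in
                (x, fn () => (x, buf))
             end)

(* With the split type, x must travel in an explicit closure record. *)
val b =
   syscallM {call = fn () =>
                       let
                          val x = String.size buf
                       in
                          (x, {x = x})
                       end,
             return = fn (n, _) => n,
             post = fn (_, {x}) => (x, buf)}
```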

> I find it harder to read at a glance to pick out the distinct
> portions of the code;

OK, but it also seems nice to read the entire behavior of the call in one
place, pretending that it is atomic.  I find it hard to see how

     syscall
     {restart = true,
      pre = fn (x, {a, b}) => (F_Foo_setA(a)
                               ; F_Foo_setB(b)
                               ; x),
      call = fn x => MLton_F x,
      return = fn n => n,
      post = fn _ => let val y = F_Bar_getY()
                         val z = F_Bar_getZ()
                     in {y = y, z = z}
                     end}

is more readable than

    syscall (fn () => (F_Foo_setA a
                       ; F_Foo_setB b
                       ; (MLton_F x,
                          fn () => {y = F_Bar_getY (),
                                    z = F_Bar_getZ ()})),
             {restart = true})

With my type, if you want to name parts of the computation, you can.

   syscall (fn () =>
            let
               fun pre () = (F_Foo_setA a
                             ; F_Foo_setB b
                             ; MLton_F x)
               fun post () = {y = F_Bar_getY (),
                              z = F_Bar_getZ ()}
            in
               (pre (), post)
            end,
            {restart = true})

You could even modify my type to force the naming

   val syscall: {call: unit -> {return: int} * {post: unit -> 'a},
                 restart: bool} -> 'a

Although I prefer the less verbose version.

Furthermore, going back to your type

   {wrapAtomic: bool,
    wrapRestart: bool,
    pre: 'a -> 'aa,
    call: 'aa -> 'bb,
    return: 'bb -> int,
    post: 'bb -> 'b} ->
   ('a -> 'b)

there are a couple things that seem arbitrary to me.  What is the
reason for separating pre from call and return from post?  What is the
reason for passing 'a to pre?  Taking those decisions the other way
gives

   {wrapAtomic: bool,
    wrapRestart: bool,
    call: unit -> 'a,
    return: 'a -> int,
    post: 'a -> 'b} -> 'b

That seems better to me, although it still suffers from the
inexpressiveness problem (and still has one too many type variables
:-).


Overall, this seems like a classic case of currying = staged
computation, and it seems wrong not to use it.