[MLton] interrupted system call

Matthew Fluet fluet@cs.cornell.edu
Thu, 25 Mar 2004 22:54:20 -0500 (EST)


> > If I always want a chance to abort the read when a signal arrives, then I
> > need to program differently, depending on what model is being used.
>
> Agreed, although I can't think of the situation when you would always
> want the chance (and I would like to see one).

Do you want a good situation?  Most situations that I can think of (e.g.,
a multithreaded network app reading from the network) can all be argued
against by using a non-blocking operation.

Ah, here's a better example: the poll system call.  poll can error
with EINTR.  Now the Basis Library spec says "The poll function will raise
OS.SysErr if, for example, one of the file descriptors refers to a closed
file.", so it may be reasonable for it to raise OS.SysErr on EINTR.  But,
we could also restart it.  By its very nature, poll is just the type of
thing we would like to interrupt and let a signal handler run, before
blocking signals and restarting the system call.
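
The "restart it" option can be sketched on top of the Basis Library as
below.  This is only a minimal sketch, not how MLton implements poll:
pollRestart is a hypothetical name, and it assumes that EINTR surfaces as
OS.SysErr carrying Posix.Error.intr.  It also reuses the original timeout on
every retry, so the total wait can exceed what was requested.

```sml
(* Hypothetical wrapper: restart OS.IO.poll whenever it fails with EINTR.
 * Any other OS.SysErr is re-raised unchanged. *)
fun pollRestart (descs: OS.IO.poll_desc list,
                 timeout: Time.time option): OS.IO.poll_info list =
   OS.IO.poll (descs, timeout)
   handle exn as OS.SysErr (_, SOME e) =>
      if e = Posix.Error.intr
         then pollRestart (descs, timeout)  (* interrupted: try again *)
      else raise exn
```

A caller who wants the chance to abort on a signal would call OS.IO.poll
directly and handle OS.SysErr itself.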

> > So, I still claim that with 2abc, you need to program along A.
>
> Agreed.  How about the following tweak?
>
> In the case where one wants to abort the read, it does seem simpler
> programming to do it via an exception handler rather than a signal
> handler.  If we need this, perhaps we could add a bool ref that lets
> one control the restarting behavior of system calls.  Of course, we'd
> want to make sure (via fluidLet) to set the flag correctly in the
> places in the basis library that assume the signal is restarted.  With
> the flag you can use A, which is simpler, most of the time, and use B
> when you need it.

That seems reasonable.
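
For concreteness, the flag plus fluidLet could look roughly like this.  It
is a sketch under assumptions: the flag name restart and this particular
fluidLet signature are made up for illustration, and the real one in the
basis implementation may differ.

```sml
(* Hypothetical flag: true means interrupted system calls are restarted. *)
val restart: bool ref = ref true

(* Set r to x while running th, then restore the old value,
 * whether th returns normally or raises. *)
fun fluidLet (r: 'a ref, x: 'a, th: unit -> 'b): 'b =
   let
      val old = !r
      val () = r := x
      val res = th () handle e => (r := old; raise e)
   in
      r := old; res
   end

(* Usage: run one computation with restarting turned off. *)
val seen = fluidLet (restart, false, fn () => !restart)
```

Basis code that assumes restarting would wrap its calls the other way
around, with fluidLet (restart, true, ...).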

> > What I'm worried about is a system call that needs this kind of pre- or
> > post- processing that can error with EINTR.  (I don't know if we have any,
> > but it seems likely.)
>
> It does seem likely.  But it seems like the critical section works
> fine in that case.  It will cause the system call to be restarted
> until it finishes.

I'd actually suggest that if the system call returns an error, then the
critical section is left, so that the alrm signal gets a chance to run
before restarting the system call.  Of course, this assumes a few things
about when the limit checks get inserted.  Namely, that if I have a CFG
like:

L1: ...
    atomicBegin ()
    ...
    IF _ THEN L2 ELSE L3

L2: ...
    atomicEnd ()
    ...
    GOTO L1

Then I'm assuming that the limit check gets inserted at L1, and not L2
(even though L2 breaks the loop).  Maybe that's fair.  An alternative
would be to add a primitive that enters the runtime (and may switch
threads) which invokes the signal handler (respecting canHandle).  I can
imagine situations where I'd like an atomicEnd that not only left the
critical region, but immediately invoked the signal handler if a signal
were pending.
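
Here is a toy model of that last idea, an atomicEnd that runs pending
handlers on the way out.  Every name in it is a hypothetical stand-in; it
says nothing about how the runtime really tracks pending signals, and it
ignores thread switching entirely.

```sml
(* Hypothetical model, not MLton's runtime API. *)
val atomicDepth = ref 0                          (* > 0: inside a critical region *)
val pending: (unit -> unit) list ref = ref []    (* handlers of signals that arrived *)

fun atomicBegin () = atomicDepth := !atomicDepth + 1

(* Run queued handlers, oldest first, but only outside any critical region. *)
fun runPending () =
   if !atomicDepth = 0
      then
         let
            val hs = rev (!pending)
         in
            pending := []; List.app (fn h => h ()) hs
         end
   else ()

(* Leave the critical region and immediately handle pending signals. *)
fun atomicEndAndHandle () =
   (atomicDepth := !atomicDepth - 1; runPending ())

(* Usage: a signal "arrives" inside the region and is handled on exit. *)
val log = ref ([]: string list)
val () = atomicBegin ()
val () = pending := (fn () => log := "alrm" :: !log) :: !pending
val () = atomicEndAndHandle ()
```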

> > Because if you block all signals, you're beholden to unblock all signals
> > when the ML signal handler gets a chance to run.
>
> No.  You can save the mask and then restore it.
>
>  let
>     val m = Signal.Mask.getBlocked ()     (* not there yet *)
>     val () = Signal.Mask.block Signal.Mask.all
>  in
>     dynamicWind (fn () => call raiseSys,
>                  fn () => Signal.Mask.setBlocked m)
>  end

Well, sure, it's easy if you change the Signal interface. ;-)

It also doesn't seem to capture the property that I proposed, which was
that the signals were blocked until the signal handler got a chance to
run.  If the system call returns normally the next time around, then I
still might reach the Signal.Mask.setBlocked before the signal handler is
invoked.  By a similar argument, though, we might reach a user's
Signal.Mask.block, which blocks the signal that aborted the system call
(and that we blocked), _before_ invoking the ML signal handler, which would
unblock the signal.  Also, note that if the code above appears outside a
critical section, then we might block all signals, loop on the call, invoke
the signal handler, and switch threads, leaving all signals blocked until
(if ever) we return to this thread and finish the dynamicWind and its
setBlocked.

So, I think the right thing to do is

val Posix.Error.syscall: (unit -> int) -> int =
   fn f =>
   let
      (* call and err are mutually recursive, hence "fun ... and ...". *)
      fun call (err: int -> int): int =
         let
            val n = f ()
         in
            if n = ~1 then err (getErrno ()) else n
         end
      and err (e: int): int =
         if !Signal.restart andalso (e = intr orelse e = restart)
            then if canHandle > 0
                    then
                    (* In a critical region: block all signals, restart
                     * once more, and raise if the call fails again. *)
                    let
                       val m = Signal.Mask.getBlocked ()
                       val () = Signal.Mask.block Signal.Mask.all
                    in
                       dynamicWind (fn () => call raiseSys,
                                    fn () => Signal.Mask.setBlocked m)
                    end
                 (* Outside a critical region: restart, signals unblocked. *)
                 else call err
         else raiseSys e
   in
      call err
   end

Essentially, we want to make progress.  The system call returning without
error is progress.  Raising the SysErr exception is progress.  If we are
outside a critical region, then handling the signal is progress, and we
don't want to restart the system call with signals blocked, because we
might return to this thread long after the system call was interrupted.
If we are inside a critical region, then we need the system call to
complete to make progress, so we block signals and restart the system
call.  Since we are inside a critical region (and not executing any user
code), the set of signals blocked can't be changed until the dynamic wind
returns, restoring the signals.
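
To make the control flow easy to try out, here is a self-contained model of
the wrapper above.  Everything in it is a stand-in chosen for illustration:
the "system call" returns its errno alongside its result, a bool ref models
the blocked-signal mask, an int ref models canHandle, and 4 is used for EINTR
(its value on Linux).

```sml
(* Hypothetical model of the syscall wrapper; not MLton's actual code. *)
exception SysErr of int
val eintr = 4                    (* stand-in for EINTR *)
val restartFlag = ref true       (* plays the role of Signal.restart *)
val atomicDepth = ref 0          (* plays the role of canHandle *)
val blocked = ref false          (* models "all signals blocked" *)

fun raiseSys (e: int): int = raise SysErr e

(* f returns (result, errno); a result of ~1 signals an error. *)
fun syscall (f: unit -> int * int): int =
   let
      fun call (err: int -> int): int =
         let
            val (n, errno) = f ()
         in
            if n = ~1 then err errno else n
         end
      and err (e: int): int =
         if !restartFlag andalso e = eintr
            then if !atomicDepth > 0
                    then
                    (* Critical region: block, restart once, restore mask. *)
                    let
                       val saved = !blocked
                       val () = blocked := true
                       val r = call raiseSys
                               handle x => (blocked := saved; raise x)
                    in
                       blocked := saved; r
                    end
                 else call err   (* restart with signals left unblocked *)
         else raiseSys e
   in
      call err
   end

(* Usage: a fake call that is interrupted twice, then returns 7. *)
val failures = ref 2
fun fake () =
   if !failures > 0
      then (failures := !failures - 1; (~1, eintr))
   else (7, 0)
val result = syscall fake
```

Outside a critical region (atomicDepth = 0) the fake call is simply restarted
until it succeeds, so result is 7 and the mask is never touched.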