[MLton] interrupted system call

Wed, 24 Mar 2004 10:54:08 -0800

> O.k., so that suggests something like:
> 
> Posix.Error.restart : ('a -> 'b) -> 'a -> 'b
> 
> fun restart f x =
>   (f x) handle (exn as SysErr (_, SOME serr)) =>
>         if serr = intr then restart f x else raise exn
>
> (Or maybe something more primitive that directly checks the error, rather
> than wrapping something that raises SysErr.)

Yeah, this sounds a little better to me.  Maybe we can even fold the
restart into the code that raises SysErr.  Something like

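(* Intended to live at Posix.Error.syscall.  getErrno, intr, restart and
 * raiseSys are assumed to come from PosixPrimitive and Posix.Error. *)
val syscall: (unit -> int) -> unit -> int =
   fn f =>
   let
      (* Call f until it succeeds or fails with something other than
       * intr or restart. *)
      fun loop () =
         let
            val n = f ()
         in
            if n <> ~1
               then n
            else
               let
                  val e = getErrno ()
               in
                  if e = intr orelse e = restart
                     then loop ()
                  else raiseSys e
               end
         end
   in
      loop
   end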

I think we need to loop on restart (which we would need to add to
PosixPrimitive) as well as intr, because some system calls fail with
that error instead.
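
The loop can be tried outside the basis with a small simulation.  This
is only a sketch under stated assumptions: the errno datatype, the errno
ref, getErrno, raiseSys and fakeRead below are all hypothetical stand-ins
for the real PosixPrimitive pieces, not the actual implementation.

```sml
(* Stand-ins for the runtime; all hypothetical, not the real PosixPrimitive. *)
datatype errno = EINTR | ERESTART | EBADF
exception SysErr of errno

val errno = ref EBADF
fun getErrno () = !errno
fun raiseSys e = raise SysErr e

(* Same shape as the proposed syscall: retry on EINTR or ERESTART,
 * raise on any other error. *)
val syscall: (unit -> int) -> unit -> int =
   fn f =>
   let
      fun loop () =
         let
            val n = f ()
         in
            if n <> ~1
               then n
            else
               case getErrno () of
                  EINTR => loop ()
                | ERESTART => loop ()
                | e => raiseSys e
         end
   in
      loop
   end

(* A fake call that is interrupted twice and then succeeds. *)
val attempts = ref 0
fun fakeRead () =
   (attempts := !attempts + 1;
    if !attempts <= 2 then (errno := EINTR; ~1) else 42)

val result = syscall fakeRead ()
```

Here result is 42 and fakeRead has been called three times; the caller
never sees the two interrupted attempts.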

> In any case, since there is a loop in the CFG, the backend should put in
> the limit-check that gets triggered when signals are being handled.

Ahh.  I hadn't thought of that.  I was worried that we would have to
put in our own SML code to invoke the signal handler.  But this looks
much better.

> On the other hand, if signals aren't being handled in ML, then what
> happens to the signal?  I guess the process is terminated?

If the signal isn't handled in ML, then the handler must either be
SIG_DFL (the default action, which for most signals terminates the
process) or SIG_IGN (ignore the signal).

> Another issue is critical regions.  If the system call is buried down in
> the basis, and I try to wrap a higher-level function in atomicBegin/End,
> then the ML signal handler won't get run in the restart loop.  For some
> signals, maybe this is o.k. (keep making progress until we leave the
> critical region), but for the interval timer, this is bad.  Because if the
> system call is taking an appreciable amount of time, then the signal is
> likely to be raised again when the call is restarted.

To be sure I understand, is the problem that the system call may never
complete because it is always interrupted by the signal?

> Maybe the right thing is to put a canHandle check in the restart loop, and
> if we are in a critical region, then just let the exception propagate.
>
> Another option might be to have the C signal handler block signals until
> the ML signal handler gets a chance to run.

While both of these make sense, I'm not yet convinced that we should
do anything to help a programmer who puts a slow system call in a
critical section.  If they want to prevent signals during the call,
they can always block them before making the call.  Hmm, that makes me
think that another option is to always block signals in a critical
section, i.e. make atomicBegin block signals and atomicEnd unblock
them.  But again, the programmer has that capability now, and I am
worried about imposing policy based on a very limited understanding
and very few examples.  So maybe it's better to leave things alone.
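
For reference, blocking around a single slow call is a small wrapper.
The sketch below is portable SML and assumes the caller supplies the
block and unblock actions; the name withBlocked and the record of
actions are hypothetical.  In MLton they would presumably be built on
MLton.Signal.Mask, but that wiring is an assumption and is not shown.

```sml
(* Run f x with signals blocked, and unblock again whether f returns
 * or raises.  block and unblock are supplied by the caller. *)
fun withBlocked {block: unit -> unit, unblock: unit -> unit} f x =
   let
      val () = block ()
      val result = f x handle e => (unblock (); raise e)
   in
      unblock ();
      result
   end

(* Usage with counters standing in for a real signal mask. *)
val depth = ref 0
val mask = {block = fn () => depth := !depth + 1,
            unblock = fn () => depth := !depth - 1}
val seen = withBlocked mask (fn x => (!depth, x + 1)) 1
```

The unblock runs on both the normal and the exceptional path, so an
exception from the call does not leave signals blocked.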

Putting the canHandle check in the restart loop makes me nervous.  I
am worried that it makes code harder to write, because any code that
makes system calls must then always handle the intr exception, since
it may be called from within a critical section.  That seems to
obviate most of the benefit of automatic restarting.

> If it isn't too expensive (it very likely is), the notion of blocking
> signals until the ML signal handler runs certainly sounds like it would
> be the nicest choice.

Henry, what do you mean by expensive?