[MLton] switching to handler caught threads

Matthew Fluet fluet@cs.cornell.edu
Wed, 31 Mar 2004 21:23:43 -0500 (EST)


I've tracked down a problem to an interesting (and I presume buggy)
interaction between the threads caught by the signal handler and other
threads.  Note that the thread the ML signal handler gets its hands on
is built like:

fun fromPrimitive (t: Prim.thread): unit t =
   T (ref (Paused
           (fn f => ((atomicEnd (); f ())
                     handle _ =>
                        die "Asynchronous exceptions are not allowed.\n"),
            t)))

I don't understand the purpose of the atomicEnd().  Recall the definition
of switching:

   fun ('a, 'b) atomicSwitch' (f: 'a t -> 'b t * (unit -> 'b)): 'a =
      if !switching
         then (atomicEnd ()
               ; raise Fail "nested Thread.switch")
      else
         let
            val _ = switching := true
            val r : (unit -> 'a) ref =
               ref (fn () => die "Thread.atomicSwitch' didn't set r.\n")
            val t: 'a thread ref =
               ref (Paused (fn x => r := x, Prim.current ()))
            fun fail e = (t := Dead
                          ; switching := false
                          ; atomicEnd ()
                          ; raise e)
            val (T t': 'b t, x: unit -> 'b) = f (T t) handle e => fail e
            val primThread =
               case !t' before (t' := Dead; switching := false) of
                  Dead => fail (Fail "switch to a Dead thread")
                | New g => newThread (g o x)
                | Paused (f, t) => (f x; t)
            val _ = Prim.switchTo primThread
            val _ = atomicEnd ()
         in
            !r ()
         end

We assume that this function is called with canHandle > 0.  In particular,
it should be callable with canHandle == 1.  Now, suppose the signal
handler has caught a thread (built using fromPrimitive) and pushes it on a
ready queue.  Later, we decide to switch to that thread.  If we enter
atomicSwitch' with canHandle == 1, then as we evaluate the expression for
primThread, we're still at canHandle == 1.  But, the atomicEnd prepended
by fromPrimitive gets run under the Paused branch, dropping us to
canHandle == 0.  Now, the signal handler gets a chance to run (although it
shouldn't be running here), before we switch to primThread and before we
leave the atomicEnd.  Now, we're in a totally screwy state.  In
particular, this atomicSwitch' and the signal handler both have a
hold of the current thread, which is bad.
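
To make the sequence concrete, here is a toy model of the scenario
described above.  It is only a sketch: canHandle, atomicBegin, atomicEnd,
pausedFn, switchToCaught and the trace list are stand-ins written for this
illustration, not the real runtime, and the model assumes a pending signal
is delivered as soon as canHandle reaches 0.

```sml
(* Toy model only; none of these definitions are MLton's own. *)
val canHandle = ref 0
val trace : string list ref = ref []
fun log s = trace := s :: !trace

fun atomicBegin () = canHandle := !canHandle + 1

(* Assumption: a pending signal may be handled once canHandle hits 0. *)
fun atomicEnd () =
   (canHandle := !canHandle - 1
    ; if !canHandle = 0
         then log "signal handler may run"
      else ())

(* Same shape as the function that fromPrimitive stores in Paused. *)
fun pausedFn (f: unit -> unit) = (atomicEnd (); f ())

(* Value of canHandle observed at the point of the (simulated) switch. *)
val canHandleAtSwitch = ref ~1

(* Same shape as the Paused branch of atomicSwitch', up to the switch:
   Paused (f, t) => (f x; t), followed by Prim.switchTo. *)
fun switchToCaught () =
   (pausedFn (fn () => ())
    ; canHandleAtSwitch := !canHandle
    ; log "Prim.switchTo")

val () = atomicBegin ()          (* enter with canHandle == 1 *)
val () = switchToCaught ()
val events = rev (!trace)
```

In this model the "signal handler may run" event is recorded before
"Prim.switchTo", and canHandle is already 0 at the switch, which is the
window described above.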