[MLton] Re: exene example

Matthew Fluet fluet@cs.cornell.edu
Thu, 1 Sep 2005 09:33:54 -0400 (EDT)


>> Running with strace yields:
>> 
>> [fluet@localhost triangle]$ strace ./sources
>> execve("./sources", ["./sources"], [/* 49 vars */]) = 0
>> [ Process PID=9908 runs in 32 bit mode. ]
>> brk(0)                                  = 0x8206000
>> ...
>> write(1, "eXeneDebugLuke: making screens\n", 31eXeneDebugLuke: making screens
>> ) = 31
>> --- SIGALRM (Alarm clock) @ 0 (0) ---
>> sigreturn()                             = 31
>> rt_sigprocmask(SIG_BLOCK, [], [], 8)    = 0
>> rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
>> rt_sigprocmask(SIG_SETMASK, [], [ALRM], 8) = 0
>> write(1, "exenedebugphil:xio.getMsg start\n", 32exenedebugphil:xio.getMsg start
>> ) = 32
>> write(1, "exenedebugphil:xio.readVec start"..., 
>> 33exenedebugphil:xio.readVec start
>> ) = 33
>> write(1, "exenedebugphil:xio.readVec befor"..., 
>> 41exenedebugphil:xio.readVec before socket
>> ) = 41
>> write(1, "exenedebugphil:xio.readVec n=32\n", 32exenedebugphil:xio.readVec n=32
>> ) = 32
>> write(1, "exenedebugphil: check about to e"..., 36exenedebugphil: check about to exec
>> ) = 36
>> socketcall(0xa, 0xffffd748)             = ? ERESTARTSYS (To be restarted)
>> --- SIGALRM (Alarm clock) @ 0 (0) ---
>> sigreturn()                             = -1 EINTR (Interrupted system call)
>> rt_sigprocmask(SIG_BLOCK, [], [], 8)    = 0
>> rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
>> socketcall(0xa, 0xffffd748
>> 
>> At this point it just seems to hang again.
>> 
>> I'm not quite sure whether or not this is the right behavior.  I would 
>> think that the socket call should succeed in some fashion, but it doesn't 
>> appear to be making any progress.
>> 
>
> For debugging purposes we set the time slice to 20 seconds so that the thread 
> interleavings would not be so great, slightly easier to see what is going on 
> since eXene does use a decent amount of threads.  So if the socket call 
> blocks, it will wait for a few seconds (or 20) before context switching, 
> which is probably the behavior you are seeing.

Unlikely.  From the strace, you can see that the socket call is 
interrupted, and then restarted with signals blocked, since the system 
call is being made in a critical section.  So, until the socket receives 
some data, the program won't make any progress.

Now, if the sender of data on that socket is another CML/eXene thread, 
then you're stuck, since no other thread will run until you leave the 
critical section.

Remember: the SML/NJ implementation of CML reimplements much of the IO 
sub-system of the Basis Library to be both thread safe and to play nicely 
with thread switching.  In particular, the CML version of Socket.recvVec 
is implemented as follows:

     fun inEvt (CMLSock{sock, ...}) =
       CML.wrap(IOManager.ioEvt (OS.IO.pollIn (pollD sock)), ignore)

     fun recvVec (s as PS.CMLSock{sock, ...}, n) =
       case Socket.recvVecNB (sock, n)
        of (SOME res) => res
         | NONE => (CML.sync(inEvt s); Socket.recvVec (sock, n))

So, you can see that behind the scenes, it attempts a non-blocking 
receive; if that succeeds, it simply returns that data.  If the call would 
block, then it registers the socket with the IOManager, which takes care 
of polling io descriptors and resuming a thread once the io descriptor's 
state has changed (e.g., data has appeared on the socket).  It is the 
CML.sync on the (inEvt s) that makes the recvVec yield to other threads 
until such time as the socket has data.  Note, even though recvVec may be 
considered a blocking form of read (since the call won't return until 
there is actual data on the socket), other threads get a chance to run 
because it never performs a blocking system call unless it is certain that 
the call will not actually block.

The situation is entirely different in MLton (at the current time). 
There is no $(SML_LIB)/cml/basis.mlb corresponding to a thread safe / 
non-blocking implementation of the Basis Library.  It is on the TODO 
list, but we would also need to implement the IOManager and related 
management facilities.  I don't believe that portion of CML uses much 
compiler-specific technology, so one might be able to port much of 
SML/NJ's CML code for this purpose.  (Though, I seem to recall that the 
last time I looked at it, I thought it could be cleaned up some.)

That being said, you might be able to implement your own busy-wait socket 
library on top of the Socket structure from the Basis Library.  For 
example, the simplest implementation would be:

   fun recvVec (sock, n) =
     case Socket.recvVecNB (sock, n) of
       SOME res => res
     | NONE => recvVec (sock, n)

However, this isn't very efficient: it spins on the non-blocking call 
without ever explicitly yielding to other threads.  A better 
implementation would be:

   fun recvVec (sock, n) =
     case Socket.recvVecNB (sock, n) of
       SOME res => res
     | NONE => (CML.sync(CML.timeOutEvt(Time.fromMilliseconds 30))
                ; recvVec (sock, n))

This explicitly yields for a short amount of time, hopefully long enough 
so that data appears on the socket.
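
As a rough sketch of the same idea, the retry loop can be factored out of 
the socket code so that the yielding strategy is a parameter.  Note that 
retryWith, yield, tryOnce, and fakeRecv are names made up here for 
illustration; they are not part of CML or the Basis Library.  The stand-in 
"socket" below only exists so that the sketch runs without CML or a real 
connection.

```sml
(* Hypothetical helper, not part of CML or the Basis Library:
   retry a non-blocking operation, calling `yield` between attempts. *)
fun retryWith (yield : unit -> unit) (tryOnce : unit -> 'a option) : 'a =
  case tryOnce () of
    SOME res => res
  | NONE => (yield (); retryWith yield tryOnce)

(* Stand-in for a socket that has data only on the third attempt,
   so the sketch runs without CML or a real connection. *)
val attempts = ref 0
fun fakeRecv () =
  (attempts := !attempts + 1;
   if !attempts < 3 then NONE else SOME "data")

(* Count yields instead of actually sleeping. *)
val yields = ref 0
val result = retryWith (fn () => yields := !yields + 1) fakeRecv
```

With CML loaded, the recvVec above would presumably be written as

   fun recvVec (sock, n) =
     retryWith (fn () => CML.sync(CML.timeOutEvt(Time.fromMilliseconds 30)))
               (fn () => Socket.recvVecNB (sock, n))

though I have not tested that form against MLton's CML.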