[MLton] MacOS time profiling

Matthew Fluet fluet at tti-c.org
Mon Jul 30 13:38:17 PDT 2007


I was noticing very odd results with time profiling on 
a dual-core/dual-proc Intel MacPro machine:

fenrir:~/devel/mlton/temp/trunk fluet$ /usr/bin/time ./output1 
136.14 real       133.64 user         2.47 sys
fenrir:~/devel/mlton/temp/trunk fluet$ mlprof output1 mlmon.out
4.96 seconds of CPU time (0.00 seconds GC)
            function              cur
------------------------------- -----
StreamIOExtra.output1           35.9%
Primitive.Exn.wrapOverflow.fn   25.2%
Main.once.loop                  16.3%
Sequence.Slice.full              5.6%
UnsignedIntegralComparisons.ltu  5.2%
PosixIO.make.writeArr.fn         4.6%
Main.once                        3.4%
Primitive.Bool.not               2.4%
Sequence.updateMk                1.2%

Note that mlprof is only reporting 4.96 seconds of CPU time, while 
/usr/bin/time is reporting 136.11 seconds of CPU time.

I instrumented the profiling code to print out the current time (via 
gettimeofday) at each invocation of the SIG_PROF handler.  What's odd is 
that I see things like:

GC_handleSigProf (0x000074d3)
tv_sec = 1185827275  tv_usec = 487777
GC_handleSigProf (0x000071a4)
tv_sec = 1185827275  tv_usec = 497771
GC_handleSigProf (0x000074c0)
tv_sec = 1185827275  tv_usec = 507767
GC_handleSigProf (0x00007535)
tv_sec = 1185827275  tv_usec = 517772
GC_handleSigProf (0x000071a2)
tv_sec = 1185827285  tv_usec = 887788
GC_handleSigProf (0x0000719f)
tv_sec = 1185827285  tv_usec = 897771
GC_handleSigProf (0x000074d9)
tv_sec = 1185827285  tv_usec = 907771
GC_handleSigProf (0x0000727d)
tv_sec = 1185827285  tv_usec = 917771

The signal handler is triggered nicely at the requested 10000us interval 
for a while, but then we miss *10 seconds* worth of signals.  And this 
happens a number of times through the course of the execution.

The closest thing to an explaination that I could find was the following:

   library(timeout) does not work reliably when Mac OS X runs on a
   multi-CPU machine. In particular, timeouts tend to happen much later
   than they should. This is caused by an OS bug. One workaround is to
   disable all but one CPU using the "Processor" control in the "System
   Preferences" or the hwprefs command. These utilities are part of "CHUD"
   which can be installed as part of Apple XCode. The underlying bug is
   related to setitimer(ITIMER_VIRTUAL) and has been observed at least up
   to Mac OS X 10.4.8 (Darwin 8.8.1).

   http://www.sics.se/sicstus/docs/4.0.1/html/relnotes/Platform-specific-UNIX-notes.html

Indeed, the effect seems much less pronounced when I disable all but one 
CPU.

However, I could not find a more "official" statement of the bug. 
Searches on the Mac website just turn up the setitimer manual page.



More information about the MLton mailing list