[MLton-devel] cvs commit: profiler improvements

Stephen Weeks sweeks@users.sourceforge.net
Mon, 07 Jul 2003 15:50:29 -0700


sweeks      03/07/07 15:50:29

  Modified:    doc      changelog
               doc/examples/profiling Makefile list-rev.sml tak.sml
               doc/user-guide profiling.tex
               include  main.h
               man      mlprof.1
               mlprof   main.sml mlprof-stubs.cm mlprof.cm
               mlton    mlton-stubs-1997.cm mlton-stubs.cm mlton.cm
               mlton/backend backend.fun machine.fun machine.sig
                        profile.fun profile.sig
               mlton/codegen/c-codegen c-codegen.fun
               mlton/codegen/x86-codegen x86-codegen.fun
               mlton/control control.sig control.sml
               mlton/main main.sml
               runtime  gc.c gc.h
               runtime/basis/MLton profile.c
  Log:
  Added improvements discussed with Joe Hurd back in April.  The idea is
  to shift -profile-split from compile time to mlprof time to allow much
  faster user experimentation with splits and to create more precise
  call graph information.  This is done by splitting everything at
  compile time and keeping two sets of profiling data at run time, one
  for the master versions of functions and one for the split copies.
  Then, mlprof uses the split copies to build a precise call graph, and
  merges nodes at the last possible minute, after the -keep expression
  has been evaluated.
  
  More specifically:
     Eliminated mlton -profile-split
     Added mlprof -split
     Changed mlprof -graph to -keep
     Eliminated mlprof -ignore
  
  Now, the behavior of mlprof -keep is
  1. Build the precise call graph using all the split copies of
     functions.  Functions with only one version have one split copy.
  2. Evaluate the set of nodes corresponding to -keep.  The only
     trickiness is in evaluating thresh:
  
        (thresh x) = { f | f is -split and the ticks for this split
  				version of f >= x }
                     U { f | f is not -split and the ticks for the
  				unsplit version of f >= x }
     The rule for thresh-gc and thresh-stack is similar
  3. For functions that are not -split, merge the split nodes.
  4. Add a dotted edge from A to B if there is a path in the precise
     call graph from A to B going through nodes that don't correspond to
     a node in the merged graph.
  
  Changed the behavior of -keep so that it controls which functions are
  displayed in the table too.
  
  Changed the behavior of -thresh x, which is now an abbreviation for
  -keep '(thresh x)'.
  
  Added some examples of lexical nesting to the profiling section of the
  user guide.
  
  Updated the mlprof man page.

Revision  Changes    Path
1.51      +11 -0     mlton/doc/changelog

Index: changelog
===================================================================
RCS file: /cvsroot/mlton/mlton/doc/changelog,v
retrieving revision 1.50
retrieving revision 1.51
diff -u -r1.50 -r1.51
--- changelog	7 Jul 2003 15:23:44 -0000	1.50
+++ changelog	7 Jul 2003 22:50:27 -0000	1.51
@@ -3,6 +3,17 @@
 At this point, the only missing basis library function is "use".
 
 * 2003-07-07
+  - Profiling improvements
+    o Eliminated mlton -profile-split.  Added mlprof -split.  Now the
+      profiling infrastructure keeps track of the splits and allows
+      one to decide which splits to make (if any) when mlprof is run,
+      which is much better than having to decide at compile time.
+    o Changed mlprof -graph to mlprof -keep, and changed the behavior
+      so that -keep also controls which functions are displayed in the
+      table.
+    o Eliminated mlprof -ignore: it's behavior is now subsumed by
+      -keep, whose meaning has changed to be more like -ignore on
+      nodes that are not kept.
   - When calling gcc for linking, put -link args in same order as they
     appeared on the MLton command line (they used to be reversed).
 



1.14      +2 -2      mlton/doc/examples/profiling/Makefile

Index: Makefile
===================================================================
RCS file: /cvsroot/mlton/mlton/doc/examples/profiling/Makefile,v
retrieving revision 1.13
retrieving revision 1.14
diff -u -r1.13 -r1.14
--- Makefile	2 Jun 2003 23:36:28 -0000	1.13
+++ Makefile	7 Jul 2003 22:50:27 -0000	1.14
@@ -15,7 +15,7 @@
 profile-alloc:
 	$(mlton) -profile alloc $(ALLOC_EX).sml
 	./$(ALLOC_EX)
-	$(mlprof) -show-line true -thresh 1 $(ALLOC_EX) mlmon.out
+	$(mlprof) -show-line true $(ALLOC_EX) mlmon.out
 
 .PHONE: profile-multiple
 profile-multiple:
@@ -29,7 +29,7 @@
 profile-stack:
 	$(mlton) -profile alloc -profile-stack true $(ALLOC_EX).sml
 	./$(ALLOC_EX)
-	$(mlprof) -show-line true -thresh 1 $(ALLOC_EX) mlmon.out
+	$(mlprof) -show-line true $(ALLOC_EX) mlmon.out
 	dot $(ALLOC_EX).dot >$(ALLOC_EX).ps
 
 .PHONY: profile-time



1.2       +1 -1      mlton/doc/examples/profiling/list-rev.sml

Index: list-rev.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/doc/examples/profiling/list-rev.sml,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- list-rev.sml	13 Jan 2003 07:53:44 -0000	1.1
+++ list-rev.sml	7 Jul 2003 22:50:27 -0000	1.2
@@ -8,5 +8,5 @@
       [] => []
     | x :: l => append (rev l, [x])
 
-val l = List.tabulate (100, fn i => i)
+val l = List.tabulate (1000, fn i => i)
 val _ = 1 + hd (rev l)



1.3       +21 -15    mlton/doc/examples/profiling/tak.sml

Index: tak.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/doc/examples/profiling/tak.sml,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- tak.sml	14 Jan 2003 06:37:21 -0000	1.2
+++ tak.sml	7 Jul 2003 22:50:27 -0000	1.3
@@ -1,20 +1,26 @@
-fun tak1 (x, y, z) =
-   if y >= x
-      then z
-   else
-      tak1 (tak2 (x - 1, y, z),
-	    tak2 (y - 1, z, x),
-	    tak2 (z - 1, x, y))
-and tak2 (x, y, z) =
-   if y >= x
-      then z
-   else
-      tak1 (tak2 (x - 1, y, z),
-	    tak2 (y - 1, z, x),
-	    tak2 (z - 1, x, y))
+structure Tak =
+   struct
+      fun tak1 (x, y, z) =
+	 let
+	    fun tak2 (x, y, z) =
+	       if y >= x
+		  then z
+	       else
+		  tak1 (tak2 (x - 1, y, z),
+			tak2 (y - 1, z, x),
+			tak2 (z - 1, x, y))
+	 in
+	    if y >= x
+	       then z
+	    else
+	       tak1 (tak2 (x - 1, y, z),
+		     tak2 (y - 1, z, x),
+		     tak2 (z - 1, x, y))
+	 end
+   end
 
 val rec f =
    fn 0 => ()
-    | n => (tak1 (18, 12, 6) ; f (n-1))
+    | n => (Tak.tak1 (18, 12, 6) ; f (n-1))
 
 val _ = f 5000



1.28      +109 -143  mlton/doc/user-guide/profiling.tex

Index: profiling.tex
===================================================================
RCS file: /cvsroot/mlton/mlton/doc/user-guide/profiling.tex,v
retrieving revision 1.27
retrieving revision 1.28
diff -u -r1.27 -r1.28
--- profiling.tex	23 Jun 2003 00:22:09 -0000	1.27
+++ profiling.tex	7 Jul 2003 22:50:27 -0000	1.28
@@ -9,7 +9,7 @@
 When the program finishes, it automatically writes the counts to an
 {\tt mlmon.out} file.  You can then run {\tt mlprof} on the executable
 and the {\tt mlmon.out} file to see the percentage of the total
-(allocation or time) spent in each functions.
+(allocation or time) spent in each function.
 
 Here is an example of time profiling, run from within the {\tt
 examples/profiling} directory.
@@ -17,41 +17,41 @@
 % mlton -profile time tak.sml
 % ./tak
 % mlprof tak mlmon.out
-6.36 seconds of CPU time (0.0 seconds GC)
-function  cur 
--------- -----
-tak2     77.2%
-tak1     22.8%
-<gc>      0.0%
+6.00 seconds of CPU time (0.00 seconds GC)
+function     cur 
+------------- -----
+Tak.tak1.tak2 75.8%
+Tak.tak1      24.2%
 \end{verbatim}
 This is a contrived example with two mutually recursive copies of the
-{\tt tak} function.  The profiling shows us that roughly
-three-quarters of the time is spent in the {\tt tak2} function, while
-the rest is spent in {\tt tak1}.  There is a negligible amount of time
-spent in gc.
+{\tt tak} function, defined within the {\tt Tak} structure.  This
+examples shows how {\tt mlprof} indicates lexical nesting via a
+sequence of period-separated names indicating the structures and
+functions in which a function definition is nested.  The profiling
+data shows us that roughly three-quarters of the time is spent in the
+{\tt Tak.tak1.tak2} function, while the rest is spent in {\tt
+Tak.tak1}.
 
 You can display raw counts in addition to percentages with {\tt -raw
 true}.
 \begin{verbatim}
 % mlprof -raw true tak mlmon.out
-6.36 seconds of CPU time (0.0 seconds GC)
-function  cur    raw  
--------- ----- -------
-tak2     77.2% (4.91s)
-tak1     22.8% (1.45s)
-<gc>      0.0%  (0.0s)
+6.00 seconds of CPU time (0.00 seconds GC)
+  function     cur    raw  
+------------- ----- -------
+Tak.tak1.tak2 75.8% (4.55s)
+Tak.tak1      24.2% (1.45s)
 \end{verbatim}
 
 You can display the filename and line numbers for functions in addition
 to their names with {\tt -show-line true}.
 \begin{verbatim}
 % mlprof -show-line true tak mlmon.out
-6.36 seconds of CPU time (0.0 seconds GC)
-   function      cur 
---------------- -----
-tak2 tak.sml: 8 77.2%
-tak1 tak.sml: 1 22.8%
-<gc>             0.0%
+6.00 seconds of CPU time (0.00 seconds GC)
+        function           cur 
+------------------------- -----
+Tak.tak1.tak2  tak.sml: 5 75.8%
+Tak.tak1  tak.sml: 3      24.2%
 \end{verbatim}
 
 Allocation profiling is very similar to time profiling.  Here is an
@@ -60,42 +60,27 @@
 \begin{verbatim}
 % mlton -profile alloc list-rev.sml
 % ./list-rev
-% mlprof -show-line true -thresh 1 list-rev mlmon.out
-63,144 bytes allocated (4,632 bytes by GC)
+% mlprof -show-line true list-rev mlmon.out
+6,030,136 bytes allocated (108,336 bytes by GC)
        function          cur 
 ----------------------- -----
-append  list-rev.sml: 1 87.6%
-<gc>                     6.8%
-<main>                   3.8%
-rev  list-rev.sml: 6     1.8%
+append  list-rev.sml: 1 97.6%
+<gc>                     1.8%
+<main>                   0.4%
+rev  list-rev.sml: 6     0.2%
 \end{verbatim}
 
-The data shows that most of the allocation is done by the append
-function, defined on line 1 of {\tt list-rev.sml}.  The allocation by
+The data shows that most of the allocation is done by the {\tt append}
+function, defined on line 1 of {\tt list-rev.sml}.  The table also
+shows how special functions like {\tt gc} and {\tt main} are handled:
+they are printed with surrounding brackets.  C functions are displayed
+similarly.  Finally, as is usually the case, the allocation done by
 the garbage collector is due to it growing the stack.
-%Basis library
-%functions are displayed with the {\tt <basis>} prefix.
-The example
-also shows how to filter out functions below a certain percentage with
-{\tt -thresh}.
 
 Time profiling typically has a very small performance impact.
 However, the performance impact of allocation profiling is noticeable,
 because it inserts additional C calls for object allocation.
 
-To make it easier to identify a function, {\tt mlprof} shows lexical
-nesting via a sequence of period-separated names indicating the
-structures and functions in which the function definition is nested.
-In all of the above examples, the functions were defined at the top
-level and so had a single name.  As an example of nesting, {\tt g} in
-the following code would appear as {\tt S.f.g}.
-\begin{verbatim}
-structure S =
-   struct
-       fun f = ... fun g ...
-   end
-\end{verbatim}
-
 \subsection{Profiling the stack}
 
 For both allocation and time profiling, you can use {\tt
@@ -105,21 +90,20 @@
 \begin{verbatim}
 % mlton -profile alloc -profile-stack true list-rev.sml
 % ./list-rev
-% mlprof -show-line true -thresh 1 list-rev mlmon.out
-63,144 bytes allocated (4,632 bytes by GC)
+% mlprof -show-line true list-rev mlmon.out
+6,030,136 bytes allocated (108,336 bytes by GC)
        function          cur  stack  GC 
 ----------------------- ----- ----- ----
-append  list-rev.sml: 1 87.6% 87.6% 3.1%
-<main>                   3.8% 93.2% 6.0%
-rev  list-rev.sml: 6     1.8% 87.7% 5.8%
+append  list-rev.sml: 1 97.6% 97.6% 1.4%
+<gc>                     1.8%  0.0% 1.8%
+<main>                   0.4% 98.2% 1.8%
+rev  list-rev.sml: 6     0.2% 97.6% 1.8%
 \end{verbatim}
 
 In the above table, we see that {\tt rev}, defined on line 6 of {\tt
-list-rev.sml}, is on the stack while 87.6\% of the allocation is done
-by the user program and while 3.1\% of the allocation is done by the
-garbage collector.  The above table also shows how special functions
-like {\tt main} are handled: they are printed with surrounding
-brackets.  C functions are displayed similarly.
+list-rev.sml}, is on the stack while 97.6\% of the allocation is done
+by the user program and while 1.8\% of the allocation is done by the
+garbage collector. 
 
 The performance impact of {\tt -profile-stack true} can be noticeable
 since there is some extra bookkeeping at every nontail call.
@@ -127,25 +111,30 @@
 \subsection{Call graphs}
 
 For easier visualization of profiling data, {\tt mlprof} creates a
-call graph of the program in dot format.  The graph nodes contain the
-function name (and source position with {\tt -show-line true}), as
-well as the percentage of ticks.  You can create a postscript graph
-from the dot file using the
+call graph of the program in dot format, from which you can create a
+postscript graph using the
 \htmladdnormallink{{\tt graphviz}}
 		  {http://www.research.att.com/sw/tools/graphviz/}
-software package.
+software package.  For example, {\tt mlprof foo mlmon.out} will create
+{\tt foo.dot} with a complete call graph.  For each source function,
+there will be one node in the graph that contains the function name
+(and source position with {\tt -show-line true}), as well as the
+percentage of ticks.  If you want to create a call graph for your
+program without any profiling data, you can simply call {\tt mlprof}
+without any {\tt mlmon.out} files, e.g. {\tt mlprof foo}.
 
 Because SML has higher-order functions, the call graph is is dependent
-on {\mlton}'s analysis of where functions can be called.  This
+on {\mlton}'s analysis of which functions call each other.  This
 analysis depends on many implementation details and might display
 spurious edges that a human could conclude are impossible.  However,
 in practice, the call graphs tend to be very accurate.
 
-Because call graphs can get big, you may want to control what nodes
-appear in the graph.  For this, you can use the {\tt -graph} option to
-specify the nodes that you would like to see.  The argument to {\tt
--graph} is an expression that describes a set of nodes, taken from the
-following grammar.
+Because call graphs can get big, {\tt mlprof} provides the {\tt -keep}
+option to specify the functions that you would like to see.  This
+option also controls which functions appear in the table that {\tt
+mlprof} prints.  The argument to {\tt -keep} is an expression
+describing a set of source functions (i.e. graph nodes), taken from
+the following grammar.
 
 \begin{latexonly}
 \begin{center}
@@ -184,68 +173,45 @@
 \end{center}
 \end{htmlonly}
 
-In the grammar, {\tt all} denotes the set of all nodes.  In {\tt "{\it
-s}"}, {\it s} is a regular expression denoting the set of nodes whose
-function name matches {\it s}.  The regexp must match the entire
-function name, including the source position if you use {\tt
--show-line true}.  So, with {\tt -show-line false}, you would use {\tt
-"foo"}, but with {\tt -show-line true}, you would use {\tt "foo .*"}.
-
-The {\tt and}, {\tt not}, and {\tt or} expressions denote
-intersection, complement, and union, respectively.  The {\tt pred} and
-{\tt succ} expressions add the set of immediate predecessors or
-successors to their argument, respectively.  The {\tt from} and {\tt
-to} expressions denote the set of nodes that have paths from or to the
-set of nodes denoted by their arguments, respectively.  Finally, {\tt
-thresh}, {\tt thresh-gc}, and {\tt thresh-stack} denote the set of
-nodes whose percentage of ticks, gc ticks, or stack ticks,
-respectively, is greater than or equal to the real number {\it x}.
-
-For example, if you want to see the entire call-graph for a program,
-you can use {\tt -graph all}.  If you want to see all nodes reachable
-from function {\tt foo} in your program, you would use {\tt -graph
-'(from "foo")'}.  Or, if you want to see all the functions defined in
-subdirectory {\tt bar} of your project that used at least 1\% of the
-ticks, you would use {\tt -show-line true -graph '(and ".*/bar/.*"
-(thresh 1.0))'}.
-
-When compiling with {\tt -profile-stack false}, the default is {\tt
--graph '(to (thresh {\it x}))'} where {\it x} is the threshold.  When
-compiling with {\tt -profile-stack true}, the default is {\tt -graph
-'(thresh-stack {\it x})'} where {\it x} is the threshold.
+In the grammar, {\tt all} denotes the set of all nodes.  {\tt "{\it
+s}"} is a regular expression denoting the set of functions whose name
+has a prefix matchin the regexp.  The {\tt and}, {\tt not}, and {\tt
+or} expressions denote intersection, complement, and union,
+respectively.  The {\tt pred} and {\tt succ} expressions add the set
+of immediate predecessors or successors to their argument,
+respectively.  The {\tt from} and {\tt to} expressions denote the set
+of nodes that have paths from or to the set of nodes denoted by their
+arguments, respectively.  Finally, {\tt thresh}, {\tt thresh-gc}, and
+{\tt thresh-stack} denote the set of nodes whose percentage of ticks,
+gc ticks, or stack ticks, respectively, is greater than or equal to
+the real number {\it x}.
+
+For example, if you want to see the entire call graph for a program,
+you can use {\tt -keep all} (this is the default).  If you want to see
+all nodes reachable from function {\tt foo} in your program, you would
+use {\tt -keep '(from "foo")'}.  Or, if you want to see all the
+functions defined in subdirectory {\tt bar} of your project that used
+at least 1\% of the ticks, you would use {\tt -keep '(and ".*/bar/"
+(thresh 1.0))'}.  To see all functions with ticks above a threshold,
+you can also use {\tt -thresh x}, which is an abbreviation for {\tt
+-keep '(thresh x)'}.  You can not use multiple {\tt -keep} arguments
+or both {\tt -keep} and {\tt -thresh}.  When you use {\tt -keep} to
+display a subset of the functions, {\tt mlprof} will add dotted edges
+to the call graph to indicate a path in the original call graph from
+one function to another.
+
+When compiling with {\tt -profile-stack true}, you can use {\tt mlprof
+-gray true} to make the nodes darker or lighter depending on whether
+their stack percentage is higher or lower.
 
 {\mlton}'s optimizer may duplicate source functions for any of a
 number of reasons (functor duplication, monomorphisation,
-polyvariance, inlining).  By default, duplicates arising from functor
-duplication are treated as different functions, while duplicates
-arising from other optimizations are treated as the same function.  If
-you would like all the copies of a function to be treated as
-different, you can use {\tt -profile-split}.
-
-Another {\tt mlprof} option to improve the readability of call graphs
-is {\tt -ignore}, which takes a regexp specifying functions to ignore
-when creating the call graph.  For example, suppose you define a
-library function like {\tt o} (compose), and then define {\tt fun f x
-= (g o h) x}.  Then the call graph would contain edges from {\tt f} to
-{\tt o}, from {\tt o} to {\tt g}, and from {\tt o} to {\tt h}.  You
-might prefer to see only edges from {\tt f} to {\tt g} and from {\tt
-f} to {\tt h}.  To do this, you can pass {\tt -ignore 'o'} to {\tt
-mlprof}, which will cause it to remove {\tt o} and connect its
-predecessors and successors in the call graph.  Note that this is
-different than using {\tt -graph (not "o")}, which would only remove
-the node from the graph, leaving it disconnected.  Also, as with {\tt
--graph}, the regexp must match the entire function name, including the
-source position if you use {\tt -show-line true}.  You will also
-likely want to use {\tt -profile-split} to treat all copies of compose
-as different functions.
-
-Technically speaking, the graph is a call-stack graph rather than a
-call graph because it describes the set of possible call stacks.  The
-difference is in how tail calls are displayed.  For example if {\tt f}
-nontail calls {\tt g} and {\tt g} tail calls {\tt h}, then the
-call-stack graph has edges from {\tt f} to {\tt g} and {\tt f} to {\tt
-h}, while the call-graph has edges from {\tt f} to {\tt g} and {\tt g}
-to {\tt h}.
+polyvariance, inlining).  By default, all duplicates of a function are
+treated as one.  If you would like to treat the duplicates separately,
+you can use {\tt mlprof -profile-split <regexp>}, which will cause all
+duplicates of functions whose name has a prefix matching the regular
+expression to be treated separately.  This can be especially useful
+for higher-order utility functions like {\tt General.o}.
 
 \subsection{Using {\tt MLton.Profile}}
 
@@ -266,30 +232,34 @@
 % mlton -profile time fib-tak.sml
 % ./fib-tak
 % mlprof fib-tak mlmon.fib.out
-5.67 seconds of CPU time (0.0 seconds GC)
 function   cur 
 --------- -----
-fib       96.8%
-<unknown>  3.2%
-<gc>       0.0%
+fib       96.9%
+<unknown>  3.1%
 % mlprof fib-tak mlmon.tak.out
-0.72 seconds of CPU time (0.0 seconds GC)
+0.68 seconds of CPU time (0.00 seconds GC)
 function  cur  
 -------- ------
 tak      100.0%
-<gc>       0.0%
 % mlprof fib-tak mlmon.fib.out mlmon.tak.out mlmon.out
-6.39 seconds of CPU time (0.0 seconds GC)
+6.45 seconds of CPU time (0.00 seconds GC)
 function   cur 
 --------- -----
-fib       85.9%
-tak       11.3%
+fib       86.7%
+tak       10.5%
 <unknown>  2.8%
-<gc>       0.0%
 \end{verbatim}
 
 \subsection{Profiling details}
 
+Technically speaking, {\tt mlprof} produces a call-stack graph rather
+than a call graph, because it describes the set of possible call
+stacks.  The difference is in how tail calls are displayed.  For
+example if {\tt f} nontail calls {\tt g} and {\tt g} tail calls {\tt
+h}, then the call-stack graph has edges from {\tt f} to {\tt g} and
+{\tt f} to {\tt h}, while the call-keep has edges from {\tt f} to {\tt
+g} and {\tt g} to {\tt h}.
+
 Conceptually, both allocation and time profiling work in the same way.
 The compiler produces information that maps machine code positions to
 source functions that the profiler uses while the program is running
@@ -319,9 +289,5 @@
 
 There may be a few missed clock ticks or bytes allocated at the
 very end of the program after the data is written.
-
-For both forms of profiling, if your program calls {\tt
-Posix.Process.exit}, you will bypass the code responsible for writing
-out the profiling data and thus get no {\tt mlmon.out} file.
 
 Profiling has not been tested with threads.



1.5       +5 -4      mlton/include/main.h

Index: main.h
===================================================================
RCS file: /cvsroot/mlton/mlton/include/main.h,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -r1.4 -r1.5
--- main.h	23 Jun 2003 04:58:54 -0000	1.4
+++ main.h	7 Jul 2003 22:50:27 -0000	1.5
@@ -39,14 +39,15 @@
 	gcState.objectTypes = objectTypes;				\
 	gcState.objectTypesSize = cardof(objectTypes);			\
 	gcState.profileStack = ps;					\
+	gcState.saveGlobals = saveGlobals;				\
 	gcState.sourceLabels = sourceLabels;				\
 	gcState.sourceLabelsSize = cardof(sourceLabels);		\
-	gcState.saveGlobals = saveGlobals;				\
-	gcState.sources = sources;					\
-	gcState.sourcesSize = cardof(sources);				\
+	gcState.sourceNames = sourceNames;				\
+	gcState.sourceNamesSize = cardof(sourceNames);			\
 	gcState.sourceSeqs = sourceSeqs;				\
 	gcState.sourceSeqsSize = cardof(sourceSeqs);			\
-	gcState.sourceSuccessors = sourceSuccessors;			\
+	gcState.sources = sources;					\
+	gcState.sourcesSize = cardof(sources);				\
 	gcState.stringInits = stringInits;				\
 	gcState.stringInitsSize = cardof(stringInits);			\
 	MLton_init (argc, argv, &gcState);				\



1.17      +11 -17    mlton/man/mlprof.1

Index: mlprof.1
===================================================================
RCS file: /cvsroot/mlton/mlton/man/mlprof.1,v
retrieving revision 1.16
retrieving revision 1.17
diff -u -r1.16 -r1.17
--- mlprof.1	12 Mar 2003 20:35:47 -0000	1.16
+++ mlprof.1	7 Jul 2003 22:50:28 -0000	1.17
@@ -1,14 +1,13 @@
-.TH mlprof 1 "March 12, 2003"
+.TH mlprof 1 "July 7, 2003"
 .SH NAME
-\fBmlprof\fP \- display profiling information from MLton-compiled executable
+\fBmlprof\fP \- display profiling information for a MLton-compiled executable
 .SH SYNOPSIS
 \fBmlprof \fI[option ...] a.out [mlmon.out ...]\fR
 .SH DESCRIPTION
 .PP
 \fBmlprof\fP extracts information from an executable compiled by
-\fBMLton\fP with \fB-profile alloc\fP or \fB-profile time\fP option
-and the resulting \fBmlmon.out\fP file produced by running the
-executable.
+\fBMLton\fP with \fB-profile alloc\fP or \fB-profile time\fP and the
+resulting \fBmlmon.out\fP file produced by running the executable.
 
 The output of \fBmlprof\fP consists of an initial line indicating the
 total amount of CPU time or bytes allocated.  After this, source
@@ -27,10 +26,6 @@
 
 .SH OPTIONS
 .TP
-\fB-graph \fI{\fBexp\fP}\fP
-Show nodes in the call graph specified by \fBexp\fP.  For details on
-the allowed expressions, see the \fBMLton User Guide\fP.
-.TP
 \fB-graph-title \fIstring\fP
 Set the call-graph title.
 .TP
@@ -38,12 +33,10 @@
 Gray call-graph nodes according to stack %.  This only makes sense if
 the executable was compiled \fB-profile-stack true\fP.
 .TP
-\fB-ignore \fIregexp\fP
-Ignore functions whose name matches \fBregexp\fP.  With \fB-show-line
-true\fP, the function name is followed by a space and the file
-position, so you may need to put \fB .*\fP at the end of the regexp.
-Multiple \fB-ignore\fPs are allowed, and will be or-ed together as
-regexps.
+\fB-keep \fI{\fBexp\fP}\fP
+Only show functions specified by \fBexp\fP.  For details on the
+allowed expressions, see the \fBMLton User Guide\fP.  Multiple
+\fB-keep\fP expressions are not allowed.
 .TP
 \fB-mlmon \fIfile\fP
 Process the list of whitespace-separated \fBmlmon.out\fP files found in the
@@ -56,8 +49,9 @@
 Show the file and line for each function.
 .TP
 \fB-thresh \fIx\fP
-Only print information about functions whose percentage is above
-\fBx\fP, where 0 <= \fBx\fP <= 100.0.
+An abbreviation for \fB-keep '(thresh x)'\fP, which only print
+information about functions whose percentage is greater than or equal
+to \fBx\fP, where 0 <= \fBx\fP <= 100.0. 
 .TP
 \fB-tolerant \fI{\fBfalse\fP|\fBtrue\fP}\fP
 Print a warning about broken mlmon files, but do not exit.



1.50      +396 -237  mlton/mlprof/main.sml

Index: main.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/mlprof/main.sml,v
retrieving revision 1.49
retrieving revision 1.50
diff -u -r1.49 -r1.50
--- main.sml	3 Jun 2003 01:58:56 -0000	1.49
+++ main.sml	7 Jul 2003 22:50:28 -0000	1.50
@@ -16,12 +16,11 @@
 val sourcesIndexGC: int = 1
 
 val gray: bool ref = ref false
-val ignore: Regexp.t ref = ref Regexp.none
 val longName: bool ref = ref true
 val mlmonFiles: string list ref = ref []
 val raw = ref false
 val showLine = ref false
-val thresh: real ref = ref 0.0
+val splitReg: Regexp.t ref = ref Regexp.none
 val title: string option ref = ref NONE
 val tolerant: bool ref = ref false
 
@@ -34,6 +33,11 @@
 
       fun toString n =
 	 case n of
+	    NamePos {name, pos} => concat [name, "  ", pos]
+	  | Simple s => s
+
+      fun toStringMaybeLine n =
+	 case n of
 	    NamePos {name, pos} =>
 	       if !showLine
 		  then concat [name, "  ", pos]
@@ -67,6 +71,10 @@
 	       else [(name, Dot.Center)]
 	  | Simple s =>
 	       [(s, Dot.Center)]
+
+      val isGC =
+	 fn Simple "<gc>" => true
+	  | _ => false
    end
 
 structure Graph = DirectedGraph
@@ -76,21 +84,32 @@
    structure Edge = Edge
    structure Node = Node
 end
+local
+   open Dot
+in
+   structure EdgeOption = EdgeOption
+   structure NodeOption = NodeOption
+end
 
 structure AFile =
    struct
       datatype t = T of {callGraph: unit Graph.t,
 			 magic: word,
+			 master: {isSplit: bool,
+				  source: Source.t} vector,
 			 name: string,
-			 sources: {node: unit Node.t,
-				   source: Source.t} option vector}
+			 split: {masterIndex: int,
+				 node: unit Node.t} vector}
 
-      fun layout (T {magic, name, sources, ...}) =
+      fun layout (T {magic, name, master, ...}) =
 	 Layout.record
 	 [("name", String.layout name),
 	  ("magic", Word.layout magic),
-	  ("sources",
-	   Vector.layout (Option.layout (Source.layout o #source)) sources)]
+	  ("master",
+	   Vector.layout (fn {isSplit, source} =>
+			  Layout.record [("isSplit", Bool.layout isSplit),
+					 ("source", Source.layout source)])
+	   master)]
 
       fun new {afile: File.t}: t =
 	 if not (File.doesExist afile)
@@ -104,73 +123,73 @@
 	     let
 		fun line () = In.inputLine ins
 		val magic = valOf (Word.fromString (line ()))
-		val sourcesLength = valOf (Int.fromString (line ()))
+		fun vector (f: string -> 'a): 'a vector =
+		   Vector.tabulate (valOf (Int.fromString (line ())),
+				    fn _ => f (line ()))
+		val rc = Regexp.compileNFA (!splitReg)
+		val master =
+		   vector
+		   (fn s =>
+		    let
+		       val source = Source.fromString (String.dropSuffix (s, 1))
+		       val isSplit =
+			  Regexp.Compiled.matchesPrefix
+			  (rc, Source.toString source)
+		    in
+		       {isSplit = isSplit,
+			source = source}
+		    end)
 		val _ =
-		   if 0 = sourcesLength
-		      then Error.bug "doesn't appear to be compiled for profiling"
+		   if 0 = Vector.length master
+		      then
+			 Error.bug "doesn't appear to be compiled for profiling"
 		   else ()
-		val graph = Graph.new ()
-		val {get = nodeIndex, set = setNodeIndex, ...} =
-		   Property.getSetOnce
-		   (Node.plist, Property.initRaise ("index", Node.layout))
 		val sources =
-		   Vector.tabulate
-		   (sourcesLength, fn i =>
+		   vector
+		   (fn s =>
+		    case String.tokens (s, Char.isSpace) of
+		       [masterIndex, successorsIndex] =>
+			  {masterIndex = valOf (Int.fromString masterIndex),
+			   successorsIndex = valOf (Int.fromString
+						    successorsIndex)}
+		     | _ => Error.bug "AFile.new")
+		val sourceSeqs =
+		   vector
+		   (fn s =>
+		    Vector.fromListMap
+		    (String.tokens (s, Char.isSpace), fn s =>
+		     valOf (Int.fromString s)))
+		val graph = Graph.new ()
+		val split =
+		   Vector.mapi
+		   (sources, fn (i, {masterIndex, ...}) =>
 		    let
 		       val n = Graph.newNode graph
-		       val _ = setNodeIndex (n, i)
 		    in
-		       {node = n,
-			source = (Source.fromString
-				  (String.dropSuffix (line (), 1)))}
+		       {masterIndex = masterIndex,
+			node = n}
 		    end)
 		val _ =
-		   Int.for
-		   (0, sourcesLength, fn i =>
-		    let
-		       val from = #node (Vector.sub (sources, i))
-		    in
-		       List.foreach
-		       (String.tokens (line (), Char.isSpace), fn s =>
-			let
-			   val suc = valOf (Int.fromString s)
-			   val _ =
-			      Graph.addEdge
-			      (graph,
-			       {from = from,
-				to = #node (Vector.sub (sources, suc))})
-			in
-			   ()
-			end)
-		    end)
+		   Vector.foreach2
+		   (sources, split,
+		    fn ({successorsIndex, ...}, {node = from, ...}) =>
+		    Vector.foreach
+		    (Vector.sub (sourceSeqs, successorsIndex),
+		     fn to =>
+		     (Graph.addEdge
+		      (graph, {from = from,
+			       to = #node (Vector.sub (split, to))})
+		      ; ())))
 		val _ =
 		   case line () of
 		      "" => ()
 		    | _ => Error.bug "expected end of file"
-		val rc = Regexp.compileNFA (!ignore)
-		val {get = shouldIgnore: unit Node.t -> bool, ...} =
-		   Property.get
-		   (Node.plist,
-		    Property.initFun
-		    (fn n =>
-		     Regexp.Compiled.matchesAll
-		     (rc,
-		      Source.toString
-		      (#source (Vector.sub (sources, nodeIndex n))))))
-		val (graph, {newNode, ...}) =
-		   Graph.ignoreNodes (graph, shouldIgnore)
-		val (graph, {node = coerceNode, ...}) = Graph.coerce graph
-		val sources =
-		   Vector.map (sources, fn {node, source} =>
-			       if shouldIgnore node
-				  then NONE
-			       else SOME {node = coerceNode (newNode node),
-					  source = source})
 	     in
 		T {callGraph = graph,
 		   magic = magic,
+		   master = master,
 		   name = afile,
-		   sources = sources}
+		   split = split}
 	     end)
    end
 
@@ -209,36 +228,60 @@
 structure Counts =
    struct
       datatype t =
-	 Current of IntInf.t vector
+	 Current of {master: IntInf.t vector,
+		     split: IntInf.t vector}
        | Empty
-       | Stack of {current: IntInf.t,
-		   stack: IntInf.t,
-		   stackGC: IntInf.t} vector
+       | Stack of {master: {current: IntInf.t,
+			    stack: IntInf.t,
+			    stackGC: IntInf.t} vector,
+		   split: {current: IntInf.t,
+			   stack: IntInf.t,
+			   stackGC: IntInf.t} vector}
 
       val layout =
-	 fn Current v => Vector.layout IntInf.layout v
+	 fn Current {master, split} =>
+	      Layout.record [("master", Vector.layout IntInf.layout master),
+			     ("split", Vector.layout IntInf.layout split)]
 	  | Empty => Layout.str "empty"
-	  | Stack v =>
-	       Vector.layout
-	       (fn {current, stack, stackGC} =>
-		Layout.record [("current", IntInf.layout current),
-			       ("stack", IntInf.layout stack),
-			       ("stackGC", IntInf.layout stackGC)])
-	       v
+	  | Stack {master, split} =>
+	       let
+		  fun lay v =
+		     Vector.layout
+		     (fn {current, stack, stackGC} =>
+		      Layout.record [("current", IntInf.layout current),
+				     ("stack", IntInf.layout stack),
+				     ("stackGC", IntInf.layout stackGC)])
+		     v
+	       in
+		  Layout.record [("master", lay master),
+				 ("split", lay split)]
+	       end
 
       fun merge (c: t, c': t): t =
 	 case (c, c') of
-	    (Current v, Current v') =>
-	       Current (Vector.map2 (v, v', IntInf.+))
+	    (Current {master = m, split = s},
+	     Current {master = m', split = s'}) =>
+	       let
+		  fun merge (v, v') = Vector.map2 (v, v', IntInf.+)
+	       in
+		  Current {master = merge (m, m'),
+			   split = merge (s, s')}
+	       end
 	  | (Empty, _) => c'
 	  | (_, Empty) => c
-	  | (Stack v, Stack v') =>
-	       Stack (Vector.map2
-		      (v, v', fn ({current = c, stack = s, stackGC = g},
-				  {current = c', stack = s', stackGC = g'}) =>
-		       {current = IntInf.+ (c, c'),
-			stack = IntInf.+ (s, s'),
-			stackGC = IntInf.+ (g, g')}))
+	  | (Stack {master = m, split = s}, Stack {master = m', split = s'}) =>
+	       let
+		  fun merge (v, v') =
+		     Vector.map2
+		     (v, v', fn ({current = c, stack = s, stackGC = g},
+				 {current = c', stack = s', stackGC = g'}) =>
+		      {current = IntInf.+ (c, c'),
+		       stack = IntInf.+ (s, s'),
+		       stackGC = IntInf.+ (g, g')})
+	       in
+		  Stack {master = merge (m, m'),
+			 split = merge (s, s')}
+	       end
 	  | _ =>
 	       Error.bug
 	       "cannot merge -profile-stack false with -profile-stack true"
@@ -303,14 +346,16 @@
 		case String.tokens (line (), Char.isSpace) of
 		   [total, totalGC] => (s2i total, s2i totalGC)
 		 | _ => Error.bug "invalid totals"
-	     fun getCounts (fromLine: string -> 'a): 'a vector =
+	     fun getCounts (f: string -> 'a): {master: 'a vector,
+					       split: 'a vector} =
 		let
-		   fun loop ac =
-		      case In.inputLine ins of
-			 "" => Vector.fromListRev ac
-		       | s => loop (fromLine s :: ac)
+		   fun vector () =
+		      Vector.tabulate (valOf (Int.fromString (line ())),
+				       fn _ => f (line ()))
+		   val split = vector ()
+		   val master = vector ()
 		in
-		   loop []
+		   {master = master, split = split}
 		end
 	     val counts =
 		case style of
@@ -337,8 +382,8 @@
 	  end)
    
       fun merge (T {counts = c, kind = k, magic = m, total = t, totalGC = g},
-		 T {counts = c', kind = k', magic = m', total = t', totalGC = g',
-		    ...}): t =
+		 T {counts = c', kind = k', magic = m', total = t',
+		    totalGC = g'}): t =
 	 if m <> m'
 	    then Error.bug "wrong magic number"
 	 else
@@ -439,7 +484,11 @@
 					      [Sexp.Atom x] =>
 						 (case Real.fromString x of
 						     NONE => err ()
-						   | SOME x => Atomic (f x))
+						   | SOME x =>
+							if 0.0 <= x
+							   andalso x <= 100.0
+							   then Atomic (f x)
+							else err ())
 					    | _ => err ()
 					datatype z = datatype Atomic.t
 				     in
@@ -572,33 +621,75 @@
 	 end
    end
    
-val graphPred: NodePred.t option ref = ref NONE
+val keep: NodePred.t ref = ref NodePred.All
+
+val ticksPerSecond = 100.0
 
-fun display (AFile.T {callGraph, name = aname, sources, ...},
+fun display (AFile.T {callGraph, master, name = aname, split, ...},
 	     ProfFile.T {counts, kind, total, totalGC, ...}): unit =
    let
       val {get = nodeInfo: (unit Node.t
-			    -> {keep: bool ref,
-				mayKeep: (Atomic.t -> bool) ref,
-				options: Dot.NodeOption.t list ref}), ...} =
-	 Property.get (Node.plist,
-		       Property.initFun (fn _ => {keep = ref false,
-						  mayKeep = ref (fn _ => false),
-						  options = ref []}))
-      val graph = Graph.new ()
-      val ticksPerSecond = 100.0
-      val thresh = !thresh
+			    -> {index: int,
+				keep: bool ref,
+				mayKeep: (Atomic.t -> bool) ref}),
+	   set = setNodeInfo, ...} =
+	 Property.getSetOnce (Node.plist,
+			      Property.initRaise ("info", Node.layout))
+      val _ =
+	 Vector.foreachi (split, fn (i, {node, ...}) =>
+			  setNodeInfo (node,
+				       {index = i,
+					keep = ref false,
+					mayKeep = ref (fn _ => false)}))
+      val profileStack =
+	 case counts of
+	    Counts.Current _ => false
+	  | Counts.Empty => false
+	  | Counts.Stack _ => true
       val totalReal = Real.fromIntInf (IntInf.+ (total, totalGC))
-      fun per (ticks: IntInf.t): real * string list =
+      val per: IntInf.t -> real =
+	 if Real.equals (0.0, totalReal)
+	    then fn _ => 0.0
+	 else
+	    fn ticks => 100.0 * Real.fromIntInf ticks / totalReal
+      fun doit ({master = masterCount: 'a vector,
+		 split = splitCount: 'a vector},
+		f: 'a -> {current: IntInf.t,
+			  stack: IntInf.t,
+			  stackGC: IntInf.t}) =
 	 let
-	    val rticks = Real.fromIntInf ticks
-	    val per =
-	       if Real.equals (0.0, totalReal)
-		  then 0.0
-	       else 100.0 * rticks / totalReal
-	    val row =
-	       (concat [Real.format (per, Real.Format.fix (SOME 1)),
-			"%"])
+	    val _ =
+	       Vector.foreachi
+	       (split, fn (i, {masterIndex, node, ...}) =>
+		let
+		   val {mayKeep, ...} = nodeInfo node
+		   val {isSplit, source, ...} = Vector.sub (master, masterIndex)
+		   val name = Source.toString source
+		in
+		   mayKeep :=
+		   (fn a =>
+		    let
+		       fun thresh (x: real, sel) =
+			  let
+			     val (v, i) =
+				if isSplit
+				   then (splitCount, i)
+				else (masterCount, masterIndex)
+			  in
+			     per (sel (f (Vector.sub (v, i)))) >= x
+			  end
+		       datatype z = datatype Atomic.t
+		    in
+		       case a of
+			  Name (_, rc) =>
+			     Regexp.Compiled.matchesPrefix (rc, name)
+			| Thresh x => thresh (x, #current)
+			| ThreshGC x => thresh (x, #stackGC)
+			| ThreshStack x => thresh (x, #stack)
+		    end)
+		end)
+	    fun row (ticks: IntInf.t): string list =
+	       (concat [Real.format (per ticks, Real.Format.fix (SOME 1)), "%"])
 	       :: (if !raw
 		      then
 			 [concat
@@ -609,129 +700,199 @@
 			    | Kind.Time =>
 				 ["(",
 				  Real.format
-				  (rticks / ticksPerSecond,
+				  (Real.fromIntInf ticks / ticksPerSecond,
 				   Real.Format.fix (SOME 2)),
 				  "s)"])]
 		   else [])
+	    fun info (source: Source.t, a: 'a) =
+	       let
+		  val {current, stack, stackGC} = f a
+		  val row =
+		     row current
+		     @ (if profileStack
+			   then row stack @ row stackGC
+			else [])
+		  val pc = per current
+		  val tableInfo = 
+		     if IntInf.> (current, IntInf.fromInt 0)
+			orelse IntInf.> (stack, IntInf.fromInt 0)
+			orelse IntInf.> (stackGC, IntInf.fromInt 0)
+			then SOME {per = pc,
+				   row = Source.toStringMaybeLine source :: row}
+		     else NONE
+		  val nodeOptions =
+		     [Dot.NodeOption.Shape Dot.Box,
+		      Dot.NodeOption.Label
+		      (Source.toDotLabel source
+		       @ (if IntInf.> (current, IntInf.zero)
+			     then [(concat (List.separate (row, " ")),
+				    Dot.Center)]
+			  else [])),
+		      Dot.NodeOption.Color
+		      (if !gray
+			  then DotColor.gray (100 - Real.round (per stack))
+		       else DotColor.Black)]
+	       in
+		  {nodeOptions = nodeOptions,
+		   tableInfo = tableInfo}
+	       end
+	    val masterOptions =
+	       Vector.map2
+	       (master, masterCount, fn ({source, ...}, a) =>
+		info (source, a))
+	    val splitOptions =
+	       Vector.map2
+	       (split, splitCount, fn ({masterIndex, ...}, a) =>
+		info (#source (Vector.sub (master, masterIndex)), a))
 	 in
-	    (per, row)
+	    (masterOptions, splitOptions)
 	 end
-      val profileStack =
+      val (masterInfo, splitInfo) =
 	 case counts of
-	    Counts.Current _ => false
-	  | Counts.Empty => false
-	  | Counts.Stack _ => true
-      fun doit (v, f) =
-	 Vector.mapi
-	 (v, fn (i, x) =>
-	  let
-	     val {per, perGC, perStack, row, sortPer} = f x
-	  in
-	     case Vector.sub (sources, i) of
-		NONE => NONE
-	      | SOME {node, source, ...} =>
-		   let
-		      val {mayKeep, options, ...} = nodeInfo node
-		      val _ =
-			 mayKeep :=
-			 (fn a =>
-			  let
-			     datatype z = datatype Atomic.t
-			  in
-			     case a of
-				Name (_, rc) =>
-				   Regexp.Compiled.matchesAll
-				   (rc, Source.toString source)
-			      | Thresh x => per >= x
-			      | ThreshGC x => perGC >= x
-			      | ThreshStack x => perStack >= x
-			  end)
-		      val _ = 
-			 options :=
-			 List.append
-			 ([Dot.NodeOption.Label
-			   (Source.toDotLabel source
-			    @ (if per > 0.0
-				  then [(concat (List.separate (row, " ")),
-					 Dot.Center)]
-			       else [])),
-			   Dot.NodeOption.Shape Dot.Box,
-			   if !gray
-			      then
-				 Dot.NodeOption.Color
-				 (DotColor.gray (100 - (Real.round perStack)))
-			   else Dot.NodeOption.Color DotColor.Black],
-			  !options)
-		      val showInTable =
-			 per > 0.0
-			 andalso (per >= thresh
-				  orelse (not profileStack
-					  andalso i = sourcesIndexGC))
-		   in
-		      if showInTable
-			 then SOME {sortPer = sortPer,
-				    row = Source.toString source :: row}
-		      else NONE
-		   end
-	  end)
-      val counts =
-	 case counts of
-	    Counts.Current v =>
-	       doit (v, fn z =>
-		     let
-			val (p, r) = per z
-		     in
-			{per = p, perGC = 0.0, perStack = 0.0,
-			 row = r, sortPer = p}
-		     end)
+	    Counts.Current ms =>
+	       doit (ms, fn z => {current = z,
+				  stack = IntInf.zero,
+				  stackGC = IntInf.zero})
 	  | Counts.Empty =>
-	       let
-		  val (p, r) = per IntInf.zero
-	       in
-		  doit (Vector.new (Vector.length sources, ()),
-			fn () => {per = p, perGC = 0.0, perStack = 0.0,
-				  row = r, sortPer = p})
-	       end
-	  | Counts.Stack v =>
-	       doit (v, fn {current, stack, stackGC} =>
-		     let
-			val (cp, cr) = per current
-			val (sp, sr) = per stack
-			val (gp, gr) = per stackGC
-		     in
-			{per = sp, perGC = gp, perStack = sp,
-			 row = List.concat [cr, sr, gr],
-			 sortPer = cp}
-		     end)
-      (* Display the subgraph specified by -graph. *)
-      val graphPred =
-	 case !graphPred of
-		 NONE =>
-	       let
-		  datatype z = datatype NodePred.t
-		  datatype z = datatype Atomic.t
-	       in
-		  if profileStack
-		     then Atomic (ThreshStack thresh)
-		  else PathTo (Atomic (Thresh thresh))
-	       end
-	  | SOME p => p
+	       doit ({master = Vector.new (Vector.length master, ()),
+		      split = Vector.new (Vector.length split, ())},
+		     fn () => {current = IntInf.zero,
+			       stack = IntInf.zero,
+			       stackGC = IntInf.zero})
+	  | Counts.Stack ms =>
+	       doit (ms, fn z => z)
+      val keep = !keep
       val keepNodes =
 	 NodePred.nodes
-	 (graphPred, callGraph, fn (n, a) => (! (#mayKeep (nodeInfo n))) a)
+	 (keep, callGraph, fn (n, a) => (! (#mayKeep (nodeInfo n))) a)
       val _ = Vector.foreach (keepNodes, fn n =>
 			      #keep (nodeInfo n) := true)
-      val (subgraph, {newNode, ...}) =
-	 Graph.subgraph (callGraph, ! o #keep o nodeInfo)
-      val {get = oldNode, set = setOldNode, ...} =
+      (* keep a master node if it is not split and some copy of it is kept. *)
+      val keepMaster = Array.new (Vector.length master, false)
+      val _ =
+	 Vector.foreach
+	 (split, fn {masterIndex, node, ...} =>
+	  let
+	     val {keep, ...} = nodeInfo node
+	     val {isSplit, ...} = Vector.sub (master, masterIndex)
+	  in
+	     if !keep andalso not isSplit
+		then Array.update (keepMaster, masterIndex, true)
+	     else ()
+	  end)
+      datatype keep = T
+      val keepGraph: keep Graph.t = Graph.new ()
+      val {get = nodeOptions: keep Node.t -> NodeOption.t list,
+	   set = setNodeOptions, ...} =
 	 Property.getSetOnce (Node.plist,
-			      Property.initRaise ("old node", Node.layout))
+			      Property.initRaise ("options", Node.layout))
+      val tableInfos = ref []
+      fun newNode {nodeOptions: NodeOption.t list,
+		   tableInfo} =
+	 let
+	    val _ = Option.app (tableInfo, fn z => List.push (tableInfos, z))
+	    val n = Graph.newNode keepGraph
+	    val _ = setNodeOptions (n, nodeOptions)
+	 in
+	    n
+	 end
+      val masterNodes =
+	 Vector.tabulate
+	 (Vector.length master, fn i =>
+	  if Array.sub (keepMaster, i)
+	     then SOME (newNode (Vector.sub (masterInfo, i)))
+	  else NONE)
+      val splitNodes =
+	 Vector.mapi
+	 (split, fn (i, {masterIndex, node, ...}) =>
+	  let
+	     val {keep, ...} = nodeInfo node
+	     val {isSplit, ...} = Vector.sub (master, masterIndex)
+	  in
+	     if isSplit
+		then
+		   if !keep
+		      then SOME (newNode (Vector.sub (splitInfo, i)))
+		   else NONE
+	     else Vector.sub (masterNodes, masterIndex)
+	  end)
+      val _ =
+	 Graph.foreachEdge
+	 (callGraph, fn (from, e) =>
+	  let
+	     val to = Edge.to e
+	     fun f n = Vector.sub (splitNodes, #index (nodeInfo n))
+	  in
+	     case (f from, f to) of
+		(SOME from, SOME to) =>
+		   (Graph.addEdge (keepGraph, {from = from, to = to})
+		    ; ())
+	      | _ => ()
+	  end)
+      val _ = Graph.removeDuplicateEdges keepGraph
+      val {get = edgeOptions: keep Edge.t -> EdgeOption.t list ref, ...} =
+	 Property.get (Edge.plist, Property.initFun (fn _ => ref []))
+      (* Add a dashed edge from A to B if there is path from A to B of length
+       * >= 2 going through only ignored nodes.
+       *)
+      fun newNode (n: unit Node.t): keep Node.t option =
+	 Vector.sub (splitNodes, #index (nodeInfo n))
+      fun reach (root: unit Node.t, f: keep Node.t -> unit): unit =
+	 let
+	    val {get = isKept: keep Node.t -> bool ref, ...} =
+	       Property.get (Node.plist, Property.initFun (fn _ => ref false))
+	    val {get = isSeen: unit Node.t -> bool ref, ...} =
+	       Property.get (Node.plist, Property.initFun (fn _ => ref false))
+	    fun loop n =
+	       List.foreach
+	       (Node.successors n, fn e =>
+		let
+		   val n = Edge.to e
+		   val s = isSeen n
+		in
+		   if !s
+		      then ()
+		   else
+		      let
+			 val _ = s := true
+		      in
+			 case newNode n of
+			    NONE => loop n
+			  | SOME keepN => 
+			       let
+				  val r = isKept keepN
+			       in
+				  if !r
+				     then ()
+				  else (r := true; f keepN)
+			       end
+		      end
+		end)
+	    val _ =
+	       List.foreach (Node.successors root, fn e =>
+			     let
+				val n = Edge.to e
+			     in
+				if Option.isNone (newNode n)
+				   then loop n
+				else ()
+			     end)
+	 in
+	    ()
+	 end
       val _ =
-	 Graph.foreachNode
-	 (callGraph, fn n =>
-	  if !(#keep (nodeInfo n))
-	     then setOldNode (newNode n, n)
-	  else ())
+	 Vector.foreach2
+	 (split, splitNodes, fn ({node = from, ...}, z) =>
+	  Option.app
+	  (z, fn from' =>
+	   (reach (from, fn to =>
+		   let
+		      val e = Graph.addEdge (keepGraph, {from = from', to = to})
+		      val _ = List.push (edgeOptions e,
+					 EdgeOption.Style Dot.Dotted)
+		   in
+		      ()
+		   end))))
       val title =
 	 case !title of
 	    NONE => concat [aname, " call-stack graph"]
@@ -740,18 +901,16 @@
 	 File.withOut
 	 (concat [aname, ".dot"], fn out =>
 	  Layout.output
-	  (Graph.layoutDot (subgraph,
-			    fn _ => {edgeOptions = fn _ => [],
-				     nodeOptions =
-				     fn n => ! (#options (nodeInfo (oldNode n))),
+	  (Graph.layoutDot (keepGraph,
+			    fn _ => {edgeOptions = ! o edgeOptions,
+				     nodeOptions = nodeOptions,
 				     options = [],
 				     title = title}),
 	   out))
       (* Display the table. *)
       val tableRows =
 	 QuickSort.sortVector
-	 (Vector.keepAllMap (counts, fn z => z),
-	  fn (z, z') => #sortPer z >= #sortPer z')
+	 (Vector.fromList (!tableInfos), fn (z, z') => #per z >= #per z')
       val _ = 
 	 print
 	 (concat
@@ -802,20 +961,15 @@
       open Popt
    in
       List.map
-      ([(Normal, "graph", " <pred>", "show graph nodes",
-	 SpaceString (fn s =>
-		      graphPred := SOME (NodePred.fromString s)
-		      handle e => usage (concat ["invalid -graph arg: ",
-						 Exn.toString e]))),
-	(Normal, "graph-title", " <string>", "set call-graph title",
+      ([(Normal, "graph-title", " <string>", "set call-graph title",
 	 SpaceString (fn s => title := SOME s)),
 	(Normal, "gray", " {false|true}", "gray nodes according to stack %",
 	 boolRef gray),
-	(Normal, "ignore", " <regexp>", "ignore matching functions",
+	(Normal, "keep", " <pred>", "which functions to display",
 	 SpaceString (fn s =>
-		      case Regexp.fromString s of
-			 NONE => usage (concat ["invalid -ignore regexp: ", s])
-		       | SOME (r, _) => ignore := Regexp.or [r, !ignore])),
+		      keep := NodePred.fromString s
+		      handle e => usage (concat ["invalid -keep arg: ",
+						 Exn.toString e]))),
 	(Expert, "long-name", " {true|false}",
 	 " show long names of functions",
 	 boolRef longName),
@@ -828,10 +982,15 @@
 	 boolRef raw),
 	(Normal, "show-line", " {false|true}", "show line numbers",
 	 boolRef showLine),
-	(Normal, "thresh", " {0|1|...|100}", "only show counts above threshold",
+	(Normal, "split", " <regexp>", "split matching functions",
+	 SpaceString (fn s =>
+		      case Regexp.fromString s of
+			 NONE => usage (concat ["invalid -split regexp: ", s])
+		       | SOME (r, _) => splitReg := Regexp.or [r, !splitReg])),
+	(Normal, "thresh", " [0.0,100.0]", "-keep (thresh x)",
 	 Real (fn x => if x < 0.0 orelse x > 100.0
 			 then usage "invalid -thresh"
-		      else thresh := x)),
+		      else keep := NodePred.Atomic (Atomic.Thresh x))),
 	(Normal, "tolerant", " {false|true}", "ignore broken mlmon files",
 	 boolRef tolerant)],
        fn (style, name, arg, desc, opt) =>



1.16      +3 -3      mlton/mlprof/mlprof-stubs.cm

Index: mlprof-stubs.cm
===================================================================
RCS file: /cvsroot/mlton/mlton/mlprof/mlprof-stubs.cm,v
retrieving revision 1.15
retrieving revision 1.16
diff -u -r1.15 -r1.16
--- mlprof-stubs.cm	5 Jul 2003 17:13:12 -0000	1.15
+++ mlprof-stubs.cm	7 Jul 2003 22:50:28 -0000	1.16
@@ -63,9 +63,11 @@
 ../lib/mlton/basic/exn.sml
 ../lib/mlton/basic/promise.sig
 ../lib/mlton/basic/promise.sml
-../lib/mlton/basic/instream0.sml
 ../lib/mlton/basic/relation.sig
 ../lib/mlton/basic/relation.sml
+../lib/mlton/basic/bool.sig
+../lib/mlton/basic/bool.sml
+../lib/mlton/basic/instream0.sml
 ../lib/mlton/basic/ring.sig
 ../lib/mlton/basic/ring-with-identity.sig
 ../lib/mlton/basic/stream.sig
@@ -95,8 +97,6 @@
 ../lib/mlton/basic/trace.sig
 ../lib/mlton/basic/trace.sml
 ../lib/mlton/basic/ring-with-identity.fun
-../lib/mlton/basic/bool.sig
-../lib/mlton/basic/bool.sml
 ../lib/mlton/basic/euclidean-ring.fun
 ../lib/mlton/basic/integer.fun
 ../lib/mlton/basic/int.sml



1.16      +3 -3      mlton/mlprof/mlprof.cm

Index: mlprof.cm
===================================================================
RCS file: /cvsroot/mlton/mlton/mlprof/mlprof.cm,v
retrieving revision 1.15
retrieving revision 1.16
diff -u -r1.15 -r1.16
--- mlprof.cm	23 Jun 2003 04:58:54 -0000	1.15
+++ mlprof.cm	7 Jul 2003 22:50:28 -0000	1.16
@@ -31,9 +31,11 @@
 ../lib/mlton/basic/exn.sml
 ../lib/mlton/basic/promise.sig
 ../lib/mlton/basic/promise.sml
-../lib/mlton/basic/instream0.sml
 ../lib/mlton/basic/relation.sig
 ../lib/mlton/basic/relation.sml
+../lib/mlton/basic/bool.sig
+../lib/mlton/basic/bool.sml
+../lib/mlton/basic/instream0.sml
 ../lib/mlton/basic/ring.sig
 ../lib/mlton/basic/ring-with-identity.sig
 ../lib/mlton/basic/stream.sig
@@ -63,8 +65,6 @@
 ../lib/mlton/basic/trace.sig
 ../lib/mlton/basic/trace.sml
 ../lib/mlton/basic/ring-with-identity.fun
-../lib/mlton/basic/bool.sig
-../lib/mlton/basic/bool.sml
 ../lib/mlton/basic/euclidean-ring.fun
 ../lib/mlton/basic/integer.fun
 ../lib/mlton/basic/int.sml



1.21      +1 -1      mlton/mlton/mlton-stubs-1997.cm

Index: mlton-stubs-1997.cm
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/mlton-stubs-1997.cm,v
retrieving revision 1.20
retrieving revision 1.21
diff -u -r1.20 -r1.21
--- mlton-stubs-1997.cm	5 Jul 2003 17:13:13 -0000	1.20
+++ mlton-stubs-1997.cm	7 Jul 2003 22:50:28 -0000	1.21
@@ -354,9 +354,9 @@
 backend/c-function.fun
 backend/runtime.fun
 backend/err.sml
+backend/machine.sig
 backend/profile-label.fun
 backend/machine-atoms.fun
-backend/machine.sig
 backend/machine.fun
 ../lib/mlton/basic/unique-set.sig
 ../lib/mlton/basic/unique-set.fun



1.26      +1 -1      mlton/mlton/mlton-stubs.cm

Index: mlton-stubs.cm
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/mlton-stubs.cm,v
retrieving revision 1.25
retrieving revision 1.26
diff -u -r1.25 -r1.26
--- mlton-stubs.cm	5 Jul 2003 17:13:13 -0000	1.25
+++ mlton-stubs.cm	7 Jul 2003 22:50:28 -0000	1.26
@@ -353,9 +353,9 @@
 backend/c-function.fun
 backend/runtime.fun
 backend/err.sml
+backend/machine.sig
 backend/profile-label.fun
 backend/machine-atoms.fun
-backend/machine.sig
 backend/machine.fun
 ../lib/mlton/basic/unique-set.sig
 ../lib/mlton/basic/unique-set.fun



1.69      +1 -1      mlton/mlton/mlton.cm

Index: mlton.cm
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/mlton.cm,v
retrieving revision 1.68
retrieving revision 1.69
diff -u -r1.68 -r1.69
--- mlton.cm	24 Jun 2003 20:14:21 -0000	1.68
+++ mlton.cm	7 Jul 2003 22:50:28 -0000	1.69
@@ -321,9 +321,9 @@
 backend/c-function.fun
 backend/runtime.fun
 backend/err.sml
+backend/machine.sig
 backend/profile-label.fun
 backend/machine-atoms.fun
-backend/machine.sig
 backend/machine.fun
 ../lib/mlton/basic/unique-set.sig
 ../lib/mlton/basic/unique-set.fun



1.56      +22 -42    mlton/mlton/backend/backend.fun

Index: backend.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/backend.fun,v
retrieving revision 1.55
retrieving revision 1.56
diff -u -r1.55 -r1.56
--- backend.fun	23 Jun 2003 04:58:56 -0000	1.55
+++ backend.fun	7 Jul 2003 22:50:28 -0000	1.56
@@ -157,31 +157,17 @@
       val program = pass ("insertSignalChecks", SignalCheck.insert, program)
       val program = pass ("implementHandlers", ImplementHandlers.doit, program)
       val _ = R.Program.checkHandlers program
-      val {frameProfileIndices, labels = profileLabels, program, sources,
-	   sourceSeqs, sourceSuccessors} =
+      val (program, makeProfileInfo) =
 	 Control.passTypeCheck
-	 {display = Control.Layouts (fn ({program, ...}, output) =>
+	 {display = Control.Layouts (fn ((program, _), output) =>
 				     Rssa.Program.layouts (program, output)),
 	  name = "profile",
 	  style = Control.No,
 	  suffix = "rssa",
 	  thunk = fn () => Profile.profile program,
-	  typeCheck = R.Program.typeCheck o #program}
+	  typeCheck = R.Program.typeCheck o #1}
       val profile = !Control.profile <> Control.ProfileNone
       val profileStack = profile andalso !Control.profileStack
-      val frameProfileIndex =
-	 if profile
-	    then
-	       let
-		  val {get, set, ...} =
-		     Property.getSetOnce
-		     (Label.plist,
-		      Property.initRaise ("frameProfileIndex", Label.layout))
-		  val _ = Vector.foreach (frameProfileIndices, set)
-	       in
-		  get
-	       end
-	 else fn _ => 0
       val _ =
 	 let
 	    open Control
@@ -225,7 +211,7 @@
 	  end)
       (* FrameInfo. *)
       local
-	 val frameSources = ref []
+	 val frameLabels = ref []
 	 val frameLayouts = ref []
 	 val frameLayoutsCounter = Counter.new 0
 	 val _ = IntSet.reset ()
@@ -249,11 +235,11 @@
 	 fun allFrameInfo () =
 	    let
 	       (* Reverse lists because the index is from back of list. *)
-	       val frameOffsets = Vector.fromListRev (!frameOffsets)
+	       val frameLabels = Vector.fromListRev (!frameLabels)
 	       val frameLayouts = Vector.fromListRev (!frameLayouts)
-	       val frameSources = Vector.fromListRev (!frameSources)
+	       val frameOffsets = Vector.fromListRev (!frameOffsets)
 	    in
-	       (frameLayouts, frameOffsets, frameSources)
+	       (frameLabels, frameLayouts, frameOffsets)
 	    end
 	 fun getFrameLayoutsIndex {isC: bool,
 				   label: Label.t,
@@ -261,7 +247,6 @@
 				   size: int}: int =
 	    let
 	       val foi = frameOffsetsIndex (IntSet.fromList offsets)
-	       val profileIndex = frameProfileIndex label
 	       fun new () =
 		  let
 		     val _ =
@@ -269,34 +254,34 @@
 				   {frameOffsetsIndex = foi,
 				    isC = isC,
 				    size = size})
-		     val _ =
-			if profile
-			   then List.push (frameSources, profileIndex)
-			else ()
+		     val _ = List.push (frameLabels, label)
 		  in
 		     Counter.next frameLayoutsCounter
 		  end
 	    in
+	       (* We need to give each frame its own layout index in two cases.
+		* 1. If we are using the C codegen, in which case we want the
+		*    indices in a chunk to be consecutive integers so that gcc
+		*    will use a jump table.
+		* 2. If we are profiling, we want every frame to have a
+		*    different index so that it can have its own profiling info.
+		*    This will be created by the call to makeProfileInfo at the
+		*    end of the backend.
+		*)
 	       if not (!Control.Native.native)
-		  then
-		     (* Assign the entries of each chunk consecutive integers
-		      * so that gcc will use a jump table.
-		      *)
-		     new ()
+		  orelse !Control.profile <> Control.ProfileNone
+		  then new ()
 	       else
 	       #frameLayoutsIndex
 	       (HashSet.lookupOrInsert
 		(table, Word.fromInt foi,
-		 fn {frameOffsetsIndex = foi', isC = isC',
-		     profileIndex = pi', size = s', ...} =>
+		 fn {frameOffsetsIndex = foi', isC = isC', size = s', ...} =>
 		 foi = foi'
 		 andalso isC = isC'
-		 andalso profileIndex = pi'
 		 andalso size = s',
 		 fn () => {frameLayoutsIndex = new (),
 			   frameOffsetsIndex = foi,
 			   isC = isC,
-			   profileIndex = profileIndex,
 			   size = size}))
 	    end
       end
@@ -1061,7 +1046,7 @@
        *)
       val _ = List.foreach (chunks, fn M.Chunk.T {blocks, ...} =>
 			    Vector.foreach (blocks, Label.clear o M.Block.label))
-      val (frameLayouts, frameOffsets, frameSources) = allFrameInfo ()
+      val (frameLabels, frameLayouts, frameOffsets) = allFrameInfo ()
       val maxFrameSize =
 	 List.fold
 	 (chunks, 0, fn (M.Chunk.T {blocks, ...}, max) =>
@@ -1099,12 +1084,7 @@
 	      max
 	   end))
       val maxFrameSize = Runtime.wordAlignInt maxFrameSize
-      val profileInfo =
-	 ProfileInfo.T {frameSources = frameSources,
-			labels = profileLabels,
-			sourceSeqs = sourceSeqs,
-			sourceSuccessors = sourceSuccessors,
-			sources = sources}
+      val profileInfo = makeProfileInfo {frames = frameLabels}
    in
       Machine.Program.T 
       {chunks = chunks,



1.50      +76 -58    mlton/mlton/backend/machine.fun

Index: machine.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/machine.fun,v
retrieving revision 1.49
retrieving revision 1.50
diff -u -r1.49 -r1.50
--- machine.fun	23 Jun 2003 04:58:56 -0000	1.49
+++ machine.fun	7 Jul 2003 22:50:28 -0000	1.50
@@ -625,15 +625,21 @@
 	 T of {frameSources: int vector,
 	       labels: {label: ProfileLabel.t,
 			sourceSeqsIndex: int} vector,
+	       names: string vector,
 	       sourceSeqs: int vector vector,
-	       sourceSuccessors: int vector,
-	       sources: SourceInfo.t vector}
+	       sources: {nameIndex: int,
+			 successorsIndex: int} vector}
+
+      val empty = T {frameSources = Vector.new0 (),
+		     labels = Vector.new0 (),
+		     names = Vector.new0 (),
+		     sourceSeqs = Vector.new0 (),
+		     sources = Vector.new0 ()}
 
       fun clear (T {labels, ...}) =
 	 Vector.foreach (labels, ProfileLabel.clear o #label)
 
-      fun layout (T {frameSources, labels, sourceSeqs, sourceSuccessors,
-		     sources}) =
+      fun layout (T {frameSources, labels, names, sourceSeqs, sources}) =
 	 Layout.record
 	 [("frameSources", Vector.layout Int.layout frameSources),
 	  ("labels",
@@ -643,44 +649,46 @@
 			   ("sourceSeqsIndex",
 			    Int.layout sourceSeqsIndex)])
 	   labels),
+	  ("names", Vector.layout String.layout names),
 	  ("sourceSeqs", Vector.layout (Vector.layout Int.layout) sourceSeqs),
-	  ("sourceSuccessors", Vector.layout Int.layout sourceSuccessors),
-	  ("sources", Vector.layout SourceInfo.layout sources)]
+	  ("sources",
+	   Vector.layout (fn {nameIndex, successorsIndex} =>
+			  Layout.record [("nameIndex", Int.layout nameIndex),
+					 ("successorsIndex",
+					  Int.layout successorsIndex)])
+	   sources)]
 
       fun layouts (pi, output) = output (layout pi)
 
-      fun isOK (T {frameSources,
-		   labels,
-		   sourceSeqs,
-		   sourceSuccessors,
-		   sources}): bool =
+      fun isOK (T {frameSources, labels, names, sourceSeqs, sources}): bool =
 	 let
+	    val namesLength = Vector.length names
 	    val sourceSeqsLength = Vector.length sourceSeqs
 	    val sourcesLength = Vector.length sources
 	 in
 	    !Control.profile = Control.ProfileNone
 	    orelse
-	    (true
-	     andalso (Vector.forall
-		      (frameSources, fn i =>
-		       0 <= i andalso i < sourceSeqsLength))
+	    (Vector.forall (frameSources, fn i =>
+			    0 <= i andalso i < sourceSeqsLength)
 	     andalso (Vector.forall
 		      (labels, fn {sourceSeqsIndex = i, ...} =>
-		       0 <= i andalso i < sourceSeqsLength)))
+		       0 <= i andalso i < sourceSeqsLength))
 	     andalso (Vector.forall
 		      (sourceSeqs, fn v =>
 		       Vector.forall
 		       (v, fn i => 0 <= i andalso i < sourcesLength)))
-	     andalso (Vector.length sourceSuccessors = Vector.length sources)
 	     andalso (Vector.forall
-		      (sourceSuccessors, fn i =>
-		       0 <= i andalso i < sourceSeqsLength))
+		      (sources, fn {nameIndex, successorsIndex} =>
+		       0 <= nameIndex
+		       andalso nameIndex < namesLength
+		       andalso 0 <= successorsIndex
+		       andalso successorsIndex < sourceSeqsLength)))
 	 end
 
-       fun modify (T {frameSources, labels, sourceSeqs, sourceSuccessors, sources}) :
-	          {newProfileLabel: ProfileLabel.t -> ProfileLabel.t,
-		   delProfileLabel: ProfileLabel.t -> unit,
-		   getProfileInfo: unit -> t} =
+       fun modify (T {frameSources, labels, names, sourceSeqs, sources})
+	  : {newProfileLabel: ProfileLabel.t -> ProfileLabel.t,
+	     delProfileLabel: ProfileLabel.t -> unit,
+	     getProfileInfo: unit -> t} =
 	  let
 	     val {get: ProfileLabel.t -> int, set, ...} =
 	        Property.getSet
@@ -711,8 +719,8 @@
 		   val pi = T {frameSources = frameSources,
 			       labels = Vector.concat
 			                [labels, Vector.fromList (!new)],
+			       names = names,
 			       sourceSeqs = sourceSeqs,
-			       sourceSuccessors = sourceSuccessors,
 			       sources = sources}
 		in
 		  Assert.assert ("newProfileInfo", fn () => isOK pi);
@@ -738,13 +746,13 @@
 				label: Label.t},
 			 maxFrameSize: int,
 			 objectTypes: ObjectType.t vector,
-			 profileInfo: ProfileInfo.t,
+			 profileInfo: ProfileInfo.t option,
 			 reals: (Global.t * RealX.t) list,
 			 strings: (Global.t * string) list}
 
       fun clear (T {chunks, profileInfo, ...}) =
 	 (List.foreach (chunks, Chunk.clear)
-	  ; ProfileInfo.clear profileInfo)
+	  ; Option.app (profileInfo, ProfileInfo.clear))
 
       fun frameSize (T {frameLayouts, ...},
 		     FrameInfo.T {frameLayoutsIndex, ...}) =
@@ -771,8 +779,9 @@
 					     ("isC", Bool.layout isC),
 					     ("size", Int.layout size)])
 		      frameLayouts)])
-	    ; output (str "\nProfileInfo:")
-	    ; ProfileInfo.layouts (profileInfo, output)
+	    ; Option.app (profileInfo, fn pi =>
+			  (output (str "\nProfileInfo:")
+			   ; ProfileInfo.layouts (pi, output)))
 	    ; output (str "\nObjectTypes:")
 	    ; Vector.foreachi (objectTypes, fn (i, ty) =>
 			       output (seq [str "pt_", Int.layout i,
@@ -822,11 +831,8 @@
 
       fun typeCheck (program as
 		     T {chunks, frameLayouts, frameOffsets, intInfs, main,
-			maxFrameSize, objectTypes,
-			profileInfo as ProfileInfo.T {frameSources,
-						      labels = profileLabels,
-						      ...},
-			reals, strings, ...}) =
+			maxFrameSize, objectTypes, profileInfo, reals, strings,
+			...}) =
 	 let
 	    val _ =
 	       if !Control.profile = Control.ProfileTime
@@ -846,26 +852,44 @@
 		       else print (concat ["missing profile info: ",
 					   Label.toString label, "\n"])))
 	       else ()
-	    val _ =
-	       Err.check
-	       ("frameSources length",
-		fn () => (Vector.length frameSources
-			  = (if !Control.profile <> Control.ProfileNone
-				then Vector.length frameLayouts
-			     else 0)),
-		fn () => ProfileInfo.layout profileInfo)
-	    val {get = profileLabelCount, ...} =
-	       Property.get
-	       (ProfileLabel.plist, Property.initFun (fn _ => ref 0))
-	    val _ =
-	       Vector.foreach (profileLabels, fn {label, ...} =>
+	    val profileLabelIsOk =
+	       case profileInfo of
+		  NONE =>
+		     if !Control.profile = Control.ProfileNone
+			then fn _ => false
+		     else Error.bug "profileInfo = NONE"
+		| SOME (pi as ProfileInfo.T {frameSources,
+					     labels = profileLabels, ...}) =>
+		     if !Control.profile = Control.ProfileNone
+			orelse (Vector.length frameSources
+				<> Vector.length frameLayouts)
+			then Error.bug "profileInfo = SOME"
+		     else
+			let
+			   val {get = profileLabelCount, ...} =
+			      Property.get
+			      (ProfileLabel.plist,
+			       Property.initFun (fn _ => ref 0))
+			   val _ =
+			      Vector.foreach
+			      (profileLabels, fn {label, ...} =>
 			       let
 				  val r = profileLabelCount label
 			       in
-				  case !r of
-				     0 => r := 1
-				   | _ => Error.bug "duplicate profile label"
+				  if 0 = !r
+				     then r := 1
+				  else Error.bug "duplicate profile label"
 			       end)
+			in
+			   fn l =>
+			   let
+			      val r = profileLabelCount l
+			   in
+			      if 1 = !r 
+				 then (r := 2; true)
+			      else false
+			   end
+			end
 	    fun getFrameInfo (FrameInfo.T {frameLayoutsIndex, ...}) =
 	       Vector.sub (frameLayouts, frameLayoutsIndex)
 	    fun boolToUnitOpt b = if b then SOME () else NONE
@@ -1219,15 +1243,9 @@
 				 end
 			end
 		   | ProfileLabel l =>
-			if !Control.profile = Control.ProfileTime
-			   then let
-				   val r = profileLabelCount l
-				in
-				   case !r of
-				      1 => (r := 2; SOME alloc)
-				    | _ => NONE
-				end
-			else SOME alloc
+			if profileLabelIsOk l
+			   then SOME alloc
+			else NONE
 	       end
 	    fun liveIsOk (live: Operand.t vector,
 			  a: Alloc.t): bool =



1.38      +5 -3      mlton/mlton/backend/machine.sig

Index: machine.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/machine.sig,v
retrieving revision 1.37
retrieving revision 1.38
diff -u -r1.37 -r1.38
--- machine.sig	23 Jun 2003 04:58:56 -0000	1.37
+++ machine.sig	7 Jul 2003 22:50:28 -0000	1.38
@@ -223,13 +223,15 @@
 	             frameSources: int vector,
 		     labels: {label: ProfileLabel.t,
 			      sourceSeqsIndex: int} vector,
+		     names: string vector,
 		     (* Each sourceSeq describes a sequence of source functions,
 		      * each given as an index into the source vector.
 		      *)
 		     sourceSeqs: int vector vector,
-		     sourceSuccessors: int vector,
-		     sources: SourceInfo.t vector}
+		     sources: {nameIndex: int,
+			       successorsIndex: int} vector}
 
+	    val empty: t
 	    val modify: t -> {newProfileLabel: ProfileLabel.t -> ProfileLabel.t,
 			      delProfileLabel: ProfileLabel.t -> unit,
 			      getProfileInfo: unit -> t}
@@ -253,7 +255,7 @@
 			    label: Label.t},
 		     maxFrameSize: int,
 		     objectTypes: ObjectType.t vector,
-		     profileInfo: ProfileInfo.t,
+		     profileInfo: ProfileInfo.t option,
 		     reals: (Global.t * RealX.t) list,
 		     strings: (Global.t * string) list}
 



1.28      +62 -44    mlton/mlton/backend/profile.fun

Index: profile.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/profile.fun,v
retrieving revision 1.27
retrieving revision 1.28
diff -u -r1.27 -r1.28
--- profile.fun	23 Jun 2003 04:58:56 -0000	1.27
+++ profile.fun	7 Jul 2003 22:50:28 -0000	1.28
@@ -8,22 +8,22 @@
 
 structure InfoNode =
    struct
-      datatype t = T of {index: int,
-			 info: SourceInfo.t,
+      datatype t = T of {info: SourceInfo.t,
+			 nameIndex: int,
+			 sourcesIndex: int,
 			 successors: t list ref}
 
       local
 	 fun make f (T r) = f r
       in
-	 val index = make #index
 	 val info = make #info
+	 val sourcesIndex = make #sourcesIndex
       end
 
-      fun layout (T {index, info, ...}) =
-	 Layout.record [("index", Int.layout index),
-			("info", SourceInfo.layout info)]
+      fun layout (T {info, ...}) =
+	 Layout.record [("info", SourceInfo.layout info)]
 
-      fun equals (n: t, n': t): bool = index n = index n'
+      fun equals (n: t, n': t): bool = SourceInfo.equals (info n, info n')
 
       fun call {from = T {info = i, successors, ...},
 		to as T {info = i', ...}} =
@@ -77,18 +77,14 @@
       fun toSources (ps: t list): int list =
 	 List.fold (rev ps, [], fn (p, ac) =>
 		    case p of
-		       Enter (InfoNode.T {index, ...}) => index :: ac
+		       Enter (InfoNode.T {sourcesIndex, ...}) =>
+			  sourcesIndex :: ac
 		     | Skip _ => ac)
    end
 
 fun profile program =
    if !Control.profile = Control.ProfileNone
-      then {frameProfileIndices = Vector.new0 (),
-	    labels = Vector.new0 (),
-	    program = program,
-	    sourceSeqs = Vector.new0 (),
-	    sourceSuccessors = Vector.new0 (),
-	    sources = Vector.new0 ()}
+      then (program, fn _ => NONE)
    else
    let
       val Program.T {functions, main, objectTypes} = program
@@ -97,28 +93,30 @@
       val profileAlloc: bool = profile = Control.ProfileAlloc
       val profileStack: bool = !Control.profileStack
       val profileTime: bool = profile = Control.ProfileTime
-      val frameProfileIndices = ref []
+      val frameProfileIndices: (Label.t * int) list ref = ref []
       val infoNodes: InfoNode.t list ref = ref []
+      val nameCounter = Counter.new 0
+      val names: string list ref = ref []
       local
-	 val c = Counter.new 0
-	 fun new si =
+	 val sourceCounter = Counter.new 0
+	 val {get = nameIndex, ...} =
+	    Property.get (SourceInfo.plist,
+			  Property.initFun
+			  (fn si =>
+			   (List.push (names, SourceInfo.toString' (si, "\t"))
+			    ; Counter.next nameCounter)))
+      in	 
+	 fun sourceInfoNode (si: SourceInfo.t) =
 	    let
-	       val index = Counter.next c
-	       val infoNode = InfoNode.T {index = index,
-					  info = si,
-					  successors = ref []}
+	       val infoNode =
+		  InfoNode.T {info = si,
+			      nameIndex = nameIndex si,
+			      sourcesIndex = Counter.next sourceCounter,
+			      successors = ref []}
 	       val _ = List.push (infoNodes, infoNode)
 	    in
 	       infoNode
 	    end
-	 val {get = share, ...} =
-	    Property.get (SourceInfo.plist, Property.initFun new)
-	 val rc = Regexp.compileNFA (!Control.profileSplit)
-      in	 
-	 fun sourceInfoNode (si: SourceInfo.t) =
-	    if Regexp.Compiled.matchesAll (rc, SourceInfo.toString si)
-	       then new si
-	    else share si
       end
       fun firstEnter (ps: Push.t list): InfoNode.t option =
 	 List.peekMap (ps, fn p =>
@@ -173,9 +171,11 @@
 	 fun makeSourceSeqs () = Vector.fromListRev (!sourceSeqs)
       end
       (* Ensure that [SourceInfo.unknown] is index 0. *)
-      val unknownSourceSeq = sourceSeqIndex [InfoNode.index unknownInfoNode]
+      val unknownSourceSeq =
+	 sourceSeqIndex [InfoNode.sourcesIndex unknownInfoNode]
       (* Ensure that [SourceInfo.gc] is index 1. *)
-      val gcSourceSeq = sourceSeqIndex [InfoNode.index gcInfoNode]
+      val gcSourceSeq =
+	 sourceSeqIndex [InfoNode.sourcesIndex gcInfoNode]
       fun addFrameProfileIndex (label: Label.t,
 				index: int): unit =
 	 List.push (frameProfileIndices, (label, index))
@@ -348,7 +348,7 @@
 					    [] => Error.bug "missing Leave"
 					  | infoNode :: leaves =>
 					       (leaves,
-						InfoNode.index infoNode
+						InfoNode.sourcesIndex infoNode
 						:: sourceSeq))
 			    in
 			       (leaves, npl, sourceSeq, ss)
@@ -706,21 +706,39 @@
 			       main = doFunction main,
 			       objectTypes = objectTypes}
       val _ = addFuncEdges ()
-      val infoNodes = Vector.fromListRev (!infoNodes)
-      val sources = Vector.map (infoNodes, InfoNode.info)
-      val sourceSuccessors =
-	 Vector.map (infoNodes, fn InfoNode.T {successors, ...} =>
-		     sourceSeqIndex (List.revMap (!successors, InfoNode.index)))
-      (* This must happen after makeSources, since that creates new sourceSeqs.
+      val names = Vector.fromListRev (!names)
+      val sources =
+	 Vector.map
+	 (Vector.fromListRev (!infoNodes),
+	  fn InfoNode.T {nameIndex, successors, ...} =>
+	  {nameIndex = nameIndex, 
+	   successorsIndex = (sourceSeqIndex
+			      (List.revMap (!successors,
+					    InfoNode.sourcesIndex)))})
+      (* makeSourceSeqs () must happen after making sources, since that creates
+       * new sourceSeqs.
        *)
       val sourceSeqs = makeSourceSeqs ()
+      fun makeProfileInfo {frames} =
+	 let
+	    val {get, set, ...} =
+	       Property.getSetOnce
+	       (Label.plist,
+		Property.initRaise ("frameProfileIndex", Label.layout))
+	    val _ =
+	       List.foreach (!frameProfileIndices, fn (l, i) =>
+			     set (l, i))
+	    val frameSources = Vector.map (frames, get)
+	 in
+	    SOME (Machine.ProfileInfo.T
+		  {frameSources = frameSources,
+		   labels = Vector.fromList (!labels),
+		   names = names,
+		   sourceSeqs = sourceSeqs,
+		   sources = sources})
+	 end
    in
-      {frameProfileIndices = Vector.fromList (!frameProfileIndices),
-       labels = Vector.fromList (!labels),
-       program = program,
-       sourceSeqs = sourceSeqs,
-       sourceSuccessors = sourceSuccessors,
-       sources = sources}
+      (program, makeProfileInfo)
    end
 
 end



1.3       +5 -7      mlton/mlton/backend/profile.sig

Index: profile.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/backend/profile.sig,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- profile.sig	4 Jan 2003 02:00:36 -0000	1.2
+++ profile.sig	7 Jul 2003 22:50:28 -0000	1.3
@@ -3,7 +3,9 @@
    
 signature PROFILE_STRUCTS = 
    sig
+      structure Machine: MACHINE
       structure Rssa: RSSA
+      sharing Machine.ProfileLabel = Rssa.ProfileLabel
    end
 
 signature PROFILE = 
@@ -11,11 +13,7 @@
       include PROFILE_STRUCTS
       
       val profile:
-	 Rssa.Program.t -> {frameProfileIndices: (Rssa.Label.t * int) vector,
-			    labels: {label: Rssa.ProfileLabel.t,
-				     sourceSeqsIndex: int} vector,
-			    program: Rssa.Program.t,
-			    sourceSeqs: int vector vector,
-			    sourceSuccessors: int vector,
-			    sources: Rssa.SourceInfo.t vector}
+	 Rssa.Program.t
+	 -> Rssa.Program.t * ({frames: Rssa.Label.t vector}
+			      -> Machine.ProfileInfo.t option)
    end



1.60      +29 -26    mlton/mlton/codegen/c-codegen/c-codegen.fun

Index: c-codegen.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/codegen/c-codegen/c-codegen.fun,v
retrieving revision 1.59
retrieving revision 1.60
diff -u -r1.59 -r1.60
--- c-codegen.fun	5 Jul 2003 23:30:26 -0000	1.59
+++ c-codegen.fun	7 Jul 2003 22:50:29 -0000	1.60
@@ -360,33 +360,36 @@
 	 end
       fun declareProfileInfo () =
 	 let
-	    val ProfileInfo.T {frameSources, labels, sourceSeqs,
-			       sourceSuccessors, sources, ...} =
-	       profileInfo
+	    fun doit (ProfileInfo.T {frameSources, labels, names, sourceSeqs,
+				     sources}) =
+	       (Vector.foreach (labels, fn {label, ...} =>
+				declareProfileLabel (label, print))
+		; (Vector.foreachi
+		   (sourceSeqs, fn (i, v) =>
+		    (print (concat ["static int sourceSeq",
+				    Int.toString i,
+				    "[] = {"])
+		     ; print (C.int (Vector.length v))
+		     ; Vector.foreach (v, fn i =>
+				       (print (concat [",", C.int i])))
+		     ; print "};\n")))
+		; declareArray ("uint", "*sourceSeqs", sourceSeqs, fn (i, _) =>
+				concat ["sourceSeq", Int.toString i])
+		; declareArray ("uint", "frameSources", frameSources, C.int o #2)
+		; (declareArray
+		   ("struct GC_sourceLabel", "sourceLabels", labels,
+		    fn (_, {label, sourceSeqsIndex}) =>
+		    concat ["{(pointer)", ProfileLabel.toString label, ", ",
+			    C.int sourceSeqsIndex, "}"]))
+		; declareArray ("string", "sourceNames", names, C.string o #2)
+		; declareArray ("struct GC_source", "sources", sources,
+				fn (_, {nameIndex, successorsIndex}) =>
+				concat ["{ ", Int.toString nameIndex, ", ",
+					Int.toString successorsIndex, " }"]))
 	 in
-	    Vector.foreach (labels, fn {label, ...} =>
-			    declareProfileLabel (label, print))
-	    ; declareArray ("struct GC_sourceLabel", "sourceLabels", labels,
-			    fn (_, {label, sourceSeqsIndex}) =>
-			    concat ["{(pointer)", ProfileLabel.toString label,
-				    ", ", C.int sourceSeqsIndex, "}"])
-	    ; declareArray ("string", "sources", sources,
-			    fn (_, si) =>
-			    C.string (SourceInfo.toString' (si, "\t")))
-	    ; Vector.foreachi (sourceSeqs, fn (i, v) =>
-			       (print (concat ["static int sourceSeq",
-					       Int.toString i,
-					       "[] = {"])
-				; print (C.int (Vector.length v))
-				; Vector.foreach (v, fn i =>
-						  (print (concat [",", C.int i])))
-				; print "};\n"))
-				      
-	    ; declareArray ("uint", "*sourceSeqs", sourceSeqs, fn (i, _) =>
-			    concat ["sourceSeq", Int.toString i])
-	    ; declareArray ("uint", "frameSources", frameSources, C.int o #2)
-	    ; declareArray ("uint", "sourceSuccessors", sourceSuccessors,
-			    C.int o #2)
+	    case profileInfo of
+	       NONE => doit ProfileInfo.empty
+	     | SOME z => doit z
 	 end
    in
       outputIncludes (includes, print)



1.43      +5 -1      mlton/mlton/codegen/x86-codegen/x86-codegen.fun

Index: x86-codegen.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/codegen/x86-codegen/x86-codegen.fun,v
retrieving revision 1.42
retrieving revision 1.43
diff -u -r1.42 -r1.43
--- x86-codegen.fun	25 Jun 2003 03:09:58 -0000	1.42
+++ x86-codegen.fun	7 Jul 2003 22:50:29 -0000	1.43
@@ -100,6 +100,10 @@
 	val makeS = outputS
 
 	val Machine.Program.T {profileInfo, ...} = program
+	val profileInfo =
+	   case profileInfo of
+	      NONE => Machine.ProfileInfo.empty
+	    | SOME pi => pi
 	val {newProfileLabel, delProfileLabel, getProfileInfo} = 
 	  Machine.ProfileInfo.modify profileInfo
 
@@ -130,7 +134,7 @@
 		   main = main, 
 		   maxFrameSize = maxFrameSize, 
 		   objectTypes = objectTypes, 
-		   profileInfo = getProfileInfo (),
+		   profileInfo = SOME (getProfileInfo ()),
 		   reals = reals, 
 		   strings = strings} 
 	      end



1.77      +0 -2      mlton/mlton/control/control.sig

Index: control.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/control/control.sig,v
retrieving revision 1.76
retrieving revision 1.77
diff -u -r1.76 -r1.77
--- control.sig	5 Jul 2003 23:30:26 -0000	1.76
+++ control.sig	7 Jul 2003 22:50:29 -0000	1.77
@@ -203,8 +203,6 @@
       datatype profileIL = ProfileSSA | ProfileSource
       val profileIL: profileIL ref
 
-      val profileSplit: Regexp.t ref
-
       val profileStack: bool ref
 
       (* Array bounds checking. *)



1.93      +0 -4      mlton/mlton/control/control.sml

Index: control.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/control/control.sml,v
retrieving revision 1.92
retrieving revision 1.93
diff -u -r1.92 -r1.93
--- control.sml	5 Jul 2003 23:30:26 -0000	1.92
+++ control.sml	7 Jul 2003 22:50:29 -0000	1.93
@@ -412,10 +412,6 @@
 			 default = ProfileSource,
 			 toString = ProfileIL.toString}
 
-val profileSplit = control {name = "profile split",
-			    default = Regexp.none,
-			    toString = Regexp.toString}
-
 val profileStack = control {name = "profile stack",
 			    default = false,
 			    toString = Bool.toString}



1.142     +0 -6      mlton/mlton/main/main.sml

Index: main.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/main/main.sml,v
retrieving revision 1.141
retrieving revision 1.142
diff -u -r1.141 -r1.142
--- main.sml	7 Jul 2003 15:23:44 -0000	1.141
+++ main.sml	7 Jul 2003 22:50:29 -0000	1.142
@@ -308,12 +308,6 @@
 	 case s of
 	    "source" => profileIL := ProfileSource
 	  | _ => usage (concat ["invalid -profile-il arg: ", s]))),
-       (Normal, "profile-split", " <regexp>", "split duplicates of functions",
-	SpaceString
-	(fn s =>
-	 case Regexp.fromString s of
-	    NONE => usage (concat ["invalid -profile-split regexp: ", s])
-	  | SOME (r, _) => profileSplit := Regexp.or [r, !profileSplit])),
        (Normal, "profile-stack", " {false|true}", "profile the stack",
 	boolRef profileStack),
        (Normal, "safe", " {true|false}", "bounds checking and other checks",



1.146     +135 -79   mlton/runtime/gc.c

Index: gc.c
===================================================================
RCS file: /cvsroot/mlton/mlton/runtime/gc.c,v
retrieving revision 1.145
retrieving revision 1.146
diff -u -r1.145 -r1.146
--- gc.c	3 Jul 2003 00:42:33 -0000	1.145
+++ gc.c	7 Jul 2003 22:50:29 -0000	1.146
@@ -3306,15 +3306,35 @@
 		fprintf (stderr, "done walking stack\n");
 }
 
-static inline void removeFromStack (GC_state s, GC_profile p, uint i) {
+static inline string sourceName (GC_state s, uint i) {
+	if (i < s->sourcesSize)
+		return s->sourceNames[s->sources[i].nameIndex];
+	else
+		return s->sourceNames[i - s->sourcesSize];
+}
+
+static inline GC_profileStack profileStackInfo (GC_state s, uint i) {
+	return &(s->profile->stack[i]);
+}
+
+static inline uint profileMaster (GC_state s, uint i) {
+	return s->sources[i].nameIndex + s->sourcesSize;
+}
+
+static inline void removeFromStack (GC_state s, uint i) {
+	GC_profile p;
+	GC_profileStack ps;
 	ullong totalInc;
 
-	totalInc = p->total - p->lastTotal[i];
+	p = s->profile;
+	ps = profileStackInfo (s, i);
+	totalInc = p->total - ps->lastTotal;
 	if (DEBUG_PROFILE)
-		fprintf (stderr, "removing %s from stack  totalInc = %llu\n",
-				s->sources[i], totalInc);
-	p->countStack[i] += totalInc;
-	p->countStackGC[i] += p->totalGC - p->lastTotalGC[i];
+		fprintf (stderr, "removing %s from stack  ticksInc = %llu  ticksInGCInc = %llu\n",
+				sourceName (s, i), totalInc,
+				p->totalGC - ps->lastTotalGC);
+	ps->ticks += totalInc;
+	ps->ticksInGC += p->totalGC - ps->lastTotalGC;
 }
 
 static void setProfTimer (long usec) {
@@ -3339,20 +3359,33 @@
 		setProfTimer (0);
 	s->profilingIsOn = FALSE;
 	p = s->profile;
-	p->countTop[SOURCES_INDEX_GC] = p->totalGC;
 	if (s->profileStack) {
-		for (sourceIndex = 0; sourceIndex < s->sourcesSize;
+		for (sourceIndex = 0; 
+			sourceIndex < s->sourcesSize + s->sourceNamesSize;
 			++sourceIndex) {
-			if (p->stackCount[sourceIndex] > 0) {
+			if (p->stack[sourceIndex].numOccurrences > 0) {
 				if (DEBUG_PROFILE)
 					fprintf (stderr, "done leaving %s\n", 
-							s->sources[sourceIndex]);
-				removeFromStack (s, p, sourceIndex);
+							sourceName (s, sourceIndex));
+				removeFromStack (s, sourceIndex);
 			}
 		}
 	}
 }
 
+static inline void profileEnterSource (GC_state s, uint i) {
+	GC_profile p;
+	GC_profileStack ps;
+
+	p = s->profile;
+	ps = profileStackInfo (s, i);
+	if (0 == ps->numOccurrences) {
+		ps->lastTotal = p->total;
+		ps->lastTotalGC = p->totalGC;
+	}
+	ps->numOccurrences++;
+}
+
 static void profileEnter (GC_state s, uint sourceSeqIndex) {
 	int i;
 	GC_profile p;
@@ -3369,12 +3402,9 @@
 		sourceIndex = sourceSeq[i];
 		if (DEBUG_PROFILE)
 			fprintf (stderr, "entering %s\n", 
-					s->sources[sourceIndex]);
-		if (0 == p->stackCount[sourceIndex]) {
-			p->lastTotal[sourceIndex] = p->total;
-			p->lastTotalGC[sourceIndex] = p->totalGC;
-		}
-		p->stackCount[sourceIndex]++;
+					sourceName (s, sourceIndex));
+		profileEnterSource (s, sourceIndex);
+		profileEnterSource (s, profileMaster (s, sourceIndex));
 	}
 }
 
@@ -3382,6 +3412,20 @@
 	profileEnter (s, s->frameSources[i]);
 }
 
+static inline void profileLeaveSource (GC_state s, uint i) {
+	GC_profile p;
+	GC_profileStack ps;
+
+	if (DEBUG_PROFILE)
+		fprintf (stderr, "profileLeaveSource (%u)\n", i);
+	p = s->profile;
+	ps = profileStackInfo (s, i);
+	assert (ps->numOccurrences > 0);
+	ps->numOccurrences--;
+	if (0 == ps->numOccurrences)
+		removeFromStack (s, i);
+}
+
 static void profileLeave (GC_state s, uint sourceSeqIndex) {
 	int i;
 	GC_profile p;
@@ -3398,11 +3442,9 @@
 		sourceIndex = sourceSeq[i];
 		if (DEBUG_PROFILE)
 			fprintf (stderr, "leaving %s\n",
-					s->sources[sourceIndex]);
-		assert (p->stackCount[sourceIndex] > 0);
-		p->stackCount[sourceIndex]--;
-		if (0 == p->stackCount[sourceIndex])
-			removeFromStack (s, p, sourceIndex);
+					sourceName (s, sourceIndex));
+		profileLeaveSource (s, sourceIndex);
+		profileLeaveSource (s, profileMaster (s, sourceIndex));
 	}
 }
 
@@ -3410,7 +3452,6 @@
 	uint *sourceSeq;
 	uint topSourceIndex;
 
-	assert (not s->amInGC);
 	if (DEBUG_PROFILE)
 		fprintf (stderr, "profileInc (%u, %u)\n", 
 				(uint)amount, sourceSeqIndex);
@@ -3421,8 +3462,9 @@
 		: SOURCES_INDEX_UNKNOWN;
 	if (DEBUG_PROFILE)
 		fprintf (stderr, "bumping %s by %u\n",
-				s->sources[topSourceIndex], (uint)amount);
+				sourceName (s, topSourceIndex), (uint)amount);
 	s->profile->countTop[topSourceIndex] += amount;
+	s->profile->countTop[profileMaster (s, topSourceIndex)] += amount;
 	if (s->profileStack)
 		profileEnter (s, sourceSeqIndex);
 	if (SOURCES_INDEX_GC == topSourceIndex)
@@ -3442,22 +3484,19 @@
 }
 
 void GC_profileInc (GC_state s, W32 amount) {
-	assert (not s->amInGC);
 	if (DEBUG_PROFILE)
 		fprintf (stderr, "GC_profileInc (%u)\n", (uint)amount);
-	profileInc (s, amount, topFrameSourceSeqIndex (s));
+	profileInc (s, amount, 
+			 s->amInGC
+				? SOURCE_SEQ_GC 
+				: topFrameSourceSeqIndex (s));
 }
 
 void GC_profileAllocInc (GC_state s, W32 amount) {
 	if (DEBUG_PROFILE)
 		fprintf (stderr, "GC_profileAllocInc (%u)\n", (uint)amount);
-	if (s->profilingIsOn and (PROFILE_ALLOC == s->profileKind)) {
-		if (s->amInGC) {
-			s->profile->totalGC += amount;
-			return;
-		}
+	if (s->profilingIsOn and (PROFILE_ALLOC == s->profileKind))
 		GC_profileInc (s, amount);
-	}
 }
 
 static void showProf (GC_state s) {
@@ -3465,54 +3504,59 @@
 	int j;
 
 	fprintf (stdout, "0x%08x\n", s->magic);
+	fprintf (stdout, "%u\n", s->sourceNamesSize);
+	for (i = 0; i < s->sourceNamesSize; ++i)
+		fprintf (stdout, "%s\n", s->sourceNames[i]);
 	fprintf (stdout, "%u\n", s->sourcesSize);
 	for (i = 0; i < s->sourcesSize; ++i)
-		fprintf (stdout, "%s\n", s->sources[i]);
-	for (i = 0; i < s->sourcesSize; ++i) {
+		fprintf (stdout, "%u %u\n", 
+				s->sources[i].nameIndex,
+				s->sources[i].successorsIndex);
+	fprintf (stdout, "%u\n", s->sourceSeqsSize);
+	for (i = 0; i < s->sourceSeqsSize; ++i) {
 		uint *sourceSeq;
 
-		sourceSeq = s->sourceSeqs[s->sourceSuccessors[i]];
+		sourceSeq = s->sourceSeqs[i];
 		for (j = 1; j <= sourceSeq[0]; ++j)
 			fprintf (stdout, "%u ", sourceSeq[j]);
 		fprintf (stdout, "\n");
 	}
 }
 
-void GC_profileFree (GC_state s, GC_profile p) {
-	free (p->countTop);
-	if (s->profileStack) {
-		free (p->countStack);
-		free (p->countStackGC);
-		free (p->lastTotal);
-		free (p->lastTotalGC);
-		free (p->stackCount);
-	}
-	free (p);
-}
-
 GC_profile GC_profileNew (GC_state s) {
 	GC_profile p;
+	uint size;
 
 	NEW (p);
 	p->total = 0;
 	p->totalGC = 0;
-	ARRAY (p->countTop, s->sourcesSize);
-	if (s->profileStack) {
-		ARRAY (p->countStack, s->sourcesSize);
-		ARRAY (p->countStackGC, s->sourcesSize);
-		ARRAY (p->lastTotal, s->sourcesSize);
-		ARRAY (p->lastTotalGC, s->sourcesSize);
-		ARRAY (p->stackCount, s->sourcesSize);
-	}
+	size = s->sourcesSize + s->sourceNamesSize;
+	ARRAY (p->countTop, size);
+	if (s->profileStack)
+		ARRAY (p->stack, size);
 	if (DEBUG_PROFILE)
 		fprintf (stderr, "0x%08x = GC_profileNew ()\n", (uint)p);
 	return p;
 }
 
+void GC_profileFree (GC_state s, GC_profile p) {
+	free (p->countTop);
+	if (s->profileStack)
+		free (p->stack);
+	free (p);
+}
+
 static void writeString (int fd, string s) {
 	swrite (fd, s, strlen(s));
 }
 
+static void writeUint (int fd, uint u) {
+	char buf[20];
+
+	sprintf (buf, "%u", u);
+	writeString (fd, buf);
+}
+
 static void writeUllong (int fd, ullong u) {
 	char buf[20];
 
@@ -3531,6 +3575,20 @@
 	writeString (fd, "\n");
 }
 
+static void profileWriteCount (GC_state s, GC_profile p, int fd, uint i) {
+	writeUllong (fd, p->countTop[i]);
+	if (s->profileStack) {
+		GC_profileStack ps;
+	
+		ps = &(p->stack[i]);
+		writeString (fd, " ");
+		writeUllong (fd, ps->ticks);
+		writeString (fd, " ");
+		writeUllong (fd, ps->ticksInGC);
+	}
+	newline (fd);
+}
+
 void GC_profileWrite (GC_state s, GC_profile p, int fd) {
 	int i;
 
@@ -3547,16 +3605,14 @@
 	writeString (fd, " ");
 	writeUllong (fd, p->totalGC);
 	newline (fd);
-	for (i = 0; i < s->sourcesSize; ++i) {
-		writeUllong (fd, p->countTop[i]);
-		if (s->profileStack) {
-			writeString (fd, " ");
-			writeUllong (fd, p->countStack[i]);
-			writeString (fd, " ");
-			writeUllong (fd, p->countStackGC[i]);
-		}
-		newline (fd);
-	}
+	writeUint (fd, s->sourcesSize);
+	newline (fd);
+	for (i = 0; i < s->sourcesSize; ++i)
+		profileWriteCount (s, p, fd, i);
+	writeUint (fd, s->sourceNamesSize);
+	newline (fd);
+	for (i = 0; i < s->sourceNamesSize; ++i)
+		profileWriteCount (s, p, fd, i + s->sourcesSize);
 }
 
 #if (defined (__linux__) || defined (__FreeBSD__) || defined (__sun__))
@@ -3588,20 +3644,20 @@
 #endif
 	if (DEBUG_PROFILE)
 		fprintf (stderr, "catcher  pc = 0x%08x\n", (uint)pc);
-	if (s->amInGC) {
-		s->profile->totalGC++;
-		return;
-	}
-	frameIndex = topFrameIndex (s);
-	if (s->frameLayouts[frameIndex].isC) {
-		sourceSeqIndex = s->frameSources[frameIndex];
-	} else {
-		if (s->textStart <= pc and pc < s->textEnd) {
-			sourceSeqIndex = s->textSources [pc - s->textStart];
-		} else {
-			if (DEBUG_PROFILE)
-				fprintf (stderr, "pc out of bounds\n");
-		       	sourceSeqIndex = SOURCE_SEQ_UNKNOWN;
+	if (s->amInGC)
+		sourceSeqIndex = SOURCE_SEQ_GC;
+	else {
+		frameIndex = topFrameIndex (s);
+		if (s->frameLayouts[frameIndex].isC)
+			sourceSeqIndex = s->frameSources[frameIndex];
+		else {
+			if (s->textStart <= pc and pc < s->textEnd)
+				sourceSeqIndex = s->textSources [pc - s->textStart];
+			else {
+				if (DEBUG_PROFILE)
+					fprintf (stderr, "pc out of bounds\n");
+		       		sourceSeqIndex = SOURCE_SEQ_UNKNOWN;
+			}
 		}
 	}
 	profileInc (s, 1, sourceSeqIndex);



1.64      +42 -35    mlton/runtime/gc.h

Index: gc.h
===================================================================
RCS file: /cvsroot/mlton/mlton/runtime/gc.h,v
retrieving revision 1.63
retrieving revision 1.64
diff -u -r1.63 -r1.64
--- gc.h	14 May 2003 02:50:12 -0000	1.63
+++ gc.h	7 Jul 2003 22:50:29 -0000	1.64
@@ -223,48 +223,54 @@
 	PROFILE_TIME,
 } ProfileKind;
 
+typedef struct GC_source {
+	uint nameIndex;
+	uint successorsIndex;
+} *GC_source;
+
 typedef struct GC_sourceLabel {
 	pointer label;
 	uint sourceSeqsIndex;
 } *GC_profileLabel;
 
+/* If profileStack, then there is one struct GC_profileStackInfo for each
+ * function.
+ */
+typedef struct GC_profileStack {
+	/* ticks counts ticks while the function was on the stack. */
+	ullong ticks;
+	/* ticksInGC counts ticks in GC while the function was on the stack. */
+	ullong ticksInGC; 
+        /* lastTotal is the value of total when the oldest occurrence of f on the
+         * stack was pushed, i.e., the most recent time that numTimesOnStack
+         * changed from 0 to 1.  lastTotal is used to compute the amount to
+         * attribute to f when the oldest occurrence is finally popped.
+         */
+	ullong lastTotal;
+	/* lastTotalGC is like lastTotal, but for GC ticks. */
+	ullong lastTotalGC;
+	/* numOccurrences is the number of times this function is on the stack.
+         */
+	uint numOccurrences;
+} *GC_profileStack;
+
 /* GC_profile is used for both time and allocation profiling.
  * In the comments below, "ticks" mean clock ticks with time profiling and
  * bytes allocated with allocation profiling.
+ *
+ * All of the arrays in GC_profile are of length sourcesSize + sourceNamesSize.
+ * The first soruceSizes entries are for handling the duplicate copies of 
+ * functions, and the next sourceNamesSize entries are for the master versions.
  */
 typedef struct GC_profile {
-	/* countTop is an array of length sourcesSize that counts for each 
-	 * function the number of ticks that occurred while the function was on
-	 * top of the stack.
+	/* countTop is an array that counts for each function the number of ticks
+         * that occurred while the function was on top of the stack.
 	 */
 	ullong *countTop;
-	/* countStack is an array of length sourcesSize that counts for each
-	 * function the ticks while the function was anywhere on the stack 
-	 * (only once, no matter how many times on the stack).  countStack is 
+	/* stack is an array that gives stack info for each function.  It is
 	 * only used if profileStack.
-	 */
-	ullong *countStack;
-	/* countStackGC is an array of length sourcesSize that counts for each
-	 * function the ticks in GC while the function was anywhere on the stack
-	 * (only once, no matter how many times on the stack).  countStackGC is
-	 * only used if profileStack.
-	 */
-	ullong *countStackGC;
-	/* lastTotal is an array of length sourcesSize that for each function, 
-	 * f, stores the value of total when the oldest occurrence of f on the
-         * stack was pushed, i.e., the most recent time that stackCount[f] was 
-	 * changed from 0 to 1.  lastTotal is used to compute the amount to
-	 * attribute to f when the oldest occurrence is finally popped.
-	 * lastTotal is only used if profileStack.
-	 */
-	ullong *lastTotal;
-	/* lastTotalGC is like lastTotal, but for totalGC. */
-	ullong *lastTotalGC;
-	/* stackCount is an array of length sourcesSize that counts the number 
-	 * of times each function is on the stack.  It is only used if 
-	 * profileStack.
-	 */
- 	uint *stackCount;
+         */
+	struct GC_profileStack *stack;
 	/* The total number of mutator ticks. */
 	ullong total;
 	/* The total number of GC ticks. */
@@ -444,18 +450,19 @@
 	struct GC_sourceLabel *sourceLabels;
 	uint sourceLabelsSize;
 	/* sources is an array of strings identifying source positions. */
-	string *sources;
-	uint sourcesSize;
+	string *sourceNames;
+	uint sourceNamesSize;
 	/* Each entry in sourceSeqs is a vector, whose first element is
          * a length, and subsequent elements index into sources.
 	 */
 	uint **sourceSeqs;
 	uint sourceSeqsSize;
-	/* sourceSuccessors is an array of length sourcesSize.  Each entry is an
-	 * index into sourceSeqs that specifies the call-stack successors to this
-	 * source.
+	/* sources is an array of length sourcesSize.  Each entry specifies 
+         * an index into sourceNames and an index into sourceSeqs, giving the
+	 * name of the function and the successors, respectively.
 	 */
-	uint *sourceSuccessors;
+	struct GC_source *sources;
+	uint sourcesSize;
 	pointer stackBottom; /* The bottom of the stack in the current thread. */
  	uint startTime; /* The time when GC_init or GC_loadWorld was called. */
 	struct GC_stringInit *stringInits;



1.10      +3 -1      mlton/runtime/basis/MLton/profile.c

Index: profile.c
===================================================================
RCS file: /cvsroot/mlton/mlton/runtime/basis/MLton/profile.c,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -r1.9 -r1.10
--- profile.c	8 Jan 2003 15:19:17 -0000	1.9
+++ profile.c	7 Jul 2003 22:50:29 -0000	1.10
@@ -19,6 +19,8 @@
 }
 
 void MLton_Profile_Data_write (Pointer p, Word fd) {
+	if (DEBUG_PROFILE)
+		fprintf (stderr, "MLton_Profile_Data_write (0x%08x)\n", (uint)p);
 	GC_profileWrite (&gcState, (GC_profile)p, (int)fd);
 }
 
@@ -29,7 +31,7 @@
 	s = &gcState;
 	res = (Pointer)s->profile;
 	if (DEBUG_PROFILE)
-		fprintf (stderr, "0x%0x8 = MLton_Profile_current ()\n", 
+		fprintf (stderr, "0x%08x = MLton_Profile_current ()\n", 
 				(uint)res);
 	return res;
 }





-------------------------------------------------------
This SF.Net email sponsored by: Free pre-built ASP.NET sites including
Data Reports, E-commerce, Portals, and Forums are available now.
Download today and enter to win an XBOX or Visual Studio .NET.
http://aspnet.click-url.com/go/psa00100006ave/direct;at.asp_061203_01/01
_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel