profBug

Matthew Fluet mfluet@intertrust.com
Sat, 2 Jun 2001 18:04:35 -0700 (PDT)


> Sorry for the confusion, but I don't think that this explains any thing.  You
> are right that I can't use the full mlton -p compilation because the mlton  I
> have  then  runs  gcc with the -p flag, which is wrong.  What I usually do is
> compile with -p, then re-link it without the -p and manually  adding  prof.o.
> I must have forgotten to do this in the version I packed up.

I agree that this won't explain the GC_foreach* discrepencies, but you
have an old enough version of mlton that is confusing mlprof.  Looking at
go.0.S, you have the following:

.local MLtonProfile48$$0.main_0$$1.L_616$$1.L_86$$Begin
MLtonProfile48$$0.main_0$$1.L_616$$1.L_86$$Begin:
.p2align 2
.global L_616
.long 3
L_616:

Only real issue is that the MLtonProfile label is separated from the
"real" label by that .long 3 (which is just a frameLayout index).  That
means that the two labels have different addresses, and mlprof isn't
combining them into one and attributing time in L_616 to the main_0 cps
function.  This means that you're going to get confusing output from
mlprof -- in terms of what cps functions are being assigned time.

> I just put up a new version of the whole thing at
>     http://sourcelight.com/Z/profBug.tgz
> In the latest version, if you run
>     mlprof go mlmon.out
> you will see that it claims 2.51% for GC_foreachPointerInObject and 1.73% for
> GC_foreachPointerInRange.  If you then do `nm -na go', you will see thatthere
> are no static symbols adjacent to these two, so I claim that
>     mlprof -s go mlmon.out
> should  attribute exactly the same percentage to these two.  Despite this, if
> you run
>     mlprof -s go mlmon.out
> then neither of these routines appear at all.
>
> Also,  looking  at  the   individual   bins,   I   cound   129   counts   for
> GC_foreachPointerInObject  and  90  for GC_foreachPointerInRange all out of a
> total of 5217 counts.  Thus I would say
>     GC_foreachPointerInObject       2.47%
>     GC_foreachPointerInRange        1.73%
> so even without -s the count  for  GC_foreachPoitnerInObject  looks  slightly
> wrong.

Excuse the cut-and-paste look, but here's what mlprof is recording:

 804D524 [{GC_foreachGlobal (C)[]}],

 804D598 [{GC_foreachPointerInObject (C)[]}],

 804D598 2,
 804D599 6,
 804D59F 3,
 804D5A1 3,
 804D5A4 8,
 804D5A7 2,
 804D5A9 1,
 804D5AB 5,
 804D5B5 3,
 804D5B7 2,
 804D5C0 6,
 804D5C2 8,
 804D6E0 13,
 804D6E3 33,
 804D6E5 1,
 804D6E6 5,
 804D6EA 2,
 804D6EF 10,
 804D6F7 6,
 804D72D 6,
 804D733 2,
 804D734 2,
 804D736 2,

 131 ticks

 804D738 [{GC_foreachPointerInRange (C)[]}],

 804D750 2,
 804D752 59,
 804D75E 3,
 804D766 2,
 804D767 2,
 804D768 8,
 804D769 3,
 804D76E 6,
 804D773 5,

 90 ticks

 804D780 [{GC_initCounters (C)[]}],

So, I get 131 ticks for GC_foreachPointerInObject and 90 ticks for
GC_foreachPointerInRange, which works out to 2.51% and 1.73% respectively.

Now, this is based on the fact that I'm putting the 2 ticks in the bin at
804D598 with GC_foreachPointerInObject, which I think is the appropriate
interpretation.