[MLton-devel] nucleic benchmark times

Matthew Fluet fluet@CS.Cornell.EDU
Fri, 8 Nov 2002 08:04:40 -0500 (EST)


> Aren't there any fixed memory locations used as double `registers'?  There
> certainly used to be in the C generated code and I would think that it would
> still be used in the native back end.

Yes, there are the floating-point pseudo-registers.  These are the (R)SSA
variables that are not live across a non-tail call or a runtime call.
Since they are not live across SSA calls, the set of pseudo-registers is
global and used by all SSA functions.  (I believe that in the C codegen,
each chunk/C-function allocated its own pseudo-registers.  As we mentioned
in the previous posts, gcc is less likely to do the inter-procedural
analysis that ensures that a global set of pseudo-registers doesn't
interfere with uses in each SSA function; so, it makes sense to allocate
them on a per-function level where intra-procedural analysis will
(hopefully) infer that the pseudo-registers can really be used as
temporaries.)
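
To make the global vs. per-function distinction concrete, here is a rough
C sketch.  It is not actual MLton output: the names RD_global,
sum_squares_global and sum_squares_local are made up for illustration.
The first function keeps its temporaries in a pseudo-register array with
external linkage; the second declares them as locals, which is roughly the
per-function allocation described above.

```c
#include <stddef.h>

/* Hypothetical pseudo-register file with external linkage.  Other
 * translation units may read or write it, and a pointer argument may
 * point into it. */
double RD_global[16];

/* Temporaries live in the global pseudo-registers.  Without knowing
 * that xs never points into RD_global, a compiler may have to perform
 * each store and reload from memory. */
double sum_squares_global(const double *xs, size_t n) {
    RD_global[0] = 0.0;
    for (size_t i = 0; i < n; i++) {
        RD_global[1] = xs[i] * xs[i];
        RD_global[0] = RD_global[0] + RD_global[1];
    }
    return RD_global[0];
}

/* Temporaries are locals whose address is never taken, so purely
 * intra-procedural analysis is enough to keep them in machine
 * registers. */
double sum_squares_local(const double *xs, size_t n) {
    double rd0 = 0.0;
    for (size_t i = 0; i < n; i++) {
        double rd1 = xs[i] * xs[i];
        rd0 = rd0 + rd1;
    }
    return rd0;
}
```

Both functions compute the same result; the only difference is how much
the compiler has to prove before it can treat the pseudo-registers as
plain temporaries.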

The place where alignment may differ between the C and native backends has
to do with spills.  In theory, the allocation behaviour of both backends
is the same (modulo the initial heap location), so neither is doing
alignment of allocated float values.  On the other hand, the native
backend always spills a value to its "home" location -- that is, to its
final destination either on the heap, on the stack, or in a
pseudo-register.  Of these locations, only the pseudo-regs are
guaranteed to be double-word aligned.  gcc will probably spill to the
C-stack, where it may be able to ensure double-word alignment of floating
point values.  This may not arise that often in practice, but imagine the
following:

SD(7) = ...
RD(3) = ... SD(7) ...
RD(4) = ... RD(3) ...
... (no updates to any location that may alias (by gcc's analysis) SD(7)) ...
RD(12) = ... RD(11) ...
RD(13) = ... SD(7) ...

If gcc needs to spill between RD(4) and RD(12), then it may choose to
spill and fetch the SD(7) value from the C-stack spill location, which may
be double word aligned, rather than refetching from SD(7), which may not
be double word aligned.  I have no idea if gcc would make this sort of
analysis or whether this makes a difference in final running times, but it's
a place where the register allocation/memory usage of the two codegens may
differ.
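
As a small illustration of what "may not be double word aligned" means
for a slot like SD(7), here is a C sketch.  Everything in it is an
assumption for the sake of the example: the ml_stack buffer, the 4-byte
slot offset, and the helper names are hypothetical and are not taken from
the MLton runtime.  It only shows how a double stored at a word-aligned
but not double-word-aligned offset can be detected and accessed portably.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if p is aligned to 8 bytes (one double word), else 0.
 * Relies on the usual flat mapping of pointers to uintptr_t. */
int is_double_aligned(const void *p) {
    return ((uintptr_t)p % 8u) == 0;
}

/* Hypothetical ML stack: the base is 8-byte aligned, but slots are
 * addressed by byte offset, so a slot at offset 4 is only word aligned. */
static _Alignas(8) unsigned char ml_stack[64];

void *stack_slot(size_t byte_offset) {
    return ml_stack + byte_offset;
}

/* memcpy-based accessors work whether or not the slot is aligned. */
void store_double(void *slot, double v) {
    memcpy(slot, &v, sizeof v);
}

double load_double(const void *slot) {
    double v;
    memcpy(&v, slot, sizeof v);
    return v;
}
```

With these definitions, stack_slot(0) is double word aligned while
stack_slot(4) is not, yet a value stored to either slot reads back
unchanged.  Whether the misaligned case costs anything measurable is
exactly the open question above.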




_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel