[MLton-user] Tracking down allocations

Matthew Fluet fluet at tti-c.org
Wed Jul 25 15:52:44 PDT 2007


> Thanks for taking a look. My simple fix is to use the manually
> lambda-lifted version in practice, although naturally I'd be happier
> if in the future there was no price for the prettier nested function
> version.

Absolutely, it would be much better for the nested function version to 
yield the same allocation behavior as the lambda lifted version.  One 
thing that isn't immediately clear to me is why the closures yielded by 
the curried lambda lifted version are more amenable to optimization than 
the closures yielded by the nested function version.  That's a question 
independent of whether there is a better closure representation.

> On 7/24/07, Matthew Fluet <fluet at tti-c.org> wrote:
>>  On Fri, 20 Jul 2007, Joe Hurd wrote:
>> >  Sorry for the delay in replying: I was at a conference at the
>> >  beginning of the week.
>> > 
>> > >   I'd still be interested in seeing the source program, to see if there 
>> > >   is a
>> > >   clear reason why the tuple is being eliminated in one version but not 
>> > >   the
>> > >   other.
>> > 
>> >  I've uploaded the two versions of the program to the TemporaryUpload
>> >  page of the MLton website, together with the results of profiling
>> >  allocations. There are two functions that produce different results
>> >  when manually lambda lifted: they can be found in the source by
>> >  searching for the string MLTON DEBUGGING.
>>
>>  I took a look at the nested-functions.sml program.  In addition to the
>>  allocation Joe noted in the nested-functions.ssa file:
>>
>>       Enter MonteCarlo.placeStone.<branch> src/MonteCarlo.sml: 773
>>       tuple_322 = (x_5086, x_6058, x_24097)
>>
>>  there are a bunch more which get charged to the
>>  MonteCarlo.placeStone.<branch> source position:
>>
>>       tuple_348 = (x_6058, x_25012, x_24112, x_6060)
>>       tuple_347 = (x_25012, x_6060, x_6058)
>>       tuple_346 = (x_6058, x_6060, tuple_347)
>>       tuple_345 = (tuple_346, x_6060, x_6058)
>>       tuple_344 = (x_5085, tuple_348)
>>       tuple_343 = (x_6058, x_5085, tuple_322, tuple_348)
>>       tuple_342 = (x_6058, x_5085, tuple_322, tuple_348, x_6060)
>>       tuple_340 = (tuple_347, x_5085, x_6058)
>>       tuple_339 = (tuple_347, tuple_322, x_5085, x_6058)
>>       tuple_338 = (x_6058, x_5085, tuple_322, tuple_347)
>>       tuple_337 = (x_6058, x_5085, tuple_346)
>>       tuple_336 = (x_6058, x_5085, tuple_322, tuple_346)
>>       tuple_335 = (tuple_345, x_5085, x_6058)
>>       tuple_334 = (tuple_343, tuple_342)
>>       tuple_333 = (tuple_342, tuple_342, tuple_343)
>>       tuple_332 = (tuple_338, tuple_339)
>>       tuple_330 = (tuple_340, tuple_337)
>>       tuple_329 = (tuple_336, tuple_339)
>>       tuple_327 = (tuple_335, tuple_337, tuple_340)
>>
>>  These are all of the environments for the nested functions.  I believe
>>  that you are being hurt by MLton's flat-closure representation.  In
>>  particular, all of those nested functions, which make various calls to one
>>  another, share a very common environment -- the one that you manually
>>  created by lambda lifting.  You can observe this in the above by noting
>>  that many of those tuples share common values (the x_* variables); indeed,
>>  expanding tuple_345, it looks like:
>>
>>     tuple_345 = ((x_6058, x_6060, (x_25012, x_6060, x_6058)), x_6060,
>>     x_6058)
>>
>>  Note that this duplicates x_6058 and x_6060 numerous times.
>>
>>  In the lambda-lifted.sml case, the sharing seems to be exposed enough for
>>  MLton to eliminate it.  What's interesting is that you lambda-lifted in a
>>  curried way.  While MLton's closure conversion would introduce tuples for
>>  the environments of the implicit functions in the lambda-lifted ones, it
>>  seems that the packing and unpacking of the environments exposes things
>>  enough for MLton to eliminate it.
>>
>>  There is an old idea of Stephen's for an improved closure representation,
>>  whose e-mail thread was duplicated a few years ago:
>>     http://mlton.org/pipermail/mlton/2003-October/024570.html
>>  I think it is trying to pick up exactly this case, where some sharing of
>>  closures would be a win.
>>
>>  There is also the Shao/Appel closure representation analysis:
>>     http://mlton.org/References#ShaoAppel94
>>
>>  In any case, I'm afraid that there isn't an immediate fix (though
>>  Stephen's improved closure representation shouldn't be hard to implement).
>>  I like the example, though, because I observe a 1.10X improvement in moves
>>  per second using the lambda-lifted version, suggesting that there is some
>>  real improvement to be had from an improved closure representation.
>> 
>
> _______________________________________________
> MLton-user mailing list
> MLton-user at mlton.org
> http://mlton.org/mailman/listinfo/mlton-user
>



More information about the MLton-user mailing list