[MLton] Bug in IntInf implementation

Sun Dec 9 09:57:34 PST 2007

On Sat, 8 Dec 2007, Wesley W. Terpstra wrote:
> On Dec 7, 2007, at 10:30 PM, Matthew Fluet wrote:
>> initIntInfRes has every reason to believe that the current heap can contain 
>> the result of the operation.  (Indeed, there are numerous 'assert's to that 
>> effect in the code.)  All of the IntInf primitives (both internally as 
>> manipulated by the compiler and externally as implemented in 
>> runtime/gc/int-inf.c) take a size_t argument which, at runtime, will be the 
>> maximum size of the resulting IntInf.  The limit check insertion pass 
>> integrates these arguments into the limit checks to ensure that there is 
>> always sufficient space in the heap to contain the result.
>
> My mistake! I was looking through the runtime and didn't see any check to 
> grow the heap. So, intInfBinary in ssa-to-rssa.fun introduces a C function 
> call (complete with the size parameter).

Right, and the C functions are annoted with 'bytesNeeded = SOME 2' to 
indicate that the 3rd argument is a word value corresponding to the 
(dynamically calculated) number of heap bytes needed by the C function.

> The predicted size comes from the 
> basis:
>>          val bigAdd = make (I.+!, Prim.+, S.max, 1)
>>          val bigSub = make (I.-!, Prim.-, S.max, 1)
>>          val bigMul = make (I.*!, Prim.*, S.+, 0)
> This is all way earlier than I expected.

It follows our general methodology of putting as much logic into the SML 
side of things as possible.

> I'm unable to find where the third argument to the IntInf primitives is used 
> in a limit check. I trust that it happens, though.

The limit check insertion pass runs after the translation from SSA to 
RSSA, so the IntInf primitives have been rewritten to C function calls. 
The 'bytesAllocated' function (limit-check.fun:115) determines whether a C 
function has a 'bytesNeeded' annotation, and folds the amount into the 
bytes allocated computation for a basic block.

> It still might not be a bad idea to hook the gmp memory management methods to 
> an assert failure, in case the predicted size is wrong.

I've got a little patch that wraps the default gmp memory management 
methods with debugging 'sprintf(stderr,...)'s.  Unfortunately(?), even 
when the result destination already has enough space, the GMP routines 
might allocate and free temporary values (particularly with 
IntInf_toString), so it isn't feasible to replace the GMP memory 
management routines with an assert failure.  It might be possible to 
assert if a free or realloc function is called on a pointer into the ML 
heap -- that would be cause for an assertion failure.