[MLton] latest MLton segfault in gmp

Wesley W. Terpstra wesley at terpstra.ca
Wed Oct 14 15:30:33 PDT 2009


On Wed, Oct 14, 2009 at 11:56 PM, Matthew Fluet <matthew.fluet at gmail.com> wrote:

> I'm hardly an expert.  I used the www.x86-64.org document to implement
> the C calling convention in the native codegen, but didn't peruse it
> much otherwise.
>

Nice link, thanks.


> Searching for "align" in the document, though, reveals that on page
> 12, it declares that  {,signed,unsigned} {,long} long  all have 8-byte
> alignment.


Ok, that table is pretty clear. The ABI defines that Word64s must be 8-byte
aligned. Therefore gcc was within its rights to assume that the pointer was
8-byte aligned and the bug was ours.
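
To make the guarantee concrete, here is a minimal C sketch of the check that
gcc is entitled to assume always passes. The helper names (is_aligned8,
check_word64_ptr) are hypothetical and are not part of MLton or gmp.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when p is a multiple of 8 bytes. */
static int is_aligned8(const void *p)
{
    return ((uintptr_t)p % 8u) == 0;
}

/* Hypothetical debug guard to run before handing a Word64 pointer to C code. */
static void check_word64_ptr(const uint64_t *p)
{
    assert(is_aligned8(p));
}
```

A runtime could call such a guard at the FFI boundary in debug builds, so
that a misaligned pointer fails there rather than deep inside a library.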


>  However, on the next page it states:


>  Like the Intel386 architecture, the AMD64 architecture in general
>  does not require all data accesses to be properly aligned. Misaligned
>  data accesses are slower than aligned accesses but otherwise behave
>  identically. The only exceptions are that __m128 and __m256 must
>  always be aligned properly.
>

This is not a contradiction. Architecture != ABI. The machine can do it, but
the ABI forbids it.

> So, it isn't clear to me that one really needs to 8-align 64-bit integers.
>

If we want to link with any other application code (libc, libgmp, ffi, ...)
then it's 100% clear we need to do 8-byte alignment. We have just been
lucky that no other software actually made use of the 8-byte alignment
guarantee until now (since the architecture itself has few limitations
that would trip over an ABI violation).
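
One common way for a runtime to satisfy the guarantee is to round addresses
up to the next multiple of 8 before handing them out. The following is only
a sketch under that assumption; align_up8 is a hypothetical name and this is
not MLton's actual allocator code.

```c
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of 8.
 * Adding 7 and clearing the low three bits leaves aligned values unchanged. */
static uintptr_t align_up8(uintptr_t addr)
{
    return (addr + 7u) & ~(uintptr_t)7u;
}
```

For example, align_up8(13) yields 16, while align_up8(8) stays at 8.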


> In the next subsection (p. 13), on aggregates and unions, it states:
>
>  An array uses the same alignment as its elements, except that a
>  local or global array variable of length at least 16 bytes or a C99
>  variable-length array variable always has alignment of at least
>  16 bytes.
>

I think by global/local arrays they mean arrays not in the heap but the data
segment. (local = static int64_t foo[4];, global = extern int64_t foo[4];)

At any rate, this sounds like we don't need to worry because MLton only
passes arrays as pointers (both FFI and GMP limb structure).

> I did want to also point out that there is a legacy issue, I would
> assume, on Debian.  Since mlton-20070826 is dynamically linked against
> libgmp, isn't it just an incredible luck of the draw that a
> self-compile with mlton-20070826 didn't happen to produce a
> non-16-byte aligned IntInf array?
>

Yes, I was surprised too. However there are a couple of reasons this worked
out. First, the only code gcc managed to vectorize in the gmp C is the
MPN_ZERO method. Second, the only place MPN_ZERO gets called (for us) is to
clear the low bits of a left-shifted intinf. Third, it won't use 16-byte
writes unless there are 16 bytes to write, so it had to be a >=128-bit left
shift. I wonder whether such shifts simply never happened in 20070826?
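
The arithmetic behind the third point can be sketched as follows, assuming
64-bit limbs and that a left shift by n bits leaves n/64 whole low limbs to
clear. zero_limbs is a stand-in loop, not gmp's actual MPN_ZERO, and the
other names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t; /* assumption: 64-bit limbs, as on x86-64 */

/* Stand-in for MPN_ZERO: a plain loop that a compiler may vectorize. */
static void zero_limbs(limb_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = 0;
}

/* Number of whole low limbs that a left shift by `bits` leaves as zero. */
static size_t low_zero_limbs(size_t bits)
{
    return bits / (8 * sizeof(limb_t));
}

/* Nonzero when the zeroed region is large enough for one 16-byte store. */
static int reaches_16_byte_store(size_t bits)
{
    return low_zero_limbs(bits) * sizeof(limb_t) >= 16;
}
```

Under these assumptions a 127-bit shift clears only one limb (8 bytes), while
a 128-bit shift clears two limbs and is the first case where a 16-byte store
could be used.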

I imagine that as gcc gets smarter, vectorizing more code, this will become
a more serious legacy issue.