[MLton] Re: [MLton-user] How to write performant network code

Matthew Fluet fluet at tti-c.org
Thu Jan 15 20:43:12 PST 2009


(moved from mlton-user)

On Wed, 14 Jan 2009, Wesley W. Terpstra wrote:
> On Mon, Jan 12, 2009 at 5:13 AM, Matthew Fluet <fluet at tti-c.org> wrote:
>> Does memcpy (or memmove, since the *Array{,Slice}.copy functions needs to
>> work with potentially overlapping regions) do anything more than a
>> word-by-word copy?
>
> Yes. memcpy is usually hand-crafted and extremely fast assembler. It
> uses SSE and other tricks. Is it safe to also modify Word8Array.vector
> to use memcpy?

You would want to modify the implementation of Word8Array.vector to create 
an uninitialized array, memcpy into the new array, and then cast from 
array to vector.  So, yes, that would be safe.

> What about polymorphic Array.vector?

That gets a bit trickier.  You want to be careful about using memcpy on 
polymorphic arrays.  The issue is that it constrains the src and dst 
arrays to be (permanently) of the same type.  For example, if you have two 
"(int * bool) array"s and copy from the first into the second, but the 
second never uses the bool component, then under the element-by-element 
copy, MLton could drop the bool component of the second array and 
compensate during the element-by-element copy by only writing the int 
component.  But, if you require a memcpy, then the src and dst need to be 
of exactly the same type.

This applies as well to the Word8Array case, but it seems less likely that 
you copy from a Word8Array.array to a Word8Array.array and never use the 
destination Word8Array.array.  On the other hand, with a polymorphic array 
instantiated with an abstract type, there seem to be many more 
opportunities for pruning unused components.  So, I would limit it to 
Word<N>Array{,Slice} for now.

Another difficulty with polymorphic arrays is that it isn't until late in 
compilation that you know the size of the array elements.  The memcpy 
needs that information to know how many bytes to copy.
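
To illustrate why the element size matters, here is a minimal C sketch. The 
names copy_elems and struct int_bool are hypothetical and not part of MLton's 
runtime; the struct merely stands in for whatever layout the compiler 
eventually picks for an "(int * bool)" element.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the final layout of an (int * bool) element. */
struct int_bool {
    int32_t i;
    bool b;
};

/* Hypothetical helper: copy `count` elements of `elem_size` bytes each.
   The byte count cannot be computed until elem_size is known, and src and
   dst must share the same element layout for a raw byte copy to be valid. */
void copy_elems(void *dst, const void *src, size_t count, size_t elem_size)
{
    /* memmove rather than memcpy, since the regions may overlap. */
    memmove(dst, src, count * elem_size);
}
```

If the destination's bool component had been pruned, its elements would be 
smaller than the source's and no single elem_size would fit both arrays.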

BTW, since we don't support interior pointers, the copy needs to have 
types like:

   Word8Array_copy : (Word8.t array (* src *)
                      * SeqIndex.t (* src offset *)
                      * Word8.t array (* dst *)
                      * SeqIndex.t (* dst offset *)
                      * SeqIndex.t (* count *)) -> unit

   Word8Vector_copy : (Word8.t vector (* src *)
                       * SeqIndex.t (* src offset *)
                       * Word8.t array (* dst *)
                       * SeqIndex.t (* dst offset *)
                       * SeqIndex.t (* count *)) -> unit
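
A rough C-side sketch of what a function behind such a type might look like. 
The name word8_array_copy is hypothetical, and bounds checking is assumed to 
have been done by the caller on the SML side. Each pointer refers to the 
start of an array, and the offsets are applied only inside the function, so 
no interior pointer is ever passed in.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counterpart of Word8Array_copy: src and dst point at the
   first element of each array; offsets and count are in elements, which
   for Word8 is the same as bytes. Bounds are assumed already checked. */
void word8_array_copy(const uint8_t *src, size_t src_off,
                      uint8_t *dst, size_t dst_off,
                      size_t count)
{
    /* memmove, not memcpy: src and dst may be the same array with
       overlapping ranges, as *Array{,Slice}.copy permits. */
    memmove(dst + dst_off, src + src_off, count);
}
```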

It might be worth adding these as primitives, though it isn't clear that 
we can optimize much with regard to them.  If, for instance, the 
destination array is never read from, then we could drop the copy.  But, 
that seems unlikely to arise in realistic code.

>> A while ago, I added a primitive (structural) polymorphic hash:
>>  http://mlton.org/cgi-bin/viewsvn.cgi?view=rev&rev=6352
>> It would seem to suit your purposes: you can use it to hash any value,
>> including datatypes.
>
> This is very nice and I didn't know about it. Unfortunately, it's not
> enough because I need a universal hash function (one that takes a
> 'seed' with the value to hash).

For a given program, MLton.hash is a function (that is, it always returns 
the same hash value for structurally equivalent inputs).  So, why can't 
you take the result of MLton.hash and munge it with your 'seed'?  Or, 
better, you can always use (fn x => MLton.hash (seed, x)), so that you 
hash your seed along with the structure of interest.

> Also, one still needs to be able to
> serialize a network address out to the network in some situations. (eg
> to say: send reply message to X, not me)

Fair enough.  Though, in that situation, isn't it better to go through the 
Basis Library functions?  A struct sockaddr written raw to the network 
might not be readable as-is on another arch/OS unless you guarantee that 
the sizes, alignment, and padding are all the same.
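
As a sketch of the alternative to blasting the raw struct, the address can be 
written field by field in a fixed byte order. The 6-byte wire format and the 
names addr_encode and addr_decode below are assumptions made for 
illustration (IPv4 only); they are not from the Basis Library or from MLton.

```c
#include <stdint.h>

/* Assumed wire format: 4 bytes of IPv4 address, then 2 bytes of port,
   both big-endian. Independent of host endianness, alignment, padding. */
enum { ADDR_WIRE_SIZE = 6 };

/* Hypothetical encoder: ip and port are given as host-order integers. */
void addr_encode(uint8_t out[ADDR_WIRE_SIZE], uint32_t ip, uint16_t port)
{
    out[0] = (uint8_t)(ip >> 24);
    out[1] = (uint8_t)(ip >> 16);
    out[2] = (uint8_t)(ip >> 8);
    out[3] = (uint8_t)ip;
    out[4] = (uint8_t)(port >> 8);
    out[5] = (uint8_t)port;
}

/* Hypothetical decoder: rebuilds host-order integers from the wire bytes. */
void addr_decode(const uint8_t in[ADDR_WIRE_SIZE], uint32_t *ip, uint16_t *port)
{
    *ip = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
        | ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    *port = (uint16_t)(((uint16_t)in[4] << 8) | in[5]);
}
```

Because every byte's position is spelled out, two machines agree on the 
encoding regardless of how either one lays out struct sockaddr in memory.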


