[MLton] Patch ready + change list for basis library

Wesley W. Terpstra wesley at terpstra.ca
Fri Feb 9 16:48:57 PST 2007


Attached is a patch implementing Wide* for MLton. I've not yet  
written WideTextIO / WidePrimIO.

I made/assumed these changes to the basis definition:
1. CHAR.ord says "returns the (non-negative) integer code point of  
the character c in Unicode."
2. CHAR now says "The Char structure provides characters taken from  
the ISO-8859-1 repertoire and locale-independent operations on them"
3. In CHAR delete the sentence "In WideChar, the functions toLower,  
toLower, isAlpha,..., isUpper and, in general, the definition of a  
``letter'' are locale-dependent."
4. Overview: added WideTextIO :> TEXT_IO -- it was missing despite  
the TEXT_IO signature requiring it

I think WideTextIO is a bit pointless. I can't imagine that someone
really wants to write 4-byte characters out in host-specific endian
order. Whatever. For completeness, I'll implement it after this patch
is committed.

In preparing this patch I've discovered two other bugs, both  
demonstrated by this program:
> val y : WideString.string = WideString.str (WideChar.chr 88)
> val x : WideString.string = WideString.^ ("\u5322\u1243", y)
> val s = WideString.toString x
> val () = print (s ^ "\n")
>
> val bug : WideChar.char vector = y
> val bug : Char.char vector = "asfasf"

The first bug is that on a powerpc, the output reads:
> \U22530000\U43120000X
... so the \u and \U parsing in MLton gets the byte order backwards on
this big-endian host. I'm not sure where this code lives.
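
For illustration, the garbled escapes are exactly what a 32-bit byte
swap of the intended code points would give. A minimal sketch, assuming
the Word32 structure is available (as it is in MLton); bswap32 is a
hypothetical helper for this message, not code from the patch:

```sml
(* Hypothetical helper: reverse the four bytes of a 32-bit word. *)
fun bswap32 (w : Word32.word) : Word32.word =
   let
      (* Extract the byte that starts at bit position n. *)
      fun byte n = Word32.andb (Word32.>> (w, n), 0wxFF)
   in
      Word32.orb
         (Word32.orb (Word32.<< (byte 0w0, 0w24), Word32.<< (byte 0w8, 0w16)),
          Word32.orb (Word32.<< (byte 0w16, 0w8), byte 0w24))
   end

(* Word32.toString prints upper-case hex without a prefix. *)
val a = Word32.toString (bswap32 0wx5322)   (* "22530000" *)
val b = Word32.toString (bswap32 0wx1243)   (* "43120000" *)
```

The X comes out intact, presumably because it is built by WideChar.chr
at run time rather than parsed from an escape.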

The other bug is that MLton is leaking polymorphism of string types:
the two "val bug" bindings at the end of the program are accepted.
This bug is not specific to my changes; svn MLton does this too.

I'm still running the regressions. No problems so far, but my ppc is
slow.

Any improvements to the patch?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mlton-unicode-v1.patch
Type: application/octet-stream
Size: 51335 bytes
Desc: not available
Url : http://mlton.org/pipermail/mlton/attachments/20070210/8659a8a1/mlton-unicode-v1-0001.obj

