[MLton] Unicode / WideChar

skaller skaller@users.sourceforge.net
Tue, 22 Nov 2005 09:11:49 +1100


On Mon, 2005-11-21 at 15:43 -0500, Adam Goode wrote:
> On Mon, 2005-11-21 at 13:06 +0100, Wesley W. Terpstra wrote:
> > ...                                             Also, the
> > String2 is UCS2, not UTF-16. 

Because it is impossible to do operations like 'substring'
efficiently on UTF-8 or UTF-16 -- or any other variable-length
encoding. Those encodings require parsing from the start of the
string to count characters; they're designed for streams.
[In C++ terminology, String provides random access but
UTF-n only provides a forward iterator.]
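
To make the difference concrete, here is a rough sketch in Python
(not MLton code, and not anything from the String2 implementation).
The function names are made up for illustration, and the UCS-2 side
assumes every character is in the BMP, i.e. exactly two bytes each.
The UTF-8 version has to walk the lead bytes to find where a
character starts; the fixed-width version just multiplies.

```python
def utf8_char_offset(data: bytes, index: int) -> int:
    """Byte offset of the index-th character, found by scanning forward.

    Hypothetical helper: cost grows with index, since each lead byte
    must be inspected to learn how long that character is.
    """
    offset = 0
    for _ in range(index):
        if offset >= len(data):
            raise IndexError("character index out of range")
        lead = data[offset]
        if lead < 0x80:              # 0xxxxxxx: 1-byte sequence
            offset += 1
        elif lead >> 5 == 0b110:     # 110xxxxx: 2-byte sequence
            offset += 2
        elif lead >> 4 == 0b1110:    # 1110xxxx: 3-byte sequence
            offset += 3
        elif lead >> 3 == 0b11110:   # 11110xxx: 4-byte sequence
            offset += 4
        else:
            raise ValueError("invalid UTF-8 lead byte")
    return offset


def utf8_substring(data: bytes, start: int, length: int) -> bytes:
    """Substring by character index on UTF-8: two forward scans."""
    begin = utf8_char_offset(data, start)
    end = begin + utf8_char_offset(data[begin:], length)
    return data[begin:end]


def ucs2_substring(data: bytes, start: int, length: int) -> bytes:
    """Substring by character index on fixed-width 2-byte units.

    Assumes BMP-only text, so offsets are plain arithmetic.
    """
    return data[2 * start : 2 * (start + length)]


text = "na\u00efve caf\u00e9"           # all characters are in the BMP
utf8 = text.encode("utf-8")
ucs2 = text.encode("utf-16-le")         # same bytes as UCS-2 for BMP text

utf8_result = utf8_substring(utf8, 2, 3).decode("utf-8")
ucs2_result = ucs2_substring(ucs2, 2, 3).decode("utf-16-le")
```

Both calls pick out the same three characters, but only the
fixed-width one gets there without looking at the data in between.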

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net