[MLton] Re: [Sml-basis-discuss] Unicode and WideChar support

Matthew Fluet fluet@cs.cornell.edu
Wed, 30 Nov 2005 09:24:46 -0500 (EST)


>>> I think that this proposal is too heavy weight for its usefulness.
>> 
>> I agree that it's pretty heavy-weight.
>> However, at least in MLton creating the structures isn't a big deal.
>> 
>>> The Basis design assumes that there is an implementation of the TEXT
>>> signature for each char/string/substring type, so you'll have all the
>>> arrays, vectors, slices, etc. for each type.
>> 
>> What if we just said that only Char and WideChar had the structures at
>> the toplevel? All the others only provide Ucs2Text, AsciiText, ... That has
>> very little namespace pollution, yet provides everything desired. From my
>> experience with MLton's Char, most/all of these structures can be cookie-
>> cutter stamped out of a functor, so it's not much trouble to implement.
>
> Having a character type without a corresponding string/substring type seems
> weird.  Once you have string/substring, then you effectively have the vector
> and slice structures too, so why not add arrays and array slices to get the
> complete set?  My main concern is that you end up with a lot of modules that
> most users won't use or understand.

I think Wesley's proposal was not to forgo having string/substring 
structures for distinguished character sets, but rather to provide only a 
TEXT module for them.  That is, an implementation would provide (I'm 
eliding the 'with type' constraints):

structure Char : CHAR
structure String : STRING
structure Substring : SUBSTRING
structure CharVector : MONO_VECTOR
structure CharArray : MONO_ARRAY
structure CharVectorSlice : MONO_VECTOR_SLICE
structure CharArraySlice : MONO_ARRAY_SLICE
structure Text : TEXT

structure WideChar : CHAR
structure WideString : STRING
structure WideSubstring : SUBSTRING
structure WideCharVector : MONO_VECTOR
structure WideCharArray : MONO_ARRAY
structure WideCharVectorSlice : MONO_VECTOR_SLICE
structure WideCharArraySlice : MONO_ARRAY_SLICE
structure WideText : TEXT

structure AsciiText : TEXT
structure Ucs2Text : TEXT
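
As a rough sketch of why a lone TEXT module is enough: the TEXT signature 
bundles the char, string, substring, vector, and array structures as 
substructures, so everything stays reachable by a longer path.  The example 
below uses the existing Text structure, since AsciiText and Ucs2Text are 
only proposed names here; the assumption is that they would be used the 
same way (e.g. AsciiText.Char, Ucs2Text.String).

```sml
(* Sketch only: reach the per-character-set modules through one TEXT
   structure.  Text is the Basis structure; the proposed AsciiText and
   Ucs2Text are assumed to be usable in the same way. *)
structure T = Text

(* Char, String, CharVector, ... are substructures of any TEXT structure. *)
val c : T.Char.char = T.Char.chr 65
val s : T.String.string = T.String.implode [c, T.Char.chr 66]

(* TEXT's sharing constraints identify String.string with CharVector.vector. *)
val v : T.CharVector.vector = s
val n : int = T.CharVector.length v
```

So a user who wants the pieces at shorter names can bind them locally 
(structure AsciiChar = AsciiText.Char, say) without the implementation 
putting them at the toplevel.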