[MLton-user] Re: RFC: Extended Basis Library

Mon Feb 26 18:32:24 PST 2007

Vesa Karvonen wrote:
>
> Yes, I've thought about the same thing.  I think that it would be best to
> name functions by the properties that they are supposed to have (the
> properties that the user depends on) rather than by the algorithms they
> might employ (which one might want to change later as faster algorithms
> are implemented).  
    It is certainly a reasonable approach, I just wondered whether some 
people might object on the grounds that using asymptotic complexity as a 
primary specification gives a lot of leeway.  So if there eventually 
were different implementations of the extended basis, they might observe 
significantly different performance characteristics with different 
implementations.

    However, I expect that in such cases they might best just implement 
the best for their application themselves.

> This is like it is in the C++ STL, for example.  So,
> one could use a name like
>
>   sortByCmp
>
> for the generally fastest comparison based sorting function and
>
>   stableSortByCmp
>
> for the sorting function that is as fast as possible while being stable
> (O(n log n) and preserves relative order of equal elements).
>   

    Given these two specifications, which algorithms do you think you 
would select?

> Just wondering, do you use flattenOpt often?  Would it be difficult to
> rewrite the code to use mapPartial (or something else) so that options
> would not be added to lists?  (I mean, if flattenOpt is particularly
> convenient in some cases, then maybe we should just add it.)
>   

    No, it only came up in two or three places, and changing the code to 
use mapPartial (or rather my monadic version of it) seems to be clearer 
anyway.

> The main reason I assumed arbitrary precision is that the Word types
> already work like unsigned integers with modular arithmetic.
>   

    Hmm.  I hadn't really thought about using words, but that may be 
practical enough for my purposes.  Though I imagine someone still might 
like a natural number abstraction.

> I recall one article with those keywords:
>
>   http://mlton.org/References#WangMurphy .
>   

    Ah, I saw the Wizard of TILT material while I was at CMU.  While not 
quite as good as language support, the recursion scheme solution looks 
eminently practical.  Has anyone tried using something like the Fold 
library (http://mlton.org/Fold) to provide nth projections and 
injections with a single function? 

>
> Good.  As it happens, I was just working on writing concept signatures for
> Monads (and Functors) as I received your e-mail.  See the draft signatures
> with notes below.
>
>   
    [...]
> Contravariant functors are actually quite common.  I don't particularly
> like the name CFUNC, but I would prefer a short name.
>   
    These seem reasonable, but I haven't really thought much abstracting 
anything to be a functor yet.
> The Monad core signature:
>
>   signature MONAD_CORE = sig
>      type 'a monad
>      val return : 'a -> 'a monad
>      val >>= : 'a monad * ('a -> 'b monad) -> 'b monad
>   end
>   
    This is essentially the same as what I had come up with for what I 
had called BASIC_MONAD. Though I had chosen to go with the signature

    signature BASIC_MONAD = sig
       type 'a t
       val return : 'a -> 'a t
       val >>= : 'a t * ('a -> 'b t) -> 'b t
    end

As using »t« as the type seems to be convention used most other places, 
but with »where type« constraints I don't know that it really matters 
that much.

> The point here is that there is a functor
>
>   functor MkMonad (Arg : MONAD_CORE) : MONAD
>
> that makes a MONAD (with extra derived utility functions) given a
> MONAD_CORE. 
    This is again essentially the same design I have been using.

>  The MONAD signature is further split into two:
>
>   
    [...]
> The MONAD_EX signature, which one usually does not refer to explicitly,
> specifies the derived Monad operations.  The MONAD signature is just the
> conjunction of MONAD_CORE and MONAD_EX.  
    This isn't quite how I thought to split things (I was using a more 
strict hierarchy), but it is probably better given how you want to share 
between MONAD and MONADP.

> The reason for having MONAD_EX is
> to avoid duplicating specifications when specifying MONADP (MonadPlus):
>
>   signature MONADP_CORE = sig
>      include MONAD_CORE
>      val zero : 'a monad
>      val plus : 'a monad BinOp.t
>   end
>
> I'm not sure if the names zero and plus are the best.
>   

    I called them »zero« and ++. 

>   signature MONADP_EX = sig
>      type 'a monadp_ex
>      val sum : 'a monadp_ex List.t -> 'a monadp_ex
>   end
>
>   signature MONADP = sig
>      include MONADP_CORE
>      include MONAD_EX where type 'a monad_ex = 'a monad
>      include MONADP_EX where type 'a monadp_ex = 'a monad
>   end
>
> Without the EX signatures, one would have to duplicate specifications in
> MONADP.  There is also a functor
>
>   functor MkMonadP (Arg : MONADP_CORE) : MONADP
>
> for making a MONADP out of a MONADP_CORE.
>
> The set of derived operations in MONAD_EX and MONADP_EX is not complete -
> I plan to add new derived operations as I see use for them.
>   
    It might be worth either adding some additional operations to 
MONADP_CORE, or perhaps make things even more elaborate.  I have three 
other MONADP operations that cannot (usually) be defined in terms of 
MONADP_CORE. 

    val fail : 'a monad
    val once : 'a monad -> 'a monad
    val ifte : 'a monad -> ('a -> 'b monad) -> 'b monad -> 'b monad

The latter two are described in Kiselyov, Shan, Friedman, and Sabry 
(http://www.cs.rutgers.edu/~ccshan/logicprog/LogicT-icfp2005.pdf) and 
provide control over backtracking.  »fail« aborts the computation 
without any possibility for backtracking.

    Another variation I use is to actually give »zero« and »fail« arguments

       type error
       val zero : error -> 'a monad
       val fail : error -> 'a monad

and then a »where type« constraint is used to specify the nature of 
»error« values.  I'm not sure how to fit this in well.

> In addition to the above, I also have versions of the above signatures
> with an extra type parameter:
>   
    [...]
> Having two sets of signatures is a nuisance, but the extra type parameter
> is needed often enough.
>   
    It depends upon your application I expect, but while I used to use 
an extra parameter for my state monad signature, I've since switched to 
something that would look like

    signature STATE_MONAD = sig
       include MONAD_CORE
       include MONAD_EX where type 'a monad_ex = 'a monad
       type state
       val getState : state monad
       val setState : state -> unit monad
       val run : state -> 'a monad -> 'a monad
    end

> The tradeoffs have to do with the degree of laziness, possibility of space
> leaks, ease of programming and efficiency.  Even lists are lazier, but
> also likely to be slightly slower and heavier to program with than odd
> lists.  Wadler's article
>
>   http://homepages.inf.ed.ac.uk/wadler/papers/lazyinstrict/lazyinstrict.ps
>
> and SRFI-45
>
>   http://srfi.schemers.org/srfi-45/srfi-45.html
>
> are relevant here.  Memoization also has a cost and sometimes you might
> want to avoid it.
>   
    I've printed these out to read over.  Thanks.

>>     It may also be worth asking whether making lazy lists part of the 
>> Extended Basis is reasonable.  Offhand I'm not sure if there was a 
>> rationale behind not including sets and maps as part of the Standard
>> Basis, and whether the same rationale applies to the Extended Basis.
>>     
>
> Well, I think that there are too few SML libraries.  I think that lazy
> lists are something that people use often enough that it makes sense to
> provide a reasonable implementation in the Extended Basis library.
>   
    Sounds reasonable.  Given that it is probably worth considering some 
additional container signatures.  Sets and maps at the very least.

> If you don't mind the overhead of sending patches and waiting for them to
> be applied, then you don't need commit access.  Personally, I feel that it
> adds a bit of overhead and then there is the danger that small, but
> useful, additions never get suggested.
>   

    Stephen gave me commit access, but I will run patches by the list 
before making significant changes.

-- 
[Geoff Washburn|geoffw at cis.upenn.edu|http://www.cis.upenn.edu/~geoffw/]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton-user/attachments/20070226/5ec5bb7f/attachment.html