[MLton] Extending the SML Basis library (Library project)

Tue Oct 17 09:10:07 PDT 2006

Quoting Stephen Weeks <sweeks at sweeks.com>:
> Matthew Fluet <fluet at cs.cornell.edu>:
> > > type ('a, 'b) iso = ('a -> 'b) * ('b -> 'a)
> > > type ('a, 'b) emb = ('a -> 'b) * ('b -> 'a option)
> > 
> > Honestly, I'm not a huge fan of the iso/emb types and functions.

I'm not sure what you mean here.  Is it the specific form of the iso/emb
types & values or something more fundamental?

I'm not too interested in the specific form of the types as long as they
aren't too verbose to use.

The iso/emb types & values (and also the bounds) basically just reify
notions that already exist in the basis library.  Once you have the
reified types & values, you can refer to them conveniently and use them as
abstractions.  Isomorphisms and embeddings are quite heavily used in
conjunction with type-indexed values (or datatype generic programming).
They are also useful in other context.  Probably the simplest use (beyond
just passing the functions as a pair) is to build conversion functions.
For example, here is a declaration from our code base (using the simple
pair representation):

  val (keyToWord, wordToKey) = Iso.<--> (swap W.large, K.large)

Here conversions between "keys" and "words" are built using composition of
isomorphisms.  (Iso.<--> has the spec (to parallel function composition o)

  val <--> : ('a, 'b) t * ('c, 'a) t -> ('c, 'b) t )

> > I would revise the types to:
> >
> >   type ('a, 'b) iso = {to: 'a -> 'b, from: 'b -> 'a}
> >   type ('a, 'b) emb = {to: 'a -> 'b, from: 'b -> 'a option}
> >
> > Then I can write:
> >
> >    #to Char.int
> >
> > and have a reminder of the direction of the coercion.

Note that direction is quite context dependent.  When you supply an
isomorphism to some function, you need to be aware of the direction.  For
example, rewriting

   fun option a =
       iso (data (C0' "NONE" +` C1' "some" a))
           (fn NONE => INL () | SOME a => INR a,
            fn INL () => NONE | INR a => SOME a)

as

   fun option a =
       iso (data (C0' "NONE" +` C1' "some" a))
           {to = fn INL () => NONE | INR a => SOME a,
            from = fn NONE => INL () | SOME a => INR a}

doesn't really, IMHO, clarify things.  Is the above correct?  It isn't!  I
flipped the direction.  The direction of the isomorphism should be "to" the
generic encoding.  You just have to know the direction.

> Those who know my SML coding style will be unsurprised to hear my
> opinion on the above.  If a type is worth naming, it's probably worth
> putting in its own structure and putting all the functions that operate
> on the type in that structure.

I have no objection to having structures Iso and Emb (actually we already
have those in a library).

> [...] I'd recommend
> 
>   structure Iso:> ISO
>
> Here, for example, is the ISO signature that I quickly hacked up for
> use in my recent stuff on type-indexed values.
> 
>   signature ISO = sig
>      type ('a, 'b) t
[...]
>      val make: ('a -> 'b) * ('b -> 'a) -> ('a, 'b) t
[...]
>   end

The major problem (sorry for the repetition if you already spotted this
from my earlier post) with this design (making Iso.t an abstract type) is
that one can no longer make polymorphic isomorphisms (courtesy of the
value restriction).  For example, it would be impossible to implement a
structure matching the following signature (from my earlier message):

signature VECTOR =
   sig
      include VECTOR
      val list : ('a vector, 'a list) iso
      val toList : 'a vector -> 'a list
   end

So, IMO, the iso/emb types must be concrete.  The only question then is
the specific form of the type.  Below are some more or less reasonable
candidates:

  type ('a, 'b) iso = ('a -> 'b) * ('b -> 'a)
  type ('a, 'b) iso = {to : 'a -> 'b, from : 'b -> 'a}
  datatype ('a, 'b) iso = ISO of ('a -> 'b) * ('b -> 'a)
  datatype ('a, 'b) iso = ISO of {to : 'a -> 'b, from : 'b -> 'a}

(BTW, I'm using Iso.iso rather than Iso.t, which I usually prefer, to be
consistent with the basis library style.)

I prefer (slightly) the first alternative due to its lightweight syntax (not
the syntax of the type, but of the values).  The move to a datatype provides
a meaningful constraint, but I doubt that it would help to catch any bugs.
Due to the context dependency (as explained above), I'm not sure whether just
moving to a record really helps (when providing ad hoc isomorphisms to library
functions).  (OTOH, the names "to" and "from" could be bound in the Iso
structure.)

-Vesa Karvonen