[MLton] printf via fold

Sat, 3 Sep 2005 04:04:51 +0300

Quoting Stephen Weeks <sweeks@sweeks.com>:
[...]
> These discussions are opening my eyes to new programming styles.

Mine too. [[[Some rambling follows...

In a previous life, I spent a lot of time tinkering with metaprogramming
and code generation techniques (particularly using C preprocessor macros
and C++ templates - in my defense, I found it challenging and therefore
fun ;-) - not to mention that C++ paid the bills back then). Although I've
known about combinator libraries for a long time, it seems that I have
somewhat underestimated their power. Well, there are some limits as to what
can be achieved with combinators (in particular, it is impossible to
define new binding forms), and I think that code generation can often be
more straightforward (perhaps I just need more practice with combinators).

Anyway, I originally started programming in Ocaml before really picking up
Scheme. I recall being intrigued when I noticed that Scheme's map function
allows an arbitrary number of lists. As you know, Ocaml (and SML) have a
separate mapping function for list pairs and there doesn't seem to be a
way to generalize to an arbitrary number of lists. (This is more of a
limitation of the ML type system than a limitation of combinator
techniques.)

...end of rambling]]]

Yesterday, I noticed that using the varargs Fold and the infix product
type, arbitrary list products can be manipulated fairly conveniently. The
following code could use some cleaning up, but it should convey the
general idea.

(*** a few utilities ***)

infix <\ fun x <\f = fn y => f (x, y)
infix /> fun f/> y = fn x => f (x, y)
fun pass x f = f x
fun id x = x
fun const x _ = x
fun curry f x y = f (x, y)
fun uncurry f (x, y) = f x y

datatype ('a, 'b) product = & of 'a * 'b
infix &

(*** Fold ***)

fun $ (a, f) = f a

structure Fold =
   struct
      val fold = pass
      fun step0 h (a1, f) = fold (h a1, f)
      fun step1 h u z = step0 (h z) u
   end

(*** list product module ***)

structure ListProduct =
   struct
      datatype ('null, 'hd, 'tl, 'pair, 'ls) t =
               T of {null: 'null, hd: 'hd, tl: 'tl, pair: 'pair, ls: 'ls}

      fun && $ =
          Fold.step1 (fn l =>
                         fn T t =>
                            T {null = fn l & ls => null l orelse #null t ls,
                               hd = fn l & ls => #pair t (#hd t ls, hd l),
                               tl = fn l & ls => tl l & #tl t ls,
                               pair = op&,
                               ls = l & #ls t}) $

      fun listFn f =
          Fold.fold (T {null = const false,
                        hd = id,
                        tl = id,
                        pair = fn (_, v) => v,
                        ls = ()},
                     f) &&

      fun foldl' f i (T {null, hd, tl, ls, ...}) =
          let fun lp (r, ls) = if null ls then r
                               else lp (f (hd ls, r), tl ls)
          in lp (i, ls) end

      fun app f $ = listFn (foldl' (fn (xs, ()) => f xs) ()) $
      fun foldl f i $ = listFn (foldl' f i) $
      fun map f $ = listFn (rev o foldl' (fn (xs, r) => f xs::r) []) $

      fun all' p (T {null, hd, tl, ls, ...}) =
          let fun lp ls = if null ls then true
                          else p (hd ls) andalso lp (tl ls)
          in lp ls end

      fun all p $ = listFn (all' p) $
      fun exists p $ = listFn (not o all' (not o p)) $

      fun foldr f i $ =
          listFn (fn T {null, hd, tl, ls, ...} =>
                     let fun lp ls = if null ls then i
                                     else f (hd ls, lp (tl ls))
                     in lp ls end) $
   end

(*** examples / ad hoc tests ***)

local
   structure LP = ListProduct
   val && = LP.&&
in
val x1 = LP.all (fn i & r => real i <= r)
                [1, 2] && [1.0, 2.0] $
val x2 = LP.exists (fn i & r => real i <= r)
                   [3, 4] && [1.0, 2.0] $
val x3 = LP.foldl (fn (s & i, r) => s^Int.toString i^r)
                  "\n"
                  ["a", "b", "c"] && [1,2,3,4] $
val x4 = LP.foldr op:: [] ["a", "b", "c"] && [1,2,3,4] $
val x5 = LP.app (fn i & r => print (Real.toString (real i + r) ^ "\n"))
                [1, 2, 3] && [0.1, 0.2] $
val x6 = LP.map (fn i & s & b => (~i, Char.toUpper s, not b))
                [1,3,5] && [#"a",#"b",#"c",#"d"] && [true, false, true] $
end

I plan to eventually write this into a proper reusable library with
documented signatures and simplified implementation, but I'm still
experimenting with / exploring the possibilities of the varargs technique.
For instance, I'm thinking about redesigning my simple unit test framework
using the varargs technique. It should be possible to get something like
this:

  let
     open UnitTesting
  in
     defineTests
        title "A couple of pointless tests"

        expect (fn Fail _ => ())
        body (fn () => raise Fail "just a silly example")

        body (fn () => verify (1+3 = 2+2))

        (* ... *)
  end

The idea is that options like "title" and "expect" are, well, optional. A
body specification instantiates/defines a single test. Some options would
be reset to default (like "expect") after each body, while others would be
retained (like "title"). Some properties would be handled automatically
(like numbering the tests under a title in order to allow convenient
identification of tests without having to specify a title/name for every
test).

> There is also a library cost in the signature.
[...and]
>    DSL.foo ...<no DSL> ... $
> 
> In this situation, having $ is nice.

Yep. $ seems worth exposing at the top-level.

> > The above definition of step_1_0 seems to be wrong.
[...]
> Yes.  Good catch.  At least I got the equations right :-).

[I actually only noticed the issue while looking at the definition of the
backtick combinator, which, to my surprise, omitted the identity
function.]

> > Also, I wonder if it would make sense to use curried functions [...]
> > This would allow you to use partial application (evaluation) to avoid
> > repeated computation
> 
> The makes sense in the case of step_1_0 and step_1_1.  I'm not sure
> about Fold{,r}.step1, though -- do you see the possibility to avoid
> repeat computation there or are you just going for
> consistency/concision?

Consistency, mostly. I don't actually have any plausible examples of the
use of partial application, yet, but I'm positive that such examples
exist. IOW, I think it is just a matter of time before someone comes up
with a plausible example.

> If there is no staging, then I'd argue for keeping the tupled argument,
> so that the types aren't deceiving.

Not to mention that it is also more concise.

> > The end of this message contains an implementation of this idea and
> > a simple Scanf implementation that uses the (curried) Fold library.
> 
> Nice.  It would be good if the product syntax allowed some easy way to
> denote products with zero or one component, but I can't see anything
> better than what you did.

It might be possible to design some general purpose helpers for building
products, but I haven't yet given it more serious thought. I'm not (at
least not yet) completely happy about the implementation of the above list
product technique. I'll have to think about this tomorrow (it is getting
late...).

> > In other words, the warning would disappear from the documentation (and
> > the gotcha, of course) if printing was delayed.
> 
> Yep.  I wonder why they do it that way, especially given that they
> have compiler support for their approach.  Probably just historical
> reasons.

Probably. My *guess* is that the semantics is more straightforward for the
code generator / optimizer and this is why it got specified that way
originally. I think that compared to SML, Ocaml incorporates a few
features that simplify code generation at the expense of less intuitive
semantics. For example, I recall being surprised by the difference between

  type foo = Bar of int * int

and

  type foo = Bar of (int * int)

> The way I do it is to compile some examples with one or more -keep
> flags; {ssa,ssa2,rssa}.  Then I look at the IL and see that everything
> got simplified away.  I don't know if that counts a simple :-).

Thanks. I'll try those tomorrow. I've previously spent some time looking
at the assembly/C output of MLton, but I found it a bit too difficult to
make sense out of - it was too far removed from the original SML source
code and it didn't help that the output contains the whole program
(including all libraries).

-Vesa Karvonen