[MLton] printf with no infixes

Stephen Weeks MLton@mlton.org
Thu, 25 Aug 2005 11:12:24 -0700


> The idea is also very simple: the continuation of the formatting
> sequence is always determined by the immediately following specifier
> (` or something else). Below is a quick implementation of the
> alternative syntax.

That's a nice idea.  Here is a refactoring of your code along the same
lines as the other examples I've sent.

----------------------------------------------------------------------
signature PRINTF =
   sig
      type ('a, 'b) v
      type ('a, 'b, 'c) u = (('a, 'b) v -> 'c) -> 'c
      type ('x, 'a, 'b, 'c) f = ('x -> 'a, 'b) v -> ('a, 'b, 'c) u
         
      val $ : (unit, 'a) v -> 'a
      val ` : ('a, 'b) v -> string -> ('a, 'b, 'c) u
      val C : (char, 'a, 'b, 'c) f
      val D : (int, 'a, 'b, 'c) f
      val G : (real, 'a, 'b, 'c) f
      val newFormat: ('x -> string) -> ('x, 'a, 'b, 'c) f
      val newFormatWithArg:
         ('z * 'x -> string) -> ('x -> 'a, 'b) v -> 'z -> ('a, 'b, 'c) u
      val printf: ('a, 'a, 'b) u
   end

functor F (S: PRINTF) =
   struct
      open S
         
      val () = printf `"Hello.\n"$
      val () = printf `"An int "D`" and an int "D`".\n"$ 13 14
      val () = printf `"An int "D`" and a real "G`".\n"$ 13 3.1415
      val () = printf `"A real "G`" and a real "G`".\n"$ 13.1 3.1415
         
      val () = printf $
      val () = printf `"A string" `" - followed by another.\n"$
      val () = printf G C `"\n"$ 1.0 #"f"

      val DL =
         fn z =>
         newFormatWithArg
         (fn (n, i) => StringCvt.padRight #" " n (Int.toString i))
         z
      val () = printf `"A padded int '"DL 5`"'.\n"$ 12
   end

structure Printf:> PRINTF =
   struct
      type 'a k = string list -> 'a
      type ('a, 'b) v = 'a k -> 'b k
      type ('a, 'b, 'c) u = (('a, 'b) v -> 'c) -> 'c
      type ('x, 'a, 'b, 'c) f = ('x -> 'a, 'b) v -> ('a, 'b, 'c) u
         
      fun id x = x
      fun printf f = f id
      fun $ m = m (fn ss => List.app print (rev ss)) []
      fun ` m s f = f (fn k => m (fn ss => k (s::ss)))
 
      fun newFormat toS m f = f (fn k => m (fn ss => fn x => k (toS x::ss)))
 
      val C = fn z => newFormat Char.toString z
      val D = fn z => newFormat Int.toString z
      val G = fn z => newFormat Real.toString z

      fun newFormatWithArg toS m arg f =
         f (fn k => m (fn ss => fn x => k (toS (arg, x)::ss)))
   end

structure Z = F (Printf)
----------------------------------------------------------------------

I've also reworked my code a little to support format-specifier
arguments and consecutive strings, and to make a comparison between
the two approaches visually easy.

----------------------------------------------------------------------
signature PRINTF =
   sig
      type ('a, 'b) v
      type ('a, 'b, 'c) u = string -> (('a, 'b) v -> 'c) -> 'c
      type ('x, 'a, 'b, 'c) f = ('x -> 'a, 'b) v -> ('a, 'b, 'c) u

      val $ : (unit, 'a) v -> 'a
      val ` : ('a, 'b) v -> ('a, 'b, 'c) u
      val C: (char, 'a, 'b, 'c) f
      val D: (int, 'a, 'b, 'c) f
      val G: (real, 'a, 'b, 'c) f
      val newFormat: ('x -> string) -> ('x, 'a, 'b, 'c) f
      val newFormatWithArg:
         ('z * 'x -> string) -> ('x -> 'a, 'b) v -> 'z -> ('a, 'b, 'c) u
      val printf: ('a, 'a, 'b) u
   end

functor F (S: PRINTF) =
   struct
      open S

      val () = printf "Hello.\n"$
      val () = printf "An int "D" and an int "D".\n"$ 13 14
      val () = printf "An int "D" and a real "G".\n"$ 13 3.1415
      val () = printf "A real "G" and a real "G".\n"$ 13.1 3.1415

      val () = printf "" $
      val () = printf "A string"`" - followed by another.\n"$
      val () = printf "" G "" C "\n"$ 1.0 #"f"

      val DL =
         fn z =>
         newFormatWithArg
         (fn (n, i) => StringCvt.padRight #" " n (Int.toString i))
         z
      val () = printf "A padded int '"DL 5"'.\n"$ 12
   end

structure Printf: PRINTF =
   struct
      type 'a k = string list -> 'a
      type ('a, 'b) v = string * ('a k -> 'b k)
      type ('a, 'b, 'c) u = string -> (('a, 'b) v -> 'c) -> 'c
      type ('x, 'a, 'b, 'c) f = ('x -> 'a, 'b) v -> ('a, 'b, 'c) u

      fun printf s f = f (s, fn k => k)

      fun $ (s, m) = m (fn ss => List.app print (rev (s :: ss))) []

      fun ` (s, m) s' f = f (s', fn k => m (fn ss => k (s :: ss)))

      fun newFormat toString (s, m) s' f =
         f (s', fn k => m (fn ss => fn x => k (toString x :: s :: ss)))

      fun newFormatWithArg toString v a =
         newFormat (fn x => toString (a, x)) v

      val C = fn z => newFormat Char.toString z
      val D = fn z => newFormat Int.toString z
      val G = fn z => newFormat Real.toString z
   end

structure Z = F (Printf)
----------------------------------------------------------------------

Comparing the two signatures, it is easy to see from the "u" type that
the only difference between the approaches is that mine requires a
string to precede each directive.  That's really it.  Both support
sequences of format specifiers, sequences of strings, and (possibly
vararg or optional-arg) format-specifier arguments.
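
To make the role of the extra string in "u" concrete, here is a sketch
(not part of either implementation above) that unfolds one call in my
version by hand.  The structure name P is made up for the example, and
it copies only printf, $, newFormat and D, without the signature, so
that it compiles on its own.

```sml
(* Sketch only: the string-first definitions, copied without the
   signature so that the pair of pending string and transformer is
   visible.  "P" is a made-up name for this example. *)
structure P =
   struct
      fun printf s f = f (s, fn k => k)
      fun $ (s, m) = m (fn ss => List.app print (rev (s :: ss))) []
      fun newFormat toString (s, m) s' f =
         f (s', fn k => m (fn ss => fn x => k (toString x :: s :: ss)))
      val D = fn z => newFormat Int.toString z
   end

local
   open P
in
   (* As one would normally write it. *)
   val () = printf "An int "D".\n"$ 13

   (* printf pairs its string with the identity and passes that to D. *)
   val () = D ("An int ", fn k => k) ".\n" $ 13

   (* D pushes the pending string and the converted int, makes ".\n"
      the new pending string, and passes the pair on to $. *)
   val () =
      $ (".\n", fn k => fn ss => fn x => k (Int.toString x :: "An int " :: ss))
        13

   (* $ pushes the last pending string, reverses, and prints. *)
   val () = List.app print (rev [".\n", Int.toString 13, "An int "])
end
```

Each of the four declarations prints the same line.  The string in "u"
is just the pending literal that the next directive (or $) pushes.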

> The issue is that you can't have two consecutive conversion
> specifiers nor two consecutive literal strings. I'm not saying that
> it would be a common need; the above syntax probably does 90% (or
> more) of the cases without surprise.

It is true that one cannot write 'G C' and must instead write 'G "" C',
but I don't find the verbosity of an unusual case a compelling reason
to make the common case more verbose.  As to consecutive literal
strings, the same backtick syntax (with a slightly different meaning)
works in both.
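
The "slightly different meaning" of the backtick can be seen by
putting the two definitions next to each other.  This is a pared-down
sketch: the structure names A and B are made up here, and only printf,
$ and ` are copied from the code above.

```sml
(* Backtick-first: ` pushes the string that follows it. *)
structure A =
   struct
      fun printf f = f (fn k => k)
      fun $ m = m (fn ss => List.app print (rev ss)) []
      fun ` m s f = f (fn k => m (fn ss => k (s :: ss)))
   end

(* String-first: ` pushes the pending string s that came before it and
   makes the string after it, s', the new pending string. *)
structure B =
   struct
      fun printf s f = f (s, fn k => k)
      fun $ (s, m) = m (fn ss => List.app print (rev (s :: ss))) []
      fun ` (s, m) s' f = f (s', fn k => m (fn ss => k (s :: ss)))
   end

(* Two consecutive literal strings in each style. *)
val () = let open A in printf `"one, " `"two\n"$ end
val () = let open B in printf "one, " `"two\n"$ end
```

Both declarations print the two literals in order.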

> Consider the following regular expression for the above printf syntax
> 
>   printf (<string> <conversion-specifier>)* <string> $ <arg>*
> 
> and contrast it with the following alternative printf syntax
> 
>   printf (`<string> | <conversion-specifier> <arg>*)* $ <arg>*
> 
> The alternative syntax allows an empty formatting sequence (meaning the
> stuff between printf and $) and arbitrary number of consecutive literal
> strings (prefixed by `) or conversion specifiers. IMO, this makes the
> syntax considerably more regular and probably reduces surprises. 

I find them both simple, and equally regular and easy to understand.
One requires a string before each directive and the other requires a
directive (a backtick or a specifier) first.  And the signatures make
it clear to me that
there is no substantive difference.

Here are the uncommon cases with both approaches, set side by side.
In each pair, your syntax comes first and mine second.

      val () = printf $
      val () = printf "" $

      val () = printf G C `"\n"$ 1.0 #"f"
      val () = printf "" G "" C "\n"$ 1.0 #"f"

And here are the common ones.

      val () = printf `"Hello.\n"$
      val () = printf "Hello.\n"$

      val () = printf `"An int "D`" and an int "D`".\n"$ 13 14
      val () = printf "An int "D" and an int "D".\n"$ 13 14

      val () = printf `"An int "D`" and a real "G`".\n"$ 13 3.1415
      val () = printf "An int "D" and a real "G".\n"$ 13 3.1415

      val () = printf `"A real "G`" and a real "G`".\n"$ 13.1 3.1415
      val () = printf "A real "G" and a real "G".\n"$ 13.1 3.1415

      val () = printf `"A string" `" - followed by another.\n"$
      val () = printf "A string"`" - followed by another.\n"$

      val () = printf `"A padded int '"DL 5`"'.\n"$ 12
      val () = printf "A padded int '"DL 5"'.\n"$ 12

I'd prefer to have the cleaner common case.