[MLton] Ad-hoc infix identifiers, Printf, and libraries

Thu, 14 Jul 2005 19:31:55 +0300

Quoting Matthew Fluet <fluet@cs.cornell.edu>:
[...]
> Most any aspect of SML is on-topic, though you might also bounce/cc to 
> MLton-user.

Ok.  [comp.lang.ml is rather quiet, and while comp.lang.functional has more
activity, only a small fraction of the people there use SML.]

> > With a couple of simple general purpose infix operators it is possible
> > to treat any binary function as an infix operator. [...]

> I thought that was very cool.  I've been very jealous of Haskell's `id` 
> infix syntax, and I think <\id\> is fairly concise.  I had also forgotten 
> about the sectioning functions, which are also great. [...]

Thanks!  I'll take that as an invitation to add a page on the technique to
the Wiki.
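
For anyone joining the thread here, the technique can be sketched roughly
as follows.  This is only a minimal sketch: the precedence level 3 and the
examples using Int.max and op + are choices made for this illustration,
not something the discussion above depends on.

```sml
(* Two general purpose operators, both left associative at level 3. *)
infix 3 <\   fun x <\ f = fn y => f (x, y)   (* left section *)
infix 3 \>   fun f \> y = f y                (* left application *)

(* Any binary function taking a pair can now be written infix:
   3 <\Int.max\> 7 parses as (3 <\ Int.max) \> 7, i.e. Int.max (3, 7). *)
val m = 3 <\Int.max\> 7

(* The left section alone gives a partially applied function. *)
val inc = 1 <\ op +
val n = inc 41
```

Only the two operators themselves need a fixity declaration; the functions
placed between them remain ordinary nonfix identifiers.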

> > I frequently find uses for infix operators, but it feels awkward to
> > use them in SML. [...] Unfortunately, this becomes significantly less
> > convenient unless `orWhenEq' is declared infix at the top-level, which
> > I find rather intrusive.
> 
> What is the intrusion?  Just the fact that your set of top-level infix 
> declarations may conflict with someone else's and there isn't a good way to 
> pick and choose?  Or the fact that they must be in scope everywhere?

I'd say both.  SML fixity declarations have a certain macro-like behavior:

  infix symbol     has implications similar to    #define symbol ...
  nonfix symbol    has implications similar to    #undef symbol

Of course, SML fixity declarations are scoped, which makes them
considerably better behaved than CPP macros.  However, each top-level
fixity declaration effectively introduces a new keyword, whose
redefinition is troublesome.  Consider, for example, the idea of
using the symbol ^ for something other than string concatenation.
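
As a small sketch of the problem (the function pow and the choice of
exponentiation are assumptions made up for this example), rebinding ^
changes its meaning but not its fixity.  It stays left associative at
level 6, below *, which is hardly what one would want for exponentiation:

```sml
(* Hypothetical integer power function, assuming n >= 0. *)
fun pow (b, 0) = 1
  | pow (b, n) = b * pow (b, n - 1)

local
  val op ^ = pow      (* new binding; the Basis fixity "infix 6 ^" remains *)
in
  val a = 2 ^ 3 ^ 2   (* parses as (2 ^ 3) ^ 2, not 2 ^ (3 ^ 2) *)
  val b = 2 * 3 ^ 2   (* parses as (2 * 3) ^ 2, because * binds tighter *)
end

(* Outside the local block, ^ is string concatenation again. *)
val s = "ab" ^ "cd"
```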

It would be nice if fixity were per binding rather than per identifier.
Continuing the macro analogy, in Scheme, macro bindings can be shadowed by
ordinary bindings and vice versa.  This makes Scheme macros particularly
well behaved.  You basically don't have to worry about breaking some
existing code when you introduce a new macro in Scheme.

Introducing a top-level infix identifier in SML can easily break code far
away.  Thus, mnemonic names (like `when', `by', `bind') are pretty much
off-limits for infixing at the top-level (in a library).  For instance,
different binding strengths can be useful for an identifier in different
cases and having multiple top-level fixity declarations (with different
binding strengths) for an identifier might cause some rather subtle bugs.
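
A contrived sketch of such a bug follows.  The operator `by' (here just
integer division) and the two precedence levels are hypothetical.  The
same expression type checks under both declarations, but means different
things:

```sml
fun by (a, b) = a div b    (* hypothetical mnemonic operator *)

infix 7 by                 (* as one library might declare it *)
val tight = 12 by 2 + 4    (* parses as (12 by 2) + 4 *)

infix 5 by                 (* as another library might declare it *)
val loose = 12 by 2 + 4    (* parses as 12 by (2 + 4) *)
```

Nothing in the expression itself tells the reader which of the two
declarations is in effect.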

Do/Can SML compilers issue warnings for incompatible fixity declarations?
It might be worth a warning (at a strict warning level) when a single
binding is used with multiple different fixities, because it might indicate
the possibility of a "leaked" fixity declaration.

> It seems as though infix identifiers are used rarely.  I see exactly one 
> use in the mlton sources (excluding the Basis Library declarations).

There seem to be quite a few infix declarations in the regression tests
and benchmarks.  So, I'm not sure how rare the use of infix identifiers
really is.  I suspect that some people use infix declarations considerably
more liberally than others.

> I think the real difficulty is the one you've noted -- that a library
> can't package infix declarations into the signature/structure language.
[...]
> The AliceML language has what appears to be a nice means of adding fixity 
> status to signatures/structures:
>   http://www.ps.uni-sb.de/alice/manual/modules.html#fixity

I agree.  It appears to solve the most significant problems.

> Hence, it has to rely upon extra-lingual mechanisms to make identifiers
> infix.  This mechanism can be something like the ML Basis system or it can
> be as primitive as a comment in the library highlighting the right
> cut-n-paste declarations to copy into each source file using the library.

I have previously mostly used the cut-n-paste approach, but I'm liking it
less and less by the day.  I'll probably be using the ML Basis system for
the purpose in the future.

> One of the MLKit developers was influential in the design of the ML Basis 
> system, and there remains some small hope that they will also adopt it.

That is encouraging.

> As to the maintenance issue, I don't think there is any more or less 
> difficulty finding the binding occurrence of an identifier's infix status 
> than finding the binding occurrence of an identifier.

You may be right.  Perhaps I'm worrying too much.  For application
development, top-level infix declarations are reasonably safe.  For
library development, the issue is more difficult.  Libraries defining
mnemonic infix identifiers are likely to cause problems.
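
One way a library can stay out of the way, sketched here with a made-up
structure Lib and a made-up operator `when' at an arbitrarily chosen level
3, is to export only nonfix functions and let each client opt in to a
fixity inside a limited scope:

```sml
(* Hypothetical library: exports `when' as an ordinary nonfix function. *)
structure Lib =
struct
  fun when (x, cond) = if cond then SOME x else NONE
end

local
  infix 3 when               (* fixity is confined to this local block *)
  val op when = Lib.when
in
  val a = 1 + 2 when 3 < 4   (* parses as (1 + 2) when (3 < 4) *)
end

(* Outside the block nothing has leaked; the qualified name is nonfix. *)
val b = Lib.when (5, false)
```

This does not help clients who want the infix form everywhere, but at
least the choice of fixity is theirs.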

-Vesa Karvonen