[MLton] can mlbasis rename the top-level?

Matthew Fluet fluet@cs.cornell.edu
Wed, 7 Sep 2005 10:17:42 -0400 (EDT)


> Forgive me for being stupid, but why not just use { and } ?
>
> It's easy to keep a count of the opening and closing {}s in the lexer.
> The only detail would be (afaics) to watch out for nested comments
> (since they could have unmatched {}s), but that just needs another
> state in the lexer.

It's not just comments.  Unmatched {}s can also appear in strings (and 
character constants).  And open comment delimiters in strings shouldn't 
actually start a comment.  And escaped quotes shouldn't actually close the 
string. And string quotes in comments shouldn't actually start a string. 
You're required to duplicate, in its entirety, the most complicated 
portion of the ML lexer in the MLB lexer.  Not to mention that you are 
adding the further complication of balancing {}s, a characteristic that is 
not enforced by the ML lexer (though it is by the ML parser).
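
To make the point concrete, here is a rough Python sketch of what "just count the {}s" turns into once strings, character constants, and nested comments are accounted for. It is an illustration only, not code from MLton or its MLB lexer; the function name find_matching_brace is hypothetical, and the escape handling is deliberately simplified (it assumes any backslash escape other than a whitespace gap is a single character, which is enough to skip over \" and \\).

```python
def find_matching_brace(src: str, start: int) -> int:
    """Return the index of the '}' closing the '{' at src[start].

    Hypothetical helper: a simplified scanner for SML-like text.
    """
    if src[start] != "{":
        raise ValueError("expected '{' at start")
    depth = 0            # current {} nesting depth
    comment_depth = 0    # SML comments nest, so a flag is not enough
    in_string = False    # also covers #"c", which is '#' plus a string
    i, n = start, len(src)
    while i < n:
        c = src[i]
        two = src[i:i + 2]
        if comment_depth > 0:
            # Inside a comment only comment delimiters matter;
            # quotes and braces are ignored.
            if two == "(*":
                comment_depth += 1
                i += 2
            elif two == "*)":
                comment_depth -= 1
                i += 2
            else:
                i += 1
        elif in_string:
            if c == "\\":
                if i + 1 < n and src[i + 1].isspace():
                    # Whitespace gap: backslash, whitespace, backslash.
                    j = i + 1
                    while j < n and src[j].isspace():
                        j += 1
                    i = j + 1 if j < n and src[j] == "\\" else j
                else:
                    # Simplification: treat the escape as one character,
                    # so an escaped quote does not close the string.
                    i += 2
            elif c == '"':
                in_string = False
                i += 1
            else:
                i += 1
        else:
            if two == "(*":
                comment_depth = 1
                i += 2
            elif c == '"':
                in_string = True
                i += 1
            elif c == "{":
                depth += 1
                i += 1
            elif c == "}":
                depth -= 1
                if depth == 0:
                    return i
                i += 1
            else:
                i += 1
    raise ValueError("unbalanced '{' (or unterminated string/comment)")
```

Even this toy version needs three interacting modes, and it still reports any mistake inside the braces as a brace-balancing problem.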

(Admittedly, the MLB lexer does already know how to handle ML-style
comments and ML-style strings, which it uses for file names and annotations.)

But, as I said before, if you could assume that you only ever got 
syntactically well-formed source code, then there isn't any problem.  The 
difficulty arises when a syntax error in embedded SML code (which might
have a recognizable/understandable error message if it were lexed/parsed 
as SML) yields an unintelligible error related to the MLB lexer/parser.

Or, as Stephen put it: "One should either completely understand the 
enclosed language or should not try to understand it at all.  Any half-way 
point is likely to be wrong or confusing, since it won't be the full 
language."