[MLton] Re: Nitpicky definition compatibility bug, SML/NJ deviation

Robert J. Simmons rjsimmon at cs.cmu.edu
Fri Aug 26 08:56:02 PDT 2011


I am indeed interested in that pygment highlighter!

I've currently got a pull request out to Pygments for the highlighter
I worked out (https://bitbucket.org/birkenfeld/pygments-main/pull-request/14/add-support-for-standard-ml).
The one I wrote is decidedly more hackish in service of doing a bit
more than just lexing; for instance,
http://typesafety.net/tempbovik/example.html and
http://typesafety.net/tempbovik/intsyn.fun.html (note: not permanent
links) are examples of the special treatment of datatypes and function
definitions. I'll look into seeing if I can combine the advantages of
both.

By the way, I dealt with the problem of keywords like val'ue by using
the Python regular expression
r'\b(... reserved words ...)\b(?!\')'
This parses only an alphanumeric sequence that is followed by
something other than a number, letter, or underscore (the \b handles
that) or a prime (the (?!\') lookahead handles that). It's surely more
efficient to handle that with a callback, though.
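
A minimal sketch of that pattern in use, for concreteness. The
RESERVED list is a hypothetical, abbreviated stand-in for the full set
of reserved words, and find_keywords is a made-up helper name; scanning
with finditer is only for demonstration, since a real Pygments lexer
would try the pattern at the current position alongside its other
rules.

```python
import re

# Hypothetical, abbreviated list; a real lexer would list every reserved word.
RESERVED = ["val", "fun", "structure", "struct", "end", "datatype", "let", "in"]

# \b ... \b requires a word boundary on both sides, so "value" does not match.
# (?!') additionally rejects a reserved word directly followed by a prime,
# so the "val" inside "val'ue" does not match either.
KEYWORD_RE = re.compile(r"\b(%s)\b(?!')" % "|".join(RESERVED))


def find_keywords(source):
    """Return the reserved words found in source, in order of appearance."""
    return [m.group(1) for m in KEYWORD_RE.finditer(source)]


if __name__ == "__main__":
    print(find_keywords("val val'ue = 4"))
```

Note that scanning a whole string this way would still pick out the
"val" in something like x'val, because \b also sees a boundary after a
prime; matching token by token from the current position avoids that.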

Another question: what's MLton doing here? Standard ML of New Jersey
rejects this, and I tend to agree with it.

fun testfun1 ('a):''a = x
fun testfun1 (x:''a):'a = x

I came across this originally when I tried to use a tyvarseq (', '',
''', '''') in a datatype declaration and MLton complained that my type
variables weren't pairwise distinct.

 - Rob

On Fri, Aug 26, 2011 at 10:03 AM, Matthew Fluet <matthew.fluet at gmail.com> wrote:
> On Fri, Aug 26, 2011 at 2:33 AM, Robert J. Simmons <rjsimmon at cs.cmu.edu> wrote:
>> MLton accepts the following program:
>>
>> structure @#$ = struct val foo = 4 end
>> val y = @#$.foo
>> val () = print "Goodbye.\n"
>>
>> However, according to Page 5 of the revised definition, "The
>> identifier class StrId is represented by alphanumeric identifiers not
>> starting with a prime," which would seem to exclude symbolic
>> identifiers from being the names of structures and signatures.
>
> Agreed.
>
>> I swear I don't go around looking for these things, I was using the
>> definition to try to write a syntax highlighter for Standard ML on
>> GitHub.
>
> You might be interested in the Pygments lexer I wrote for SML:
>  http://mlton.org/Pygments
> And the "lexical curiosities" that I discovered while writing it:
>  http://mlton.org/pipermail/mlton/2011-May/030931.html
>
> My Pygments lexer flags a lexical error in the above fragment, but
> only at the "val y = @#$.foo", since it disallows a symbolic id as a
> leading part of a long identifier.
>
>> In the process of generating
>> https://bitbucket.org/robsimmons/pygments-main/src/351f8bf6f859/tests/examplefiles/example.sml,
>> I came up with another SML/NJ vs. MLton deviation. SML/NJ does not
>> treat the single prime -'- as a type variable, nor -'0- and -'_-.
>> Similarly, SML/NJ does not treat -'''- or -'''''''- as eqtype
>> variables.
>
> Yeah, I discovered those as well.  They are, admittedly, fairly awful
> as actual tyvar identifiers in real code.
>
> -Matthew
>



-- 
Robert J. Simmons
simrob.com
gps.simrob.com
rjsimmon at cs.cmu.edu
robsimmons at gmail.com
Cell: 404-273-6890



