[MLton] bug in mllex?

Michael Norrish <Michael.Norrish@nicta.com.au>
Thu, 14 Apr 2005 15:23:09 +1000


I believe mllex is generating values for the internal yypos value that
are off by one.  (The generated code initialises a variable yygone0 to
be equal to 1, and I think this should probably be zero.)

The following lex file:

----------------------------------------------------------------------
datatype tok = T of string | EOF
type lexresult = tok * int
fun eof() = (EOF, 0)
%%
space = [\ \t\n];
ident = [A-Za-z]+;
%structure testlex
%%
{ident} => ((T yytext, yypos));
{space} => (lex());
----------------------------------------------------------------------

and the following driver:

----------------------------------------------------------------------
fun read_from_string s = let
  val state = ref (Substring.full s)
  fun reader n = let
    open Substring
  in
    if n >= size (!state) then string (!state) before state := full ""
    else let
        val (left, right) = splitAt (!state, n)
      in
        state := right;
        string left
      end
  end
in
  reader
end

val lexer = testlex.makeLexer (read_from_string "hello world");

val _ = let open testlex.UserDeclarations
            (* print each token with its reported position until EOF *)
            fun loop (T s, n) =
                  (print (s ^ ": " ^ Int.toString n ^ "\n");
                   loop (lexer()))
              | loop (EOF, _) = ()
        in
          loop (lexer())
        end

----------------------------------------------------------------------

will, when run, print out

hello: 2
world: 8

with 2 and 8 supposedly being the positions of those words.  Clearly,
they should be 1 and 7 (or maybe even 0 and 6, if you believe that
character positions in a file start at zero).
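
As a cross-check on the expected numbers, here is a minimal sketch that
computes the word offsets directly from the input string, without going
through the lexer.  The helper wordPositions is hypothetical (it is not
part of mllex or of the test case above) and it assumes that words are
maximal runs of alphabetic characters, like the ident rule.

```sml
(* Hypothetical helper, not part of mllex: list each alphabetic word of s
   together with the 0-based offset of its first character. *)
fun wordPositions (s : string) : (string * int) list =
  let
    val n = String.size s
    (* advance from index i to the first non-alphabetic character *)
    fun wordEnd i =
      if i < n andalso Char.isAlpha (String.sub (s, i)) then wordEnd (i + 1)
      else i
    (* walk the string, collecting (word, start offset) pairs *)
    fun scan (i, acc) =
      if i >= n then List.rev acc
      else if Char.isAlpha (String.sub (s, i)) then
        let val j = wordEnd i
        in scan (j, (String.substring (s, i, j - i), i) :: acc)
        end
      else scan (i + 1, acc)
  in
    scan (0, [])
  end

val expected = wordPositions "hello world"
(* expected = [("hello", 0), ("world", 6)] *)
```

Adding 1 to each offset gives the 1-based positions 1 and 7, so the
reported 2 and 8 are one too high under either convention.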

Michael.