parsing question

Stephen Weeks MLton@sourcelight.com
Mon, 26 Feb 2001 09:54:37 -0800 (PST)


> I've been bitten by this a couple of times (usually when I start a self
> compile, and come back to find that it died during parsing), so I was
> wondering if it is a bug in MLton or SML/NJ.  Essentially, what is the
> grammar for structure paths: the following program is parsed by SML/NJ,
> but MLton finds illegal tokens on lines 2, 3, and 4:
> 
> val _ = Math.sin 5.0
> val _ = Math. sin 5.0
> val _ = Math .sin 5.0
> val _ = Math . sin 5.0

I was bitten by it a couple of days ago in some code that you sent, so I sent
off a bug report to the SML/NJ folks.  The upshot: they agree it is a bug in
SML/NJ and will fix it.  Here's the mail interchange I had with them.

--------------------------------------------------------------------------------

From: "Stephen Weeks" <sweeks@intertrust.com>
To: sml-bugs@research.bell-labs.com
Subject: spaces in long identifiers
Date: Fri, 23 Feb 2001 14:24:51 -0800 (PST)


Number: *
Title:       spaces in long identifiers
Keywords:    
Submitter:   Stephen Weeks <sweeks@acm.org>
Date:        02/23/01
Version:     110.30
System:      x86-linux
Severity:    
Problem:     

SML/NJ accepts spaces around the "." in long identifiers.  It is not clear
whether the standard allows this.  MLton, the ML Kit and Moscow ML all reject
it.

Code:        

structure S = struct val x = 13 end;
val x = S . x

Transcript:  
Comments:    
Fix:         
Test: *
Owner: *
Status: *

--------------------------------------------------------------------------------

From: Dave MacQueen <dbm@research.bell-labs.com>
To: "Stephen Weeks" <sweeks@intertrust.com>
Cc: sml-bugs@research.bell-labs.com
Subject: Re: spaces in long identifiers 
Date: Fri, 23 Feb 2001 18:00:14 -0500

It is debatable whether this is a bug.  As far as I can tell, the Defn
does not specify whether spaces are allowed around the dots in long
identifiers (Section 2.4, Identifiers).  A restriction that is not
implied or required by the definition might be considered a bug, but
really we should decide on a common treatment.  Do you have arguments
in favor of the "no spaces allowed" policy, or is it just following
the convention of other languages (which might be a good enough
argument).

Dave

--------------------------------------------------------------------------------

From: "Stephen Weeks" <sweeks@intertrust.com>
To: sml-bugs@research.bell-labs.com
Subject: Re: spaces in long identifiers 
Date: Fri, 23 Feb 2001 17:56:41 -0800 (PST)

I don't have much of an argument.  Section 2.5 says "each item of lexical
analysis is either a reserved word, a numeric label, a special constant, or a
long identifier" and "Comments and formatting characters separate items ... and
are otherwise ignored".  Together, I take these to mean that spaces separate
"S . x" into three items of lexical analysis.
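
To make that reading concrete, here is a small sketch.  It is only an
illustration of the interpretation above, not text from the Definition, and
the names a, b, c and d are made up for the example.  Only the first binding
uses a single long identifier; the commented-out forms are the ones that would
not be long identifiers under this reading.

```sml
(* Sketch, assuming the "no spaces" reading of Sections 2.4 and 2.5:
   a long identifier is one lexical item, so formatting characters
   may not appear on either side of its dots. *)
structure S = struct val x = 13 end

val a = S.x        (* one item: the long identifier S.x *)

(* Under this reading the spaces below split each expression into
   separate items, and "." on its own is not a lexical item, so a
   conforming lexer would reject all three:

val b = S. x
val c = S .x
val d = S . x
*)
```

This mirrors the four Math.sin lines quoted at the top of this message, where
MLton accepts only the first.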

--------------------------------------------------------------------------------

From: "John H. Reppy" <jhr@research.bell-labs.com>
To: "Stephen Weeks" <sweeks@intertrust.com>
cc: sml-bugs@research.bell-labs.com
Subject: Re: spaces in long identifiers 
Date: Sat, 24 Feb 2001 15:38:37 -0500


I would agree with Stephen.  My reading of 2.4 and 2.5 is that white
space characters (aka formatting characters) are allowed between lexical
items, but the "." in a long identifier is clearly not an independent
lexical item.

	- John

--------------------------------------------------------------------------------

From: Dave MacQueen <dbm@research.bell-labs.com>
To: "Stephen Weeks" <sweeks@intertrust.com>
Cc: sml-bugs@research.bell-labs.com
Subject: Re: spaces in long identifiers 
Date: Mon, 26 Feb 2001 09:54:02 -0500

I guess that is a fairly reasonable interpretation.  I think that it
would be good to make SML/NJ consistent with MLton and Moscow ML on
this point.

Dave

--------------------------------------------------------------------------------