[MLton] MLB file tree

Thu, 9 Feb 2006 10:39:29 +0200

Quoting Matthew Fluet <fluet@cs.cornell.edu>:
> > Attached is a draft patch for a new option "-stop ft" to print the "MLB
> > file tree".
> 
> That looks generally useful.  I wonder, though, whether the format could 
> be improved.

I'm personally not entirely happy with the format, but I'm unsure how to
improve it.  Let me elaborate on my requirements.

> Only one of the '{ }' and the indenting are necessary to 
> capture the nesting; the '{ }' would seem easier to handle by text 
> processing.

I want the format to be immediately useful for two purposes:
- visual examination to understand the dependency structure of a project,
  and
- processing by *simple* scripts to extract useful information, such as
  getting lists of files.
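
A minimal sketch of the second purpose, in Python.  The sample text below is
a hypothetical "-stop ft" output (one path per line, " {" opening the block
of an MLB file, a lone "}" closing it); the draft patch's real output may
differ, and the name list_files is made up for illustration.

```python
# ASSUMPTION: hypothetical "-stop ft" output.  One path per line, " {" opens
# the block of an MLB file, a lone "}" closes it.  The real format may differ.
SAMPLE = """\
/proj/main.mlb {
  /proj/lib/basis.mlb {
    /proj/lib/list.sml
  }
  /proj/util.sml
  /proj/main.sml
}
"""


def list_files(tree_text, suffix=".sml"):
    """Return the paths ending in `suffix`, in order of first appearance."""
    seen = []
    for line in tree_text.splitlines():
        # Drop the indentation and an optional trailing "{".
        path = line.strip().rstrip("{").strip()
        if path.endswith(suffix) and path not in seen:
            seen.append(path)
    return seen


if __name__ == "__main__":
    for path in list_files(SAMPLE):
        print(path)
```

Because the paths are not quoted, the script needs no unescaping step, and a
path containing spaces is still read as the whole line.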

The indentation and braces are there mainly to support visual examination.
The braces, specifically, allow you to skip blocks quickly in an editor
like Emacs that supports moving by sexps.  Using keywords (e.g. begin -
end) would already make it more complicated to browse the output.  So,
I'm unsure whether I want the output to be more MLish.

To support simple scripts, the filenames are not printed as string
literals.  If they were printed as string literals, then it would be much
more difficult to extract useful information out of the file tree using
standard text processing tools like grep and sed.  (The simplest approach
would probably be to implement a simple program to parse an SML string
literal and print the resulting string.)  Although I think that supporting
"special characters" (like spaces) in filenames is a good feature to have,
I also think that actually using them, for anything but legacy code, is
asking for trouble.
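
To give an idea of what such a helper would involve, here is a rough Python
sketch of a decoder for SML string literals.  It follows the escape
sequences of the Definition of Standard ML (\a \b \t \n \v \f \r, \^c,
\ddd, \uxxxx, \", \\ and the whitespace gap); the function name
decode_sml_string is made up for illustration, and this is not code from
MLton.

```python
# Single-character escapes of SML string literals.
SIMPLE = {"a": "\a", "b": "\b", "t": "\t", "n": "\n", "v": "\v",
          "f": "\f", "r": "\r", '"': '"', "\\": "\\"}
DIGITS = "0123456789"
HEX = "0123456789abcdefABCDEF"
GAP = " \t\n\f\r"  # formatting characters allowed inside a \...\ gap


def decode_sml_string(literal):
    """Decode an SML string literal, given with its surrounding quotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("not a quoted string literal")
    body, out, i = literal[1:-1], [], 0
    while i < len(body):
        c = body[i]
        if c == '"':
            raise ValueError("unescaped quote inside literal")
        if c != "\\":
            out.append(c)
            i += 1
            continue
        i += 1  # step over the backslash
        if i >= len(body):
            raise ValueError("dangling backslash")
        c = body[i]
        if c in SIMPLE:
            out.append(SIMPLE[c])
            i += 1
        elif c == "^":
            # \^c denotes the control character with code ord(c) - 64.
            ctrl = body[i + 1:i + 2]
            if not ctrl or not 64 <= ord(ctrl) <= 95:
                raise ValueError("bad control escape")
            out.append(chr(ord(ctrl) - 64))
            i += 2
        elif c in DIGITS:
            # \ddd: exactly three decimal digits.
            ddd = body[i:i + 3]
            if len(ddd) != 3 or any(d not in DIGITS for d in ddd):
                raise ValueError("bad decimal escape")
            out.append(chr(int(ddd)))
            i += 3
        elif c == "u":
            # \uxxxx: exactly four hexadecimal digits.
            xxxx = body[i + 1:i + 5]
            if len(xxxx) != 4 or any(h not in HEX for h in xxxx):
                raise ValueError("bad unicode escape")
            out.append(chr(int(xxxx, 16)))
            i += 5
        elif c in GAP:
            # Gap: whitespace between two backslashes is ignored.
            while i < len(body) and body[i] in GAP:
                i += 1
            if i >= len(body) or body[i] != "\\":
                raise ValueError("unterminated gap")
            i += 1
        else:
            raise ValueError("unknown escape: \\" + c)
    return "".join(out)
```

Even this small decoder is noticeably more machinery than a grep or sed
one-liner, which is the cost that unquoted filenames avoid.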

A third, but IMO less important, requirement is that the format should be
suitable for more complicated automated processing, such as converting the
output to be processed by Graphviz/DOT or simply computing a transitive
closure of the dependencies.  The braces should support that, by allowing
a somewhat simpler parser to be written to extract the tree (or DAG) from
the output.
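
A sketch of such a parser in Python, again assuming a hypothetical output
format in which " {" at the end of a line opens the block of an MLB file and
a lone "}" closes it.  The names parse_edges, to_dot and reachable are
illustrative only.

```python
# ASSUMPTION: hypothetical "-stop ft" output; the real format may differ.
SAMPLE = """\
/proj/main.mlb {
  /proj/lib/basis.mlb {
    /proj/lib/list.sml
  }
  /proj/util.sml
  /proj/main.sml
}
"""


def parse_edges(tree_text):
    """Return (parent, child) pairs; the parent is the enclosing MLB file."""
    edges, stack = [], []
    for raw in tree_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "}":
            stack.pop()  # leave the current MLB block
            continue
        opens = line.endswith("{")
        path = line[:-1].rstrip() if opens else line
        if stack:
            edges.append((stack[-1], path))
        if opens:
            stack.append(path)  # following lines belong to this MLB file
    return edges


def to_dot(edges):
    """Render the edges as a Graphviz digraph."""
    def quote(path):
        return '"' + path.replace('"', '\\"') + '"'
    lines = ["digraph mlb {"]
    for parent, child in edges:
        lines.append("  %s -> %s;" % (quote(parent), quote(child)))
    lines.append("}")
    return "\n".join(lines)


def reachable(edges, root):
    """Return every file that `root` includes directly or transitively."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    seen, todo = set(), [root]
    while todo:
        for child in children.get(todo.pop(), []):
            if child not in seen:
                seen.add(child)
                todo.append(child)
    return seen


if __name__ == "__main__":
    print(to_dot(parse_edges(SAMPLE)))
```

The braces are what keep parse_edges down to a single stack; with
indentation alone the parser would have to track column positions instead.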

I think that the main problem with the current format is that it prints
both relative and absolute paths.  I think that it would be best to only
print absolute paths.

Further requirements and specific formatting proposals are welcome.

...

BTW, while checking the output of the "-stop ft" option I noticed that
some files were elaborated twice when compiling MLton.  I recall that when
I started using MLBs, I had a problem with some files being elaborated
more than once.

So, IMO, by default, MLton should give a warning when a specific SML
source file is elaborated more than once.  I think that in most cases it
is not intentional.  An annotation could then be implemented to allow that
warning to be disabled in specific cases.  For example:

  ann
     "ignoreMultipleElaboration"
  in
     infixes.sml
  end
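
Until such a warning exists, repeated elaboration could be spotted with a
small script over the file tree.  The Python sketch below assumes a
hypothetical output format (one path per line inside brace-delimited MLB
blocks) and assumes that a source file is listed once for each time it is
elaborated; neither is confirmed by the draft patch.

```python
from collections import Counter

# ASSUMPTION: hypothetical "-stop ft" output in which infixes.sml is reached
# through two different MLB files and is therefore listed twice.
SAMPLE = """\
/proj/main.mlb {
  /proj/a.mlb {
    /proj/infixes.sml
    /proj/a.sml
  }
  /proj/b.mlb {
    /proj/infixes.sml
    /proj/b.sml
  }
}
"""


def repeated_sources(tree_text, suffix=".sml"):
    """Return the sorted source paths that occur more than once."""
    counts = Counter()
    for line in tree_text.splitlines():
        path = line.strip()
        if path.endswith(suffix):
            counts[path] += 1
    return sorted(path for path, n in counts.items() if n > 1)


if __name__ == "__main__":
    for path in repeated_sources(SAMPLE):
        print("elaborated more than once:", path)
```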

> The fact that the mlb files aren't part of the output of "-stop f" is 
> probably not the right choice.  We usually use "-stop f" to extract 
> dependencies for a Makefile; technically, you could change a mlb file in a 
> manner that changes the meaning of a program without changing the list of 
> source .sml files.

I agree that MLB files should also be considered when computing
dependencies for make.

> > I implemented the option, because I wanted to know which MLB-files are
> > actually used when MLton is compiled, so that I can go through them to
> > better understand how much work it might require to get MLton to compile
> > under MLKit.
> 
> That would be very cool.  Keep us posted.  I saw your comp.lang.functional 
> posting, and it sounded like the annotations are the issue.

Annotations are one of the issues.  I already made a quick hack to MLKit
to get past non-supported annotations, but then I ran into another issue.
MLKit doesn't currently support export filters.  Supporting them probably
requires much more work, so before I try to implement export filters, I
plan to work around them and try to get MLton to compile first (to see how
much work beyond supporting export filters might be required).

I plan to transform (either manually or with the help of some scripts) MLB
code of the form

  <--- foo.mlb --->
  local
     foo-bar-and-baz-and-more.sml
  in
     signature FOO
     structure Bar = Baz
     functor FooBar
  end
  <--- foo.mlb --->

to

  <--- foo.mlb --->
  local
     foo-bar-and-baz-and-more.sml
  in
     foo-filter.sml
  end
  <--- foo.mlb --->

  <--- foo-filter.sml --->
  signature FOO = FOO
  structure Bar = Baz
  functor FooBar (S: FOO) = FooBar (S)
  <--- foo-filter.sml --->

which, AFAIU, should have the same meaning (or maybe there is some
complication that I can't see?).  This makes me wonder about the
importance of the export filter feature.
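
A rough Python sketch of the scripted part of that transformation.  It only
handles filter lines of the form "kind Name" or "kind Name = Other", one
binding per line, and the name filter_to_sml is made up.  The argument
signature of a functor does not appear in the MLB filter, so the sketch
takes it from a hand-written table (functor_args, also hypothetical).

```python
def filter_to_sml(filter_lines, functor_args):
    """Turn simple MLB export-filter lines into SML rebinding declarations.

    functor_args maps a functor name to its argument signature, which the
    MLB filter does not state and so has to be supplied by hand.
    """
    out = []
    for line in filter_lines:
        words = line.split()
        if not words:
            continue
        if len(words) == 2:
            kind, name = words
            source = name  # "kind Name" re-exports Name unchanged
        elif len(words) == 4 and words[2] == "=":
            kind, name, _, source = words
        else:
            raise ValueError("unsupported filter line: " + line)
        if kind in ("signature", "structure"):
            out.append("%s %s = %s" % (kind, name, source))
        elif kind == "functor":
            arg = functor_args[source]
            out.append("functor %s (S: %s) = %s (S)" % (name, arg, source))
        else:
            raise ValueError("unsupported filter line: " + line)
    return "\n".join(out)


if __name__ == "__main__":
    lines = ["signature FOO", "structure Bar = Baz", "functor FooBar"]
    print(filter_to_sml(lines, {"FooBar": "FOO"}))
```

Run on the three filter lines of foo.mlb above, with FOO given as the
argument signature of FooBar, this prints the three declarations shown in
foo-filter.sml.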

-Vesa Karvonen