[MLton] straw-man packaging proposal

Wesley W. Terpstra wesley@terpstra.ca
Wed, 24 Aug 2005 14:10:18 +0200


On Aug 24, 2005, at 2:09 AM, Stephen Weeks wrote:
> All that is really
> relevant from the point of view of dependencies is the MLB files, so I
> won't mention SML code or supporting files further.  More precisely, a
> package consists of
>
>   * a unique id
>   * a name
>   * exports
>   * imports

Let's look at existing practice.

When a C library is being built on a new system, many things need to
be dealt with:

1. Parts of the program need to be turned on/off
- example: my port/rewrite of state-threads under SML uses either
kqueue or epoll depending on the host system

2. External libraries need to be located
- this can be a requirement or optional feature
- these libraries need to be installed somehow

3. External tools need to be run over the source
- mlyacc / mlnlffigen / pickle (my auto-serializer) / mldoc / ...
- these tools may need to be built first, and there is a web
   of dependencies involved here

4. The user needs to be able to specify preferences like
- turn on/off this feature
- choose this install location
- ...

There is probably more I haven't thought of.

Whether or not you like autoconf, it is undeniable that it attempts to
solve problems 1, 2, and 4. There was no need for it to fix 3, because
make already does this.
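Point 1 can be sketched in a few lines of shell, which is roughly what a
configure script ends up doing. This is only an illustration: the function
name and the MLB file names (poll-epoll.mlb, poll-kqueue.mlb) are made up
for the example, and the list of kqueue hosts is an assumption rather than
an exhaustive survey.

```shell
# Map a kernel name (as printed by `uname -s`) to the MLB file that holds
# the matching event-notification backend. File names are hypothetical.
select_poller_mlb() {
    case "$1" in
        Linux)
            echo "poll-epoll.mlb" ;;
        FreeBSD|NetBSD|OpenBSD|DragonFly|Darwin)
            echo "poll-kqueue.mlb" ;;
        *)
            # Unknown host: report it and let the caller decide what to do.
            echo "unsupported host: $1" >&2
            return 1 ;;
    esac
}
```

A build script would call it as POLLER=$(select_poller_mlb "$(uname -s)")
and record the result in its configuration.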

I don't see any of these points disappearing for SML programs.
Furthermore, host systems generally have their own method of
dealing with dependencies. Let's look at Debian (one of the best).

apt-get build-dep mlton

Notice that this pulls in not only the required libraries, but also
LaTeX for the documentation. Furthermore, note that it ensures that the
right versions get installed into the (system-specific) right places.
All of this (knowing where to find the right versions, downloading
them with the right protocol, recording meta-information about the
programs for later maintenance, and whatever else gets done) is
something that generally needs to interact with the system as a
whole, not just SML-land.

No package system you propose will be able to do this. It requires
integration with the existing infrastructure of the operating system.

For that reason, I think most of what you propose is inadequate.
Since you still need the host tools, it is also redundant.

>   $(SML_LIB)/basis/
>   $(SML_LIB)/smlnj-lib/
>   $(SML_LIB)/ckit-lib/

I objected to this earlier, and I will do so again now.

The above syntax requires all the libraries to be installed in the
same location. This is not practical. It is the normal situation that
some libraries are installed in a system directory. Yet when building
a package as a user, there is invariably one package missing.
You now have two choices: get the administrator to install the
dependent package, or put it in your own libdir and symlink all
the existing system-level packages into your libdir.

Windows does not have symlinks.

I preferred Vesa's 'variables in MLB files' approach because it
makes it reasonable to have one variable per unit of distribution.
I.e., if the SML_BASIS is shipped as one package, then it should
have one variable. The various features are represented by the
MLB file that you select after it:

$(SML_BASIS)/basis.mlb
$(SML_BASIS)/basis-2002.mlb
$(CKIT)/ckit.mlb

Keeping all of these variables in a config.mlb at the top-level also
would make it easier to interact with the host packaging system.
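With one variable per package, each variable can point at a different
prefix, so a package in a user's home directory and a package in a system
directory can coexist without symlinks. A minimal sketch of how a configure
step might locate each package follows; the function name and the idea of a
colon-separated search path (modelled on PKG_CONFIG_PATH) are assumptions
made for the example, not something MLton provides.

```shell
# Print the first directory in a colon-separated search path that contains
# a subdirectory named after the package; fail if none does.
# Usage: find_sml_pkg PACKAGE SEARCH_PATH
find_sml_pkg() {
    pkg=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:                       # split the search path on colons
    for dir in $search_path; do
        if [ -d "$dir/$pkg" ]; then
            IFS=$old_ifs
            echo "$dir/$pkg"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For instance, CKIT=$(find_sml_pkg ckit "$HOME/sml:/usr/lib/sml") would
prefer a private copy of ckit and fall back to the system one.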

> For example, an assertion might say
>
>   export <a/b/c.mlb>
>   of package <UID012404932132>
>   has meaning <MY_MEANING>
>   provided that
>      import <$(SML_LIB)/basis/basis.mlb> has meaning <M_BASIS_20050901>
>      import <$(SML_LIB)/smlnj-lib/smlnj-lib.sml> has meaning <M_NJ_100>

Existing approaches for versioning seem to be adequate and match the
real development life-cycle of software. A more general system has the
disadvantage of presenting yet another new thing to learn when using
MLton. Here, I think the cost outweighs the benefit. Besides, with such
a system you still cannot depend on versions of external tools run via
the equivalent of make.
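To illustrate what the existing approach boils down to: a version
constraint is an ordered comparison of dotted version strings. Real tools
already do this (for example dpkg --compare-versions, or pkg-config
--atleast-version). The sketch below only mimics the idea; the function
names are made up, and it assumes a sort that supports -V (GNU coreutils
does).

```shell
# Succeed if version $1 is less than or equal to version $2.
# Relies on `sort -V` (version sort), which is an assumption about the host.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Succeed if LOW <= VERSION <= HIGH.
# Usage: version_in_range VERSION LOW HIGH
version_in_range() {
    version_le "$2" "$1" && version_le "$1" "$3"
}
```

A build script could then write: version_in_range "$found" 1.5 1.6 || exit 1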

The experience within Debian suggests that what is MOST important is
that everything work the same way, not necessarily the best way. This
is part of the reason behind the push for autoconf everywhere.

I think MLB files filled an important gap that 'make' could not: symbol
binding and ordering of source files. This makes sense. SML-specific
packaging/building does not.

> So, MLB path variables continue to look like they do now.  But they no
> longer refer to paths in the filesystem.  Instead, they serve as
> "roots" of a hierarchy of package names.  Absolute references to MLB
> files (i.e. references rooted in some MLB path variable) no longer
> refer directly to MLB files in the file system -- the actual MLB files
> they refer to are only determined by an assertion context during
> elaboration.

This is interesting, but what problem does it solve?

If you need to pick the right version out of multiple versions, why
not use the same mechanism that picks the right version of 'make'
for you?

I think the packaging problem would best be solved by proposing a
system of best-practices and adhering to them:

1. all SML config variables are listed in config.mlb
   - eg: paths to the libraries to use
   - name of the mlb file to use for system-specific code (epoll/kqueue)
   - optional features are controlled by pointing to
     feature-enabled/disabled.mlb

2. the package keeps a version number in some typical symbol like
   VERSION.
   - packages can test via autoconf for the right value in VERSION
   - better if pkg-config does this, but still a VERSION symbol is handy
   - maybe the MLB could have an 'if' statement

3. exported APIs work as you described by keeping certain MLB files
   in the top-level directory that people can refer to.

4. every package provides a pkg-config file. This solves the problem
   of a dependent mlb file which needs to link some C libraries/objects.
   It also makes autoconf integration simple:
       PKG_CHECK_MODULES(SXML, [sxml >= 1.5 sxml <= 1.6])
       PKG_CHECK_MODULES(STML, [stml >= 0.1])
       AC_ARG_ENABLE(gui,
                     AC_HELP_STRING([--disable-gui],
                                    [disable the gui in package]),,
                     [enable_gui=yes])
       if test "x$enable_gui" = "xyes"; then
         GUI=gui-yes.mlb
       else
         GUI=gui-no.mlb
       fi
       AC_SUBST(GUI)

Note that config.mlb.in would contain:
var SXML = @SXML@
var STML = @STML@
var GUI = @GUI@
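
For readers who have not looked inside a configure script: the @NAME@
placeholders are filled in by plain text substitution when config.mlb is
generated from config.mlb.in. The sketch below imitates that step with sed.
It is not what autoconf actually emits, the function name is made up, and it
assumes the values contain no '|', '&', or backslash characters.

```shell
# Read a template on stdin and replace each @NAME@ placeholder with the
# value given as a NAME=value argument; write the result to stdout.
subst_vars() {
    script=
    for pair in "$@"; do
        name=${pair%%=*}        # text before the first '='
        value=${pair#*=}        # text after the first '='
        script="${script}s|@${name}@|${value}|g;"
    done
    sed -e "$script"
}
```

Usage would look like:
subst_vars SXML=/usr/lib/sml/sxml GUI=gui-yes.mlb < config.mlb.in > config.mlb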

Note also that with a trivial autoconf macro, the AC_ARG_ENABLE stuff
up to AC_SUBST could be reduced to:
AC_SML_ENABLE(gui, [--disable-gui], [disable the gui in package], no)

This might not be as pretty or as general, but existing practice has
shown that autoconf gets the job done. More importantly, since autoconf
appeared, most programs build the same way, regardless of programming
language. For Debian, autoconf/automake projects have greatly simplified
packaging simply because it is a common build-system in use in many
places.

I think making our own build/package-system is going to make SML
libraries and programs less attractive, not more.

Furthermore, versioning and dependencies need system support.
Trying to pull this information away from the expert tools that manage
software on the host system is probably going to cause trouble.