[MLton] SQLite3 MLton binding

Wesley W. Terpstra terpstra at gkec.tu-darmstadt.de
Tue Feb 6 15:35:19 PST 2007


The reason I've been mucking around with Fold and friends is for an  
SQL interface for MLton. You can find it here:
	svn+ssh://mlton.org/svnroot/mltonlib/trunk/ca/terpstra/sqlite3

In case you're not familiar with it, SQLite3 is a serverless SQL  
engine. It weighs about 170kB (can be statically linked) and stores  
all of the database into a single file. It's relatively fast, and (I  
think) it's ideal for end-user applications. You don't need to  
configure it, it doesn't need a server, and it doesn't have external  
dependencies. Find out more here:
	http://www.sqlite.org/

At any rate, I've set out to bind it to SML. At the moment, it works  
like this:

local
   open SQL.Template
in
   (* Templates that can be instantiated to create a query *)
   val T1: SQL.db -> string -> int -> (string -> int -> real -> XX,  
XX) SQL.query
             = query "select x, y, z from mytable where x="iS" and  
y="iI";" oS oI oR $
end

val db = SQL.openDB "mydatabase"

val Q1 = T1 db "hiho" 6

fun f s i r = XX
val x = SQL.map f Q1
fun g s i r = XX
val y = SQL.map f Q1
val () = SQL.close Q1

val () = SQL.closeDB db

As you can see from the example code, in the program prelude you list  
out all the templates for queries you might want to execute. Later,  
you pass them a database and the arguments required, to get a query.  
You can then iterate over the query (and cancel it if you found  
enough data) and map the query to create a vector.

When creating a template, you put i{B,R,I,Z,S,X} in the string to  
accept a Word8Vector.vector, real, int, Int64.int, string, or  
"storage" in the function created. These are then passed to SQLite3  
via the C interface methods. "storage" is simply a "type-less" entry  
I expose as a datatype. These correspond to blobs, doubles, ints,  
int64, and text in SQLite3 terminology.

At the end of a template, you put the output column types. o 
{B,R,I,Z,S,X} as before. These are converted by SQLite3 before being  
imported into the GC'd memory of MLton. If you don't know the number  
of columns to expect---for example you executed a user's query---then  
use oA{B,R,I,Z,S,X} which will bind a parameter vector of the  
appropriate type. eg:
   val T1 : SQL.db -> (string vector -> XX, XX) SQL.query
               = query user_query_string oAS $

Where I'd like to go from here:

I don't like the XX return type. If Fold and friends could generate  
polymorphic methods, that would be great. As it is, you have to  
specify the XX in the type binding. This makes the 'map' function  
something of a lie. I've been considering Stephan's Fold01N to create  
a tuple as a return value instead of calling a function. Then it  
would be possible to support map properly (and also app). It would  
also make iteration more convenient. However,

1. SQLite3 is very fast, and I'd rather not make it slower. If I use  
the (x & y & z) idiom, how much will it cost me in speed? The old  
Printf page mentioned that MLton optimizes all of this trickery away;  
which intermediate language do I look at to confirm this?

2. The type signature for (x & y & z & ...) quickly gets hideous. I'd  
rather spare users from seeing this.

The other thing I can't decide is whether or not it makes sense to  
reuse a query. The original SQLite3 interface allows you to 'reset' a  
query after it completes, so that you can use it again. This is what  
I currently expose in the SML binding. However, I think that an  
'autoclose' behaviour might be more convenient. According to the  
manual, reusing queries can be much faster than reparsing the SQL.




More information about the MLton mailing list