How PG'OCaml works $Id: HOW_IT_WORKS.txt,v 1.1 2007-05-24 12:59:55 rich Exp $ Enough people have asked me how PG'OCaml works that I am compiling this document to explain what is going on. 1. Background ---------------------------------------------------------------------- 1.1 PREPARE A common operation on databases is 'PREPARE statement'. What this does is to take a fixed statement and optimise it. The idea is that if you execute the same statement lots of times, instead of having the database optimise it each time (a process which can be time-consuming), you PREPARE [optimise] it once and then EXECUTE the already optimised statement. For example: PREPARE SELECT [some very complicated set of joins which take a long time to optimise] EXECUTE EXECUTE EXECUTE Most databases extend this notion with placeholders in the original statement, so: PREPARE SELECT name FROM users where id = $1 EXECUTE ($1 = 10) EXECUTE ($1 = 20) EXECUTE ($1 = 23) 1.2 DESCRIBE Recent versions of PostgreSQL added a 'DESCRIBE statement' command to the database backend. It isn't normally exposed through clients like psql or libPQ (well - that may have changed by the time you read this), but it's there if you code directly to the database frontend/backend wire protocol. 'DESCRIBE' is just an extension of 'PREPARE'. Because the optimiser has decoded the statement into some internal format, it knows already the types of the placeholders and the types of the return columns, and it can supply this information back to the caller. For example: PREPARE SELECT id, name FROM users WHERE salary > $1 DESCRIBE ==> placeholder $1 has type DECIMAL ==> 2 columns returned with types SERIAL, VARCHAR(80) 2. In OCaml ---------------------------------------------------------------------- On the OCaml side we like to know the type of everything, and using DESCRIBE we can extend type inference right the way through to the database. Consider some code like: let salary = 15000.00 in let rows = PGSQL(dbh) "SELECT id, name FROM users WHERE salary > $salary" in List.iter ( fun (id, name) -> printf "id = %d, name = %s\n" id name ) rows How do we know that salary (type: float) when passed to the database has the same type that the database expects? How do we know that the id and name fields have the same type as what the database returns? During compilation we can use DESCRIBE to convert this code into: let salary = 15000.00 in let rows = (*vv-- generated by macro --vv*) do_PREPARE "SELECT id, name FROM users WHERE salary > $1"; let placeholder1 = string_of_decimal salary (* placeholder $1 *) in let rows = do_EXECUTE placeholder1 in List.map ( fun (col1, col2) -> (* returned columns id, name *) (serial_of_string col1, string_of_string col2) ) rows (*^^-- generated by macro --^^*) List.iter ( fun (id, name) -> printf "id = %d, name = %s\n" id name ) rows Notes: (1) The real code generated by the macro is a lot more complicated. (2) The database actually takes and returns strings. (3) In the real macro, prepared statements are cached so they don't need to be reoptimised each time. 3. Further reading ---------------------------------------------------------------------- Now go and read: BUGS.txt, pGOCaml.mli, pa_pgsql.ml4