Skip to main content

Extract scripts from a reST document and apply them in order.

Project description

Building and maintaining the schema of a database is always a challenge. It may quickly become a nightmare when dealing with even moderately complex databases, in a distribuited development environment. You have new features going in, and fixes here and there, that keeps accumulating in the development branch. You also have several already deployed instances of the database you wanna upgrade now and then.

In my experience, it’s very difficult to impossible to come up with a completely automated solution, for several reasons:

  • comparison between different releases of a database schema is tricky
  • actual contents of the database must be preserved
  • some changes require specific recipes to upgrade the data
  • any automated solution hide some detail, by definition: I need complete control, to be able to create temporary tables and/or procedures for example

I tried, and wrote myself, several different approaches to the problem, and this package is my latest and most satisfying effort: it builds on top of docutils and Sphinx, with the side advantage that you get a quite nice and good documentation of the whole architecture: literate database scheming!

How it works

The package contains two distinct pieces: a Sphinx extension and the patchdb command line tool.

The extension implements a new ReST directive able to embed a script in the document: when processed by the sphinx-build tool, all the scripts will be collected in an external file, configurable.

The patchdb tool takes that script collection and determines which scripts need to be applied to some database, and the right order.

It keeps and maintains a single very simple table within the database, where it records the last version of each script it successfully execute, so that it won’t reexecute the same script (actually, a particular revision of it) twice.

So, on the development side you simply write (and document!) each piece, and when it comes the time of deploying current state you distribute just the script collection (a single file, usually in YAML or JSON format, or a pickle archive) to the end points where the database instances live, and execute patchdb against each one.


The basic building block is a script, an arbitrary sequence of statements written in some language (currently, either Python, SQL or Shell), augmented with some metadata such as the scriptid, possibly a longer description, its revision and so on.

As a complete example of the syntax, consider the following:

.. patchdb:script:: My first script
   :description: Full example of a script
   :revision: 2
   :depends: Other script@4
   :preceeds: Yet another
   :language: python
   :conditions: python_2_x

   print "Yeah!"

This will introduce a script globally identified by My first script, written in Python: this is its second release, and its execution must be constrained such that it happens after the execution of the fourth revision of Other script and before Yet another.

The example shows also an usage of the conditions, allowing more than one variant of a script like:

.. patchdb:script:: My first script (py3)
   :description: Full example of a script
   :revision: 2
   :depends: Other script@4
   :preceeds: Yet another
   :language: python
   :conditions: python_3_x


The dependencies may be a comma separated list of script ids, such as:

.. patchdb:script:: Create master table


.. patchdb:script:: Create target table


.. patchdb:script:: Add foreign key to some_table
   :depends: Create master table, Create target table

   ALTER TABLE some_table
         ADD CONSTRAINT fk_master_target
             FOREIGN KEY (tt_id) REFERENCES target_table (id)

Independently from the order these scripts appear in the documentation, the third script will execute only after the first two get successfully applied to the database. As you can notice, most of the options are optional: by default, :language: is sql, :revision: is 1, the :description: is taken from the title (that is, the script ID), while :depends: and :preceeds: are empty.

Just for illustration purposes, the same effect could be achieved with:

.. patchdb:script:: Create master table
   :preceeds: Add foreign key to some_table


.. patchdb:script:: Create target table


.. patchdb:script:: Add foreign key to some_table
   :depends: Create target table

   ALTER TABLE some_table
         ADD CONSTRAINT fk_master_target
             FOREIGN KEY (tt_id) REFERENCES target_table (id)


A patch is a particular flavour of script, one that specify a brings dependency list. Imagine that the example above was the first version of the database, and that the current version looks like the following:

.. patchdb:script:: Create master table
   :revision: 2

   CREATE TABLE some_table (
     description VARCHAR(80),
     tt_id INTEGER

that is, some_table now contains one more field, description.

We need an upgrade path from the first revision of the table to the second:

.. patchdb:script:: Add a description to the master table
   :depends: Create master table@1
   :brings: Create master table@2

   ALTER TABLE some_table ADD COLUMN description VARCHAR(80)

When patchdb examines the database status, it will execute one or the other. If the script Create master table isn’t executed yet (for example when operating on a new database), it will take the former script (the one that creates the table from scratch). Otherwise, if the database “contains” revision 1 (and not higher than 1) of the script, it will execute the latter, bumping up the revision number.

Run-always scripts

Yet another variant of scripts, which gets applied always, every time patchdb is executed. This kind may be used to perform arbitrary operations, either at the start or at the end of the patchdb session:

.. patchdb:script:: Say hello
   :language: python
   :always: first


.. patchdb:script:: Say goodbye
   :language: python
   :always: last



Collecting patches

To use it, first of all you must register the extension within the Sphinx environment, adding the full name of the package to the extensions list in the file, for example:

# Add any Sphinx extension module names here, as strings.
extensions = ['metapensiero.sphinx.patchdb']

The other required bit of customization is the location of the on disk scripts storage, i.e. the path of the file that will contain the information about every found script: this is kept separated from the documentation itself because you will probably deploy on production servers just to update their database. If the filename ends with .json it will contain a JSON formatted array, if it ends with .yaml the information will be dumped in YAML, otherwise it will be a Python pickle. I usually prefer JSON or YAML, because those formats are more VCs friendly and open to human inspection.

The location may be set in the same as above, like:

# Location of the external storage
patchdb_storage = '…/dbname.json'

Otherwise, you can set it using the -D option of the sphinx-build command, so that you can easily share its definition with other rules in a Makefile. I usually put the following snippet at the beginning of the Makefile created by sphinx-quickstart:

TOPDIR ?= ..
STORAGE ?= $(TOPDIR)/database.json

SPHINXOPTS = -D patchdb_storage=$(STORAGE)

At this point, executing the usual make html will update the scripts archive: that file contains everything is needed to update the database either local or remote; in other words, running Sphinx (or even having it installed) is not required to update a database.

Updating the database

The other side of the coin is managed by the patchdb tool, that digests the scripts archive and is able to determine which of the scripts are not already applied and eventually does that, in the right order.

When your database does already exist and you are just starting using patchdb you may need to force the initial state with the following command:

patchdb --assume-already-applied --postgres "dbname=test" --patch-storage database.json

that will just update the patchdb table registering current revision of all the missing scripts, without executing them.

You can inspect what will be done, that is obtain the list of not already applied patches, with a command like:

patchdb --dry-run --postgres "dbname=test" -s database.json

The database.json archive can be sent to the production machines (in some cases I put it in a production branch of the repository and use the version control tool to update the remote machines, in other I simply used scp or rsync based solutions).

Example development Makefile snippet

The following is a snippet that I usually put in my outer Makefile:

export TOPDIR := $(CURDIR)
DBHOST := localhost
DBPORT := 5432
DBNAME := dbname
DROPDB := dropdb --host=$(DBHOST) --port=$(DBPORT)
CREATEDB := createdb --host=$(DBHOST) --port=$(DBPORT) --encoding=UTF8
DSN := host=$(DBHOST) port=$(DBPORT) dbname=$(DBNAME)
PUP := $(PATCHDB) --patch-storage=$(STORAGE) \
                  --postgres="$(DSN)" --log-file=$(DBNAME).log

# Build the Sphinx documentation
        $(MAKE) -C doc STORAGE=$(STORAGE) html

$(STORAGE): doc

# Show what is missing
missing-patches: $(STORAGE)
        $(PUP) --dry-run

# Upgrade the database to the latest revision
database: $(STORAGE)

# Remove current database and start from scratch
        -$(DROPDB) $(DBNAME)
        $(CREATEDB) $(DBNAME)
        $(MAKE) database


1.2.0 (2014-06-19)

  • New “run-always” scripts
  • Poor man “CREATE DOMAIN” for MySQL
  • User defined assertions

1.1.2 (2014-06-05)

  • New –assume-already-applied option, useful when you start using patchdb on an already existing database

1.1.1 (2014-06-03)

  • Fix packaging, adding a

1.1.0 (2014-06-03)

  • Use setuptools instead of distribute
  • Use argparse instead of optparse
  • New mimetype property on scripts, to select the right Pygments highlighter
  • New MySQL specific context, using cymysql

1.0.7 (2013-08-23)

  • published on bitbucket

1.0.6 (2013-03-12)

  • dbloady: ability to load field values from external files

1.0.5 (2013-03-11)

  • dbloady: fix encoding error when printing messages coming from PostgresQL
  • dbloady: emit a progress bar on stderr

1.0.4 (2013-02-27)

  • dbloady, a new utility script, to load base data from a YAML stream.

1.0.3 (2012-11-07)

  • Fix :patchdb:script role

1.0.2 (2012-10-19)

  • Pickier way to split the multi-statements SQL scripts, now the ;; separator must be on a line by its own
  • More precise line number tracking when applying multi-statements SQL scripts
  • Dump and load script dependencies and conditions as lists, to avoid pointless repeated splits and joins

1.0.1 (2012-10-13)

  • Fix error loading JSON storage, simplejson already yields unicode strings
  • Possibly use the original title of the script as description, if not explicitly set
  • More precise error on unknown script reference
  • Minor corrections

1.0 (2012-10-10)

  • Added JSON support for the on disk scripts storage

  • Adapted to work with SQLAlchemy 0.7.x

  • Updated to work with docutils > 0.8

  • Refactored as a Sphinx domain


    This means that the directive names are now prefixed with patchdb: (that is, the old script directive is now patchdb:script). You can use the default-domain directive if that annoys you.

  • Renamed the status table from prst_applied_info to simply patchdb


    This is the main incompatible change with previous version: you should eventually rename the table manually, sorry for the inconvenience.

  • Renamed prst_patch_storage configuration setting to patchdb_storage

  • Each script ID is now lower case, to avoid ambiguities

0.3 (2010-11-14)

  • Updated to work with Sphinx 1.0
  • New :script: role for cross-references
  • New :file: option on script directive, to keep the actual text in an external file

0.2 (2010-03-03)

  • Compatibility with SQLAlchemy 0.6
  • New patchdb command line tool

0.1 (2009-10-28)

  • Replace home brew solution with SQLAlchemy topological sort
  • Use YAML for the persistent storage
  • Mostly working Sphinx adaptor
  • Rudimentary and mostly untested SQLAlchemy backend (basically only the direct PostgresQL backend has been battle tested in production…)
  • First standalone version


  • still a PylGAM side-product
  • simply a set of docutils directives
  • started with Firebird in mind, but grown up with PostgresQL

Project details

Release history Release notifications

History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


History Node


This version
History Node


History Node


History Node


History Node


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
metapensiero.sphinx.patchdb-1.2.0.tar.gz (30.9 kB) Copy SHA256 hash SHA256 Source None Jun 18, 2014

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging CloudAMQP CloudAMQP RabbitMQ AWS AWS Cloud computing Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page