Extract scripts from a reST document and apply them in order.
Project description
- Version:
- 1.1.0
- license:
GPLv3
Building and maintaining the schema of a database is always a challenge. It may quickly become a nightmare when dealing with even moderately complex databases, in a distribuited development environment. You have new features going in, and fixes here and there, that keeps accumulating in the development branch. You also have several already deployed instances of the database you wanna upgrade now and then.
In my experience, it’s very difficult to impossible to come up with a completely automated solution, for several reasons:
comparison between different releases of a database schema is tricky
actual contents of the database must be preserved
some changes require specific recipes to upgrade the data
any automated solution hide some detail, by definition: I need complete control, to be able to create temporary tables and/or procedures for example
I tried, and wrote myself, several different approaches to the problem, and this package is my latest and most satisfying effort: it builds on top of docutils and Sphinx, with the side advantage that you get a quite nice and good documentation of the whole architecture: literate database scheming!
How it works
The package contains two distinct pieces: a Sphinx extension and the patchdb command line tool.
The extension implements a new ReST directive able to embed a script in the document: when processed by the sphinx-build tool, all the scripts will be collected in an external file, configurable.
The patchdb tool takes that script collection and determines which scripts need to be applied to some database, and the right order.
It keeps and maintains a single very simple table within the database, where it records the last version of each script it successfully execute, so that it won’t reexecute the same script (actually, a particular revision of it) twice.
So, on the development side you simply write (and document!) each piece, and when it comes the time of deploying current state you distribute just the script collection (a single file, usually in AXON, JSON or YAML format, or a pickle archive) to the end points where the database instances live, and execute patchdb against each one.
Scripts
The basic building block is a script, an arbitrary sequence of statements written in some language (currently, either Python, SQL or Shell), augmented with some metadata such as the scriptid, possibly a longer description, its revision and so on.
As a complete example of the syntax, consider the following:
.. patchdb:script:: My first script :description: Full example of a script :revision: 2 :depends: Other script@4 :preceeds: Yet another :language: python :conditions: python_2_x print "Yeah!"
This will introduce a script globally identified by My first script, written in Python: this is its second release, and its execution must be constrained such that it happens after the execution of the fourth revision of Other script and before Yet another.
The example shows also an usage of the conditions, allowing more than one variant of a script like:
.. patchdb:script:: My first script (py3) :description: Full example of a script :revision: 2 :depends: Other script@4 :preceeds: Yet another :language: python :conditions: python_3_x print("Yeah!")
The dependencies may be a comma separated list of script ids, such as:
.. patchdb:script:: Create master table CREATE TABLE some_table (id INTEGER PRIMARY KEY, tt_id INTEGER) .. patchdb:script:: Create target table CREATE TABLE target_table (id INTEGER PRIMARY KEY) .. patchdb:script:: Add foreign key to some_table :depends: Create master table, Create target table ALTER TABLE some_table ADD CONSTRAINT fk_master_target FOREIGN KEY (tt_id) REFERENCES target_table (id)
Independently from the order these scripts appear in the documentation, the third script will execute only after the first two get successfully applied to the database. As you can notice, most of the options are optional: by default, :language: is sql, :revision: is 1, the :description: is taken from the title (that is, the script ID), while :depends: and :preceeds: are empty.
Just for illustration purposes, the same effect could be achieved with:
.. patchdb:script:: Create master table :preceeds: Add foreign key to some_table CREATE TABLE some_table (id INTEGER PRIMARY KEY, tt_id INTEGER) .. patchdb:script:: Create target table CREATE TABLE target_table (id INTEGER PRIMARY KEY) .. patchdb:script:: Add foreign key to some_table :depends: Create target table ALTER TABLE some_table ADD CONSTRAINT fk_master_target FOREIGN KEY (tt_id) REFERENCES target_table (id)
Patches
A patch is a particular flavour of script, one that specify a brings dependency list. Imagine that the example above was the first version of the database, and that the current version looks like the following:
.. patchdb:script:: Create master table :revision: 2 CREATE TABLE some_table ( id INTEGER PRIMARY KEY, description VARCHAR(80), tt_id INTEGER )
that is, some_table now contains one more field, description.
We need an upgrade path from the first revision of the table to the second:
.. patchdb:script:: Add a description to the master table :depends: Create master table@1 :brings: Create master table@2 ALTER TABLE some_table ADD COLUMN description VARCHAR(80)
When patchdb examines the database status, it will execute one or the other. If the script Create master table isn’t executed yet (for example when operating on a new database), it will take the former script (the one that creates the table from scratch). Otherwise, if the database “contains” revision 1 (and not higher than 1) of the script, it will execute the latter, bumping up the revision number.
Run-always scripts
Yet another variant of scripts, which gets applied always, every time patchdb is executed. This kind may be used to perform arbitrary operations, either at the start or at the end of the patchdb session:
.. patchdb:script:: Say hello :language: python :always: first print("Hello!") .. patchdb:script:: Say goodbye :language: python :always: last print("Goodbye!")
Usage
Collecting patches
To use it, first of all you must register the extension within the Sphinx environment, adding the full name of the package to the extensions list in the file conf.py, for example:
# Add any Sphinx extension module names here, as strings. extensions = ['metapensiero.sphinx.patchdb']
If you want to take advantage of the augmented DataDocumenter, add also metapensiero.sphinx.patchdb.autodoc_sa to that list.
The other required bit of customization is the location of the on disk scripts storage, i.e. the path of the file that will contain the information about every found script: this is kept separated from the documentation itself because you will probably deploy it on production servers just to update their database.
The location may be set in the same conf.py as above, like:
# Location of the external storage patchdb_storage = '…/dbname.json'
Otherwise, you can set it using the -D option of the sphinx-build command, so that you can easily share its definition with other rules in a Makefile. I usually put the following snippet at the beginning of the Makefile created by sphinx-quickstart:
TOPDIR ?= .. STORAGE ?= $(TOPDIR)/database.json SPHINXOPTS = -D patchdb_storage=$(STORAGE)
At this point, executing the usual make html will update the scripts archive: that file contains everything is needed to update the database either local or remote; in other words, running Sphinx (or even having it installed) is not required to update a database.
Updating the database
The other side of the coin is managed by the patchdb tool, that digests the scripts archive and is able to determine which of the scripts are not already applied and eventually does that, in the right order.
When your database does already exist and you are just starting using patchdb you may need to force the initial state with the following command:
patchdb --assume-already-applied --postgres "dbname=test" --patch-storage database.json
that will just update the patchdb table registering current revision of all the missing scripts, without executing them.
You can inspect what will be done, that is obtain the list of not already applied patches, with a command like:
patchdb --dry-run --postgres "dbname=test" -s database.json
The database.json archive can be sent to the production machines (in some cases I put it in a production branch of the repository and use the version control tool to update the remote machines, in other I simply used scp or rsync based solutions).
Example development Makefile snippet
The following is a snippet that I usually put in my outer Makefile:
export TOPDIR := $(CURDIR) DBHOST := localhost DBPORT := 5432 DBNAME := dbname DROPDB := dropdb --host=$(DBHOST) --port=$(DBPORT) CREATEDB := createdb --host=$(DBHOST) --port=$(DBPORT) --encoding=UTF8 STORAGE := $(TOPDIR)/$(DBNAME).json DSN := host=$(DBHOST) port=$(DBPORT) dbname=$(DBNAME) PUP := $(PATCHDB) --patch-storage=$(STORAGE) \ --postgres="$(DSN)" --log-file=$(DBNAME).log # Build the Sphinx documentation doc: $(MAKE) -C doc STORAGE=$(STORAGE) html $(STORAGE): doc # Show what is missing missing-patches: $(STORAGE) $(PUP) --dry-run # Upgrade the database to the latest revision database: $(STORAGE) $(PUP) # Remove current database and start from scratch scratch-database: -$(DROPDB) $(DBNAME) $(CREATEDB) $(DBNAME) $(MAKE) database
Changes
1.4.2 (2015-10-22)
Allow using keyed values (e.g. PostgreSQL HSTORE) to lookup instances in dbloady
1.4.1 (2015-09-23)
Augmented Sphinx autodoc DataDocumenter able to pretty print SA queries
1.4.0 (2015-08-19)
New experimental dbloady feature, mainly intendended for test fixtures: it is now able to take note about the instances it creates writing a YAML file with the same input format, and delete them from the database in a subsequent run
1.3.11 (2015-08-16)
dbloady now flushes changes after each entity to honor referential integrity checks
1.3.10 (2015-08-15)
Fix problem with the patchdb:script role, when the target gets splitted on two or more lines
1.3.9 (2015-08-08)
Fix problem with different MySQL drivers exceptions internals
1.3.8 (2015-08-08)
Allow longer patch ids, up to 100 characters
1.3.7 (2015-07-20)
Use PyMySQL by default, allow selection of a different driver with a command line option
1.3.6 (2015-07-06)
Do not decode patch id from UTF-8 but let the driver do that if needed
1.3.5 (2015-07-06)
Fix type of MySQL port number, must be an integer
1.3.4 (2015-07-06)
Accept also the port number to reach the MySQL server
1.3.3 (2015-06-24)
Some more tweaks to adapt dbloady to Python 3
1.3.2 (2015-06-23)
Flush the standard error stream to show the progress immediately
Do not encode statements in UTF-8 but let the driver do that if needed
1.3.1 (2015-06-23)
Fix “brown paper bag” syntax error
1.3.0 (2015-06-21)
Use fdb instead of kinterbasdb for Firebird
Support the AXON format for the on disk patch storage
1.2.1 (2014-07-02)
Add script’s “conditions” and “run-always” to the sphinx rendering
dbloady’s load_yaml() now returns a dictionary with loaded instances
1.2.0 (2014-06-19)
New “run-always” scripts
Poor man “CREATE DOMAIN” for MySQL
User defined assertions
1.1.2 (2014-06-05)
New –assume-already-applied option, useful when you start using patchdb on an already existing database
1.1.1 (2014-06-03)
Fix packaging, adding a MANIFEST.in
1.1.0 (2014-06-03)
Use setuptools instead of distribute
Use argparse instead of optparse
New mimetype property on scripts, to select the right Pygments highlighter
New MySQL specific context, using cymysql
1.0.7 (2013-08-23)
published on bitbucket
1.0.6 (2013-03-12)
dbloady: ability to load field values from external files
1.0.5 (2013-03-11)
dbloady: fix encoding error when printing messages coming from PostgresQL
dbloady: emit a progress bar on stderr
1.0.4 (2013-02-27)
dbloady, a new utility script, to load base data from a YAML stream.
1.0.3 (2012-11-07)
Fix :patchdb:script role
1.0.2 (2012-10-19)
Pickier way to split the multi-statements SQL scripts, now the ;; separator must be on a line by its own
More precise line number tracking when applying multi-statements SQL scripts
Dump and load script dependencies and conditions as lists, to avoid pointless repeated splits and joins
1.0.1 (2012-10-13)
Fix error loading JSON storage, simplejson already yields unicode strings
Possibly use the original title of the script as description, if not explicitly set
More precise error on unknown script reference
Minor corrections
1.0 (2012-10-10)
Added JSON support for the on disk scripts storage
Adapted to work with SQLAlchemy 0.7.x
Updated to work with docutils > 0.8
Refactored as a Sphinx domain
Renamed the status table from prst_applied_info to simply patchdb
Renamed prst_patch_storage configuration setting to patchdb_storage
Each script ID is now lower case, to avoid ambiguities
0.3 (2010-11-14)
Updated to work with Sphinx 1.0
New :script: role for cross-references
New :file: option on script directive, to keep the actual text in an external file
0.2 (2010-03-03)
Compatibility with SQLAlchemy 0.6
New patchdb command line tool
0.1 (2009-10-28)
Replace home brew solution with SQLAlchemy topological sort
Use YAML for the persistent storage
Mostly working Sphinx adaptor
Rudimentary and mostly untested SQLAlchemy backend (basically only the direct PostgresQL backend has been battle tested in production…)
First standalone version
0.0
still a PylGAM side-product
simply a set of docutils directives
started with Firebird in mind, but grown up with PostgresQL
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Hashes for metapensiero.sphinx.patchdb-1.4.2.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | 77982ebc80381a71349949b1930a9a5047fbbf29a0118ab1eddf7f81346d0512 |
|
MD5 | 08698336506552c6cf72874e6a0705f1 |
|
BLAKE2b-256 | 5ff64f85cb48e08f92df564130a8f30522b01e4ade9b803103dd883dd52c1021 |