Declarative, automatic db migrations using SQL & git as the source of truth

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

automigrate

Tool to diff SQL schemas in git and apply the migrations.

Use this if you don't like to manage migrations separately from your declarative schema definitions.

Warning this is beta software
Features
Installation & basic use
What's happening under the covers
What does & doesn't work
Comparison vs other tools
Using with ORMs
Integrating with CI

Beta software

This is beta software and you should be careful with its output.

Eyeball the migrations before applying them
Proactively raise github issues for things that seem broken

Features

Operates on *.sql files (i.e. files with create table and create index statements)
Operates on git -- meaning that it tracks the git version of applied migration and can create a SQL migration given two git refs
Outputs migrations as SQL
No need to write or store migrations for simple cases -- they're defined bidirectionally in terms of your git history
Ability to specify migrations manually when needed
Stores the history of applied migrations in sql in automigrate_meta table

Philosophy

Defining your schema in your ORM is nuts because it ties you to one language, reduces clarity, and sometimes limits SQL features you can use
Existing migration tools don't pull their weight
SQL is a more general skill than ORMs and other tools should therefore mirror SQL
Mirroring live databases to get a schema is insane because are you tunneling to prod to run your linter? Live DB shouldn't be available to developers. Source of truth should be git.
Schema should be versioned using the same git shas as code so the logic is easy to detect if a deploy requires a migration

Installation & basic use

To install, pip install automig (or use pip3 if that fails). This should install dependencies and register the automig command.

You can fall back to git+https://github.com/abe-winter/automigrate if you want latest master.

Typical invocations:

# create an initial migration (also create meta tables)
automig 218dd2c 'test/schema/*.sql' --initial | psql -h 172.17.0.2 -U postgres --single-transaction

# migrate a database
LAST_SHA=$(psql -h 172.17.0.2 -U postgres -t -c "select sha from automigrate_meta order by id desc limit 1")
echo migrating from $LAST_SHA
automig $LAST_SHA...b5b40ce 'test/schema/*.sql' | psql -h 172.17.0.2 -U postgres --single-transaction

# I guess you can just migrate to HEAD if you're feeling lucky
automig $LAST_SHA...HEAD 'test/schema/*.sql' | psql -h 172.17.0.2 -U postgres --single-transaction

What's happening under the covers

Nothing fancy. When you run automig 218dd2c...b5b40ce 'test/schema/*.sql' (these are real SHAs in this git repo and will work if you clone the repo), it outputs:

-- changeset created from Namespace(glob='test/schema/*.sql', initial=False, ref='218dd2c...b5b40ce') at 2019-09-14 22:15:55.421146
-- changes for 9dcbd4e.t1
alter table t1 add column b int;
-- changes for b5b40ce.t1
create index t1a on t1 (a);
-- changes for b5b40ce.t2
create table t2 (a int primary key);
insert into automigrate_meta (sha) values ('9dcbd4e81e9a0dd7629ed7ae82a86891a88f76f3');
insert into automigrate_meta (sha) values ('b5b40ce718ea7241fee8d0a3826f244d21bf413c');

What does & doesn't work

Adding tables, indexes and columns should mostly work
drop column works
modifying columns partially works, supports changes to types, defaults, nullable. Read the diff_column() function for up-to-date information and file bugs for specific holes.
modifying primary keys doesn't work
For diffs that are erroring, you can override with a .manualmig.yml file
I think ... syntax is actually .. -- in cases where this matters (master...branch, or any A...B where A isn't the common ancestor of A & B), automig will be wrong. This needs to be fixed before 0.1
Be careful with using unescaped keywords as names (i.e. a table named table) -- you'll likely confuse the parser even where your sql engine allows it
This hasn't been tested on a wide range of syntax (i.e. arrays / json)
Not sure if capitalized SQL keywords are supported (todo add tests)
Need a way to check live schema against desired to call out problems
undo, i.e. what would be 'down' in a typical migration tool.
- This may work out of the box (pass HEAD...HEAD~1 instead of HEAD~1...HEAD), but needs tests
- up/down sections in .manualmig.yml
documentation for:
- writing schema files
- creating an initial migration
- checklist for running migrations: determining last sha, inspecting migration, running migration (postgres / mysql)
- resolving a rebase
- using manualmig when the tool is confused
automigrate_meta table
- column for automig VERSION string
- column for from_sha
Anything that messes with the git history (like a rebase) is deeply confusing to this tool and will result in bad migrations. Workaround:
- warning: this method only works if the rebase doesn't change migrations
- figure out the new sha that corresponds to your last old sha -- most likely you can do a git show $OLDSHA and then look for that commit msg in git log
- and insert that corresponding sha into the automigrate_meta table: psql -h 172.17.0.2 -U postgres -c "insert into automigrate (sha) values ('$NEWSHA')"
- you should be good to go
- todo: find a way to automatically detect & recover from rebases
- todo: provide an --opaque argument that doesn't try to create granular changes for each commit in the history
I think you have to explicitly name your indexes (don't rely on the auto-generated DB ones)

Comparison vs other tools

prisma lift has a declarative schema which isn't sql but looks kind of like it, and depends on a runtime DB to generate the migration diff, but seems like a neat tool
alembic, as far as I can tell, requires you to generate a skeleton python file for each change then fill it in yourself
sqlite sqldiff.exe can diff schemas but operates on full sqlite databases and I'm not sure if it outputs DDL
liquibase might have a diffing system but from the docs it looks like it's outputting XML. And they advise you not to use it
redgate sql compare seems to support comparing 'create table' schemas across git versions, although it looks like you have to find the SHAs by hand in a GUI
migra seems to require two live DBs to run a diff (one with current schema, one with new)
sqitch seems to require manually specifying deploy sql (and optionally revert & verify)

Using with ORMs

Your ORM has to be willing to import a schema from create table statements. (I don't know any ORM that does this out of the box, although some can reflect a live DB, like sqlalchemy's automap).

This repo contains a barebones, mostly untested harness to generate sqlalchemy models from create table statements. You can run it with:

python -m automig.lib.sa_harness 'test/schema/*.sql'

Happy to accept PRs to generate ORM defs from create table stmts (or vice versa).

Integrating with CI

This is how I would apply migrations in CI if I had a few weeks off to build tooling. If you try this & it works, let me know.

Flag the migration in your code review tool before merging the change to master
- Ideally the CR tool would run automig against master branch and show automig's output so the reviewer can sign off
- One way to do this: use something like the only: merge_requests flag in gitlab to run a master...$CI_COMMIT_SHORT_SHA diff (but warning, ... doesn't have common ancestor semantics yet)
- In lieu of complex CR improvements, you can run automig in your CI testing tool
Detect that a migration needs to be applied when deploying
- This means that your deploy bot needs to have access to your prod DB
- For continuous automatic deployment setups, you may want to block auto deploy for and require explicit signoff for deploys which require a migration
- Alert / page in the appropriate place when a deploy is held because it has a migration
Apply the migration from deploy CI
- To be safe, this requires an exactly-once semantic
- Unlock the automatic deployment when the migration has successfully applied
- Alert / page on error applying
- Some migrations may require downtime for part or all of your cluster -- these migrations probably won't be automatic
- Note that the order of new code & migration depends on the contents of the migration. This staffjoy post points out that with ORMs that select all columns, add column & drop column can both be done with 0 downtime, but drop column migrations must be applied after the new code is out

Development workflow

# enable .envrc with `direnv allow` if necessary, or set up your own virtualenv
# pip install -r requirements.txt
pip install pytest
pytest # in repo root

If you want to pitch in on the project, there are a bunch of @pytest.mark.skip tests that need to be filled in (most require feature development to get them passing).

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.1.8

Feb 27, 2021

0.1.7 yanked

Feb 27, 2021

Reason this release was yanked:

missing semicolon in 'alter type', syntax error

0.1.6

Feb 27, 2021

0.1.5

Nov 13, 2020

0.1.4

Oct 21, 2020

0.1.3

Oct 21, 2020

0.1.2

Jun 6, 2020

0.1.1

May 15, 2020

0.1.0

Mar 5, 2020

0.0.21

Mar 4, 2020

0.0.20

Feb 3, 2020

0.0.19

Jan 29, 2020

0.0.18

Jan 19, 2020

0.0.17

Dec 28, 2019

0.0.16

Dec 24, 2019

0.0.15

Dec 6, 2019

0.0.14

Oct 23, 2019

0.0.13

Oct 23, 2019

0.0.12

Oct 10, 2019

0.0.11

Sep 27, 2019

This version

0.0.10

Sep 26, 2019

0.0.9

Sep 17, 2019

0.0.8

Sep 15, 2019

0.0.7

Sep 14, 2019

0.0.6

Sep 14, 2019

0.0.5

Sep 12, 2019

0.0.4

Sep 9, 2019

0.0.3

Sep 9, 2019

0.0.1

Sep 4, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

automig-0.0.10.tar.gz (15.3 kB view hashes)

Uploaded Sep 26, 2019 Source

Hashes for automig-0.0.10.tar.gz

Hashes for automig-0.0.10.tar.gz
Algorithm	Hash digest
SHA256	`b5f7b79cc8a3d2d7a642e09da6987bc8b3a8c0629102594676677de6cd2737b1`
MD5	`ecb2042327ae05d3ff3bffdd0dbe4b17`
BLAKE2b-256	`b099ece4e87148228cbc539ae5e61874281d34ee59495f78b93b31c5bdde5d9c`