Django migrations without long locks

Zero-Downtime-Migrations

Description

Zero-Downtime-Migrations (ZDM) is an application that lets you avoid long locks (and rewriting the whole table) when applying Django migrations with PostgreSQL as the database.

Current possibilities

  • add a field with a default value (nullable or not)

  • create an index concurrently (and check the index status after creation in case it was created with INVALID status)

  • add a unique constraint to an existing field by creating a unique index concurrently and then creating the constraint using this index

Why use it

We faced the following problem: some Django migrations (such as adding a column with a default value) lock the table for reads and writes, so our services cannot work properly while they run. This can be acceptable on fairly small tables (fewer than a million rows), but even there it can be painful if the service is under high load. We have many tables with more than 50 million rows, and applying a migration to such a table locks it for more than an hour, which is totally unacceptable. In addition, during these time-consuming operations the migration quite often fails with various errors (such as TimeoutError), and we have to start it from scratch or run the SQL manually through psql and then fake the migration.

So we decided to write this package: it prevents long table locks and provides a more stable migration process that can be continued if an operation fails for some reason.

Installation

To install ZDM, simply run:

pip install zero-downtime-migrations

Usage

If you are currently using the default PostgreSQL backend, change it to:

DATABASES = {
     'default': {
         'ENGINE': 'zero_downtime_migrations.backend',
         ...
     }
     ...
 }

If you are using your own custom backend, you can:

  • Set SchemaEditorClass if you are currently using the default schema editor:

from zero_downtime_migrations.backend.schema import DatabaseSchemaEditor

# BaseWrapper stands for the DatabaseWrapper class your custom backend already extends
class DatabaseWrapper(BaseWrapper):
    SchemaEditorClass = DatabaseSchemaEditor

  • Add ZeroDownTimeMixin to the base classes of your DatabaseSchemaEditor if you are using a custom one:

from zero_downtime_migrations.backend.schema import ZeroDownTimeMixin

class YourCustomSchemaEditor(ZeroDownTimeMixin, ...):
    ...

Note about indexes

The library always forces CONCURRENTLY index creation and then checks the index status. If the index was created with INVALID status, it is dropped and an error is raised. In that case, fix the underlying problem if needed and restart the migration. For example, if creating a unique index failed, make sure the column being indexed contains only unique values. Usually an index ends up INVALID because of a deadlock, so you only need to restart the migration.
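The check described above can be sketched in a few lines of Python. This is a minimal illustration, not the package's actual code: ensure_index_valid and InvalidIndexError are hypothetical names, and cursor is assumed to be any DB-API cursor connected to PostgreSQL. The only PostgreSQL facts it relies on are the pg_index and pg_class system catalogs and the indisvalid flag.

```python
# Look up the validity flag of an index by name in the PostgreSQL catalogs.
# pg_index.indisvalid is false for an index left behind by a failed
# CREATE INDEX CONCURRENTLY. The lookup is by bare name (no schema), which
# is a simplification made for this sketch.
INDEX_IS_VALID_SQL = """
SELECT i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = %s
"""


class InvalidIndexError(Exception):
    """Hypothetical error type used only in this sketch."""


def ensure_index_valid(cursor, index_name):
    """Drop the index and raise if PostgreSQL marked it INVALID.

    `index_name` is assumed to be a trusted identifier, because it is
    interpolated into the DROP statement.
    """
    cursor.execute(INDEX_IS_VALID_SQL, [index_name])
    row = cursor.fetchone()
    if row is None:
        raise InvalidIndexError("index %s does not exist" % index_name)
    if not row[0]:
        # Remove the unusable index so a restarted migration can recreate it.
        cursor.execute('DROP INDEX IF EXISTS "%s"' % index_name)
        raise InvalidIndexError(
            "index %s was created INVALID and has been dropped" % index_name
        )
```

A caller would run this right after CREATE INDEX CONCURRENTLY and treat InvalidIndexError as the signal to fix the data (if needed) and restart the migration.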

Example

When adding a NOT NULL column with a default, Django runs an SQL query like this:

ALTER TABLE "test" ADD COLUMN "field" boolean DEFAULT True NOT NULL;

On PostgreSQL versions before 11, this causes Postgres to rewrite the whole table and then swap it with the existing one (see the note in the Django documentation); during this period it holds an exclusive lock on the table that blocks both reads and writes.

This package splits the statement above into separate commands, not only to avoid rewriting the whole table but also to add the column with lock times as short as possible.

First, we add a nullable column without a default, and then set the default value in a separate command within the same transaction:

ALTER TABLE "test" ADD COLUMN "field" boolean NULL;
ALTER TABLE "test" ALTER COLUMN "field" SET DEFAULT true;

This sets the default for all new rows in the table, while all existing rows keep NULL in this column for now. The operation is quick because Postgres doesn't have to fill all existing rows with the default.

Next, we count the rows in the table and, if the result is greater than zero, calculate the batch size in which existing rows will be updated. Then, while there are still rows with NULL in this column, we update them.
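How the batch size is derived from the row count is not spelled out here. As a purely illustrative sketch, one possible heuristic is shown below; calculate_batch_size, the 1% target and both bounds are assumptions made for this example, not the package's actual rule.

```python
def calculate_batch_size(total_rows, max_batch=10000, min_batch=1000):
    """Pick a batch size for backfilling `total_rows` rows.

    Hypothetical heuristic: aim for roughly 1% of the table per batch,
    clamped to [min_batch, max_batch], and never more than the table itself.
    """
    if total_rows <= 0:
        # Empty table: nothing to backfill.
        return 0
    size = max(min_batch, min(max_batch, total_rows // 100))
    return min(size, total_rows)
```

Whatever the rule, the aim is that each batch is small enough for a single UPDATE to finish quickly, so row locks are held only briefly.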

The following statement is repeated while it updates more than zero rows:

WITH cte AS (
    SELECT <table_pk_column> AS pk
    FROM "test"
    WHERE "field" IS NULL
    LIMIT <size_calculated_on_previous_step>
)
UPDATE "test" table_
SET "field" = true
FROM cte
WHERE table_.<table_pk_column> = cte.pk;
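The loop around this statement can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the package's implementation: backfill_default is a hypothetical name, cursor is any DB-API cursor connected to PostgreSQL whose rowcount reports the rows touched by the last UPDATE, each batch is assumed to be committed on its own (for example in autocommit mode), and the table, primary key and column names are assumed to be trusted identifiers.

```python
# The batched UPDATE from the text, with identifiers filled in by str.format
# and the batch size and default value passed as query parameters (%s).
BATCH_UPDATE_SQL = """
WITH cte AS (
    SELECT "{pk}" AS pk
    FROM "{table}"
    WHERE "{column}" IS NULL
    LIMIT %s
)
UPDATE "{table}" table_
SET "{column}" = %s
FROM cte
WHERE table_."{pk}" = cte.pk
"""


def backfill_default(cursor, table, pk, column, default, batch_size):
    """Fill NULLs in `column` with `default`, `batch_size` rows at a time.

    Returns the total number of rows updated.
    """
    sql = BATCH_UPDATE_SQL.format(table=table, pk=pk, column=column)
    total = 0
    while True:
        cursor.execute(sql, [batch_size, default])
        # Stop as soon as a batch updates nothing: no NULL rows remain.
        if cursor.rowcount <= 0:
            break
        total += cursor.rowcount
    return total
```

For the example table, this would be called as backfill_default(cursor, "test", "id", "field", True, batch_size), where "id" is an assumed primary key column.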

When no rows with NULL remain in this column, we can set NOT NULL and drop the default (dropping it is Django's default behavior):

ALTER TABLE "test" ALTER COLUMN "field" SET NOT NULL;
ALTER TABLE "test" ALTER COLUMN "field" DROP DEFAULT;

This completes the add-field process. It definitely takes more time than the basic variant with a single SQL statement, but with this approach there are no long locks on the table, so the service can work normally during the migration.
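To summarise the sequence, the sketch below builds the ordered list of steps for a given table and column. It is only an illustration of the order of operations described in this example: add_not_null_column_steps is a hypothetical helper, not part of the package, and the backfill step carries no SQL because it is a loop rather than a single statement.

```python
def add_not_null_column_steps(table, column, column_type, default_sql):
    """Return (description, sql) pairs in the order they are applied.

    `default_sql` is the default already rendered as an SQL literal,
    for example "true". All arguments are assumed to be trusted.
    """
    t = '"%s"' % table
    c = '"%s"' % column
    return [
        ("add nullable column",
         "ALTER TABLE %s ADD COLUMN %s %s NULL;" % (t, c, column_type)),
        ("set default for new rows",
         "ALTER TABLE %s ALTER COLUMN %s SET DEFAULT %s;" % (t, c, default_sql)),
        # Existing rows are filled in batches; this is a loop, not one statement.
        ("backfill existing rows in batches", None),
        ("set not null",
         "ALTER TABLE %s ALTER COLUMN %s SET NOT NULL;" % (t, c)),
        ("drop default",
         "ALTER TABLE %s ALTER COLUMN %s DROP DEFAULT;" % (t, c)),
    ]


if __name__ == "__main__":
    for description, sql in add_not_null_column_steps("test", "field", "boolean", "true"):
        print(description, "->", sql)
```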

Run tests

./run_tests.sh
