Manage a baseline pg_dump for Django test databases — turn N-minute migrate boots into a few-second psql import.

django-pg-baseline

A reusable Django app that manages a baseline.sql artifact (a pg_dump of the post-migrate schema + seed data) and loads it automatically whenever Django creates a test database. migrate then applies only the small delta of migrations added since the dump was taken.

In real projects with hundreds of migrations this turns a ~6-minute migrate into a ~3-second psql import — or sub-second when paired with a testcontainer that clones from a populated template DB.

Why

A Django suite with hundreds of migrations and/or non-trivial seed data spends many minutes per test run on migrate. The fix is well-known in principle:

  1. Apply migrations once against a clean PG.
  2. pg_dump the result.
  3. On every subsequent test run, psql -f (or CREATE DATABASE ... WITH TEMPLATE) the dump into the test DB, then let migrate apply only the small delta of migrations added since the dump was taken.

Every Django shop with a heavy migration history rediscovers this pattern independently. django-pg-baseline packages it as a reusable app with all the operational bits people forget the first time:

  • a deterministic, version-controlled baseline.sql (with timestamp freezing for diff stability),
  • a sidecar baseline.meta.json recording the highest migration name per app, plus git SHA and PG version,
  • automatic loading on test DB creation when no faster template-clone path is available,
  • explicit coordination with pytest-testcontainers-django for the template-clone path,
  • a one-shot manage.py baseline_rebuild that spins an isolated PG via testcontainers, runs migrate, and emits the dump.
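
As a rough illustration of the sidecar idea, the sketch below derives the highest migration name per app from a list of applied migrations. The key names (`migrations`, `git_sha`, `pg_version`) and the function name are assumptions made for this example; the actual layout of baseline.meta.json is not specified here.

```python
import json


def build_meta(applied, git_sha, pg_version):
    """Build a sidecar dict from (app_label, migration_name) pairs.

    Key names are illustrative, not the package's real schema.
    """
    highest = {}
    for app, name in applied:
        # Assumes names start with a zero-padded number, so plain string
        # comparison orders them ("0002_x" < "0010_y").
        if app not in highest or name > highest[app]:
            highest[app] = name
    return {
        "migrations": dict(sorted(highest.items())),
        "git_sha": git_sha,
        "pg_version": pg_version,
    }


meta = build_meta(
    [
        ("auth", "0001_initial"),
        ("auth", "0012_alter_user_first_name_max_length"),
        ("sites", "0002_alter_domain_unique"),
    ],
    git_sha="abc1234",
    pg_version="16.4",
)
print(json.dumps(meta, indent=2, sort_keys=True))
```

Sorting the keys before serialising keeps the file stable across rebuilds, which matters for a version-controlled artifact.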

Features

  • One-line setup: add "django_pg_baseline" to INSTALLED_APPS, set PG_BASELINE['BASELINE_DIR'], done.
  • Three usage modes: standalone (host psql), with testcontainers (template clone), or rebuild-only (CI cron).
  • Optional pytest plugin for projects that don't want to add the app to INSTALLED_APPS.
  • Deterministic dumps: built-in timestamp freezing produces byte-stable diffs across rebuilds.
  • Cross-major PG support: runs pg_dump inside the rebuild container to guarantee client/server version match; scrubs known PG17→PG16 incompatibilities.
  • Stale baseline is fine: if the dump lags behind HEAD, Django's migrate applies the delta on top. manage.py baseline_info shows per-app deltas; the package itself never gates on freshness.
  • psycopg v2 and v3 compatible. No runtime psycopg dep — uses whichever the host project already pulled in for Django's PG backend.
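
One concrete PG17→PG16 incompatibility is that a PG17 pg_dump emits `SET transaction_timeout = 0;`, a setting a PG16 server does not recognise. A minimal sketch of a scrub step for that single case follows; the function name and rule list are hypothetical, and the package's own scrubbing may cover more or different cases.

```python
import re

# Lines a PG17 pg_dump emits that a PG16 server rejects.
# Illustrative list: only the transaction_timeout case is covered.
_UNSUPPORTED_ON_PG16 = [
    re.compile(r"^SET transaction_timeout = .*;$"),
]


def scrub_dump(sql: str) -> str:
    """Drop every line matching one of the known-incompatible patterns."""
    kept = [
        line
        for line in sql.splitlines()
        if not any(pattern.match(line) for pattern in _UNSUPPORTED_ON_PG16)
    ]
    return "\n".join(kept) + "\n"


sample = (
    "SET statement_timeout = 0;\n"
    "SET transaction_timeout = 0;\n"
    "SET lock_timeout = 0;\n"
)
cleaned = scrub_dump(sample)
```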

Installation

Using uv (recommended)

uv add django-pg-baseline

Using pip

pip install django-pg-baseline

This package depends on Django>=5.0 and testcontainers[postgres]. It does not declare a runtime psycopg dependency — your Django project already has either psycopg, psycopg-binary, psycopg2, or psycopg2-binary installed (Django's PG backend requires one), and forcing a flavor would conflict with that choice.

Quick start

1. Configure

# settings.py
INSTALLED_APPS = [
    ...,
    "django_pg_baseline",
]

PG_BASELINE = {
    "BASELINE_DIR": BASE_DIR / "baseline-sql",
}

The directory should be tracked in git — it holds baseline.sql and baseline.meta.json, both produced by the baseline_rebuild command.

2. Generate the baseline

python manage.py baseline_rebuild
git add baseline-sql/baseline.sql baseline-sql/baseline.meta.json
git commit -m "chore(baseline): refresh after migrations"

This spins a fresh Postgres testcontainer, runs migrate, freezes configured timestamp columns, runs pg_dump inside the container, scrubs PG-version-specific lines, and writes the dump + meta file.
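
The timestamp-freezing step can be pictured as one UPDATE per configured table. The sketch below generates those statements from a list shaped like the FREEZE_TIMESTAMPS setting; the helper names are hypothetical and the SQL the package actually runs may differ.

```python
def quote_ident(name: str) -> str:
    # Standard SQL identifier quoting: wrap in double quotes and
    # double any embedded double quote.
    return '"' + name.replace('"', '""') + '"'


def freeze_statements(freeze, value):
    """Yield (sql, params) pairs, one UPDATE per configured table."""
    for table, columns in freeze:
        assignments = ", ".join(f"{quote_ident(col)} = %s" for col in columns)
        yield f"UPDATE {quote_ident(table)} SET {assignments}", [value] * len(columns)


stmts = list(
    freeze_statements(
        [("django_migrations", ["applied"])],
        "2000-01-01 00:00:00+00",
    )
)
```

Each pair can be passed to a DB-API cursor as `cursor.execute(sql, params)`. Overwriting every row with the same constant is what keeps the dump stable from one rebuild to the next.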

3. Run tests

pytest

Django creates the test DB, the package's _create_test_db monkey patch loads baseline.sql via psql, and migrate applies any post-baseline delta. That's it.

Three modes of use

Mode A — Standalone (host psql)

The simplest case. Useful when:

  • the consumer runs tests against a long-lived PG (host PG, a docker compose service, a CI service container),
  • psql is on PATH.

What happens at test time:

  1. AppConfig.ready() installs the _create_test_db patch.
  2. Django's runner calls _create_test_db → CREATE DATABASE test_<name>.
  3. Patch sees django_migrations is missing in the new DB → psql -f baseline.sql --single-transaction --quiet -v ON_ERROR_STOP=1.
  4. Django's migrate applies any post-baseline delta.
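
Steps 3 and 4 hinge on two things: a check for the django_migrations table and a psql invocation. The sketch below builds both. The function name, the connection defaults, and the assumption that the table lives in the `public` schema are illustrative; only the psql flags are taken from the steps above.

```python
from pathlib import Path

# True when the table is absent, i.e. the baseline still needs loading.
# Assumes django_migrations lives in the "public" schema.
BASELINE_NEEDED_SQL = "SELECT to_regclass('public.django_migrations') IS NULL"


def psql_load_command(sql_path, dbname, host="localhost", port=5432, user="postgres"):
    """Return the argv list for loading a dump into one database."""
    return [
        "psql",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--dbname", dbname,
        "--file", str(sql_path),
        "--single-transaction",
        "--quiet",
        "-v", "ON_ERROR_STOP=1",
    ]


cmd = psql_load_command(Path("baseline.sql"), "test_app")
```

The list can be handed to `subprocess.run(cmd, check=True)`. With `--single-transaction` and `ON_ERROR_STOP=1`, a failing statement aborts the whole import rather than leaving a half-loaded database.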

Note: if you use a TEMPLATE DB (set TEST.TEMPLATE in your DATABASES), the test DB user must be granted the pg_signal_backend role — Postgres needs zero connections on the source DB before CREATE DATABASE WITH TEMPLATE is allowed, and the patch terminates leftover sessions to enforce that.
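
Terminating leftover sessions on a template database usually comes down to one query against pg_stat_activity. The following is a sketch of that query, not the patch's actual code; `kick_sessions` is a hypothetical helper name, and `cursor` is any DB-API cursor connected to a different database on the same server.

```python
TERMINATE_SESSIONS_SQL = (
    "SELECT pg_terminate_backend(pid) "
    "FROM pg_stat_activity "
    "WHERE datname = %s AND pid <> pg_backend_pid()"
)


def kick_sessions(cursor, template_db: str) -> None:
    """Terminate every other backend connected to template_db."""
    cursor.execute(TERMINATE_SESSIONS_SQL, [template_db])
```

Membership in pg_signal_backend is what lets a non-superuser call pg_terminate_backend on sessions owned by other roles, which is why the grant is needed.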

Mode B — With pytest-testcontainers-django

Faster (sub-second test-DB creation via template clone). Useful when:

  • you accept Docker as a test dependency,
  • you want the test DB to be a clone of a populated template rather than a psql reload.

Setup is identical to Mode A. Once pytest-testcontainers-django is installed, it auto-detects this package via get_baseline_path(), mounts baseline.sql into the PG container as /docker-entrypoint-initdb.d/01-baseline.sql, and sets DATABASES['default']['TEST']['TEMPLATE'] so Django runs CREATE DATABASE … WITH TEMPLATE ….
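
For orientation, this is roughly what the resulting settings look like once a template is configured. `TEST["TEMPLATE"]` is a standard Django setting for the PostgreSQL backend; the helper function and the template name `baseline_template` are made up for this sketch, and in Mode B the bridge package sets the value for you.

```python
def with_test_template(databases: dict, template: str, alias: str = "default") -> dict:
    """Return a copy of DATABASES with TEST['TEMPLATE'] set for one alias."""
    updated = {name: dict(conf) for name, conf in databases.items()}
    test_conf = dict(updated[alias].get("TEST", {}))
    test_conf["TEMPLATE"] = template
    updated[alias]["TEST"] = test_conf
    return updated


DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
    }
}
patched = with_test_template(DATABASES, "baseline_template")
```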

In Mode B the host psql shell-out is never invoked. We still own:

  • the patch's "kick sessions off template" prelude,
  • settings.PG_BASELINE and get_baseline_path(),
  • manage.py baseline_rebuild.

Mode C — Build/rebuild the baseline (CI or local)

python manage.py baseline_rebuild
git add path/to/baseline-sql/
git commit -m "chore(baseline): refresh after migrations …"

Recommended downstream wiring: a GitHub Action that runs baseline_rebuild whenever **/migrations/** changes on the main branch and opens a PR with the refreshed dump. The package itself does not enforce any "freshness" policy — when to rebuild is the project's decision; we just provide the tooling.
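
If the CI job decides in a script rather than through a workflow path filter, the trigger condition can be approximated as below. The function name is hypothetical, and the input is assumed to be a list of repository-relative paths such as the output of `git diff --name-only`.

```python
from pathlib import PurePosixPath


def touches_migrations(changed_paths) -> bool:
    """True when any changed file sits under a directory named 'migrations'."""
    return any(
        # parts[:-1] drops the file name, so only directories are compared.
        "migrations" in PurePosixPath(path).parts[:-1]
        for path in changed_paths
    )
```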

Configuration reference

PG_BASELINE = {
    # REQUIRED. Directory holding baseline.sql + baseline.meta.json.
    "BASELINE_DIR": BASE_DIR / "baseline-sql",

    # Optional, defaults shown.
    "SQL_FILENAME": "baseline.sql",
    "META_FILENAME": "baseline.meta.json",

    # Which Django connection to load into / dump from.
    "DATABASE_ALIAS": "default",

    # Auto-install the _create_test_db monkey patch in
    # AppConfig.ready(). Set to False for manual control (e.g. only
    # under pytest, only on certain CI hosts).
    "AUTO_LOAD_ON_TEST_DB": True,

    # Image used by `baseline_rebuild`. Override for plpython3u,
    # custom locales, extensions, etc.
    "REBUILD_IMAGE": "postgres:16",

    # Extra args appended to the built-in pg_dump invocation. The
    # default invocation already includes --no-owner --no-acl
    # --no-privileges --no-comments and --exclude-table-data=django_session.
    "PG_DUMP_EXTRA_ARGS": ["--exclude-table-data=audit_log"],

    # Stacks ON TOP of the default exclusions. Each entry becomes
    # --exclude-table-data=<pattern>. Cleaner than spelling out
    # --exclude-table-data=... in PG_DUMP_EXTRA_ARGS.
    "PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA": ["django_cache*", "easy_thumbnails_*"],

    # Tables/columns whose timestamps are frozen before pg_dump,
    # for deterministic diffs across rebuilds.
    "FREEZE_TIMESTAMPS": [("django_migrations", ["applied"])],
    "FREEZE_TIMESTAMPS_EXTRA": [("django_template", ["creation_date"])],
    "FREEZE_TIMESTAMP_VALUE": "2000-01-01 00:00:00+00",
}
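
To make the interaction between the two pg_dump settings concrete, here is a sketch of how the final argument list could be assembled. The default flags are the ones named in the comments above; the function name and the ordering of the pieces are assumptions.

```python
DEFAULT_ARGS = [
    "--no-owner",
    "--no-acl",
    "--no-privileges",
    "--no-comments",
    "--exclude-table-data=django_session",
]


def pg_dump_args(extra_exclude=(), extra_args=()):
    """Defaults first, then one exclusion per pattern, then raw extra args."""
    args = list(DEFAULT_ARGS)
    args += [f"--exclude-table-data={pattern}" for pattern in extra_exclude]
    args += list(extra_args)
    return args


args = pg_dump_args(
    extra_exclude=["django_cache*", "easy_thumbnails_*"],
    extra_args=["--exclude-table-data=audit_log"],
)
```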

Management commands

| Command | What it does |
| --- | --- |
| baseline_load | Load baseline.sql into the configured DB. Skips when django_migrations already exists, unless --force. |
| baseline_info | Human summary: git SHA, PG version, sql/meta paths, plus per-app deltas. Always exits 0. |
| baseline_rebuild | Regenerate baseline.sql + baseline.meta.json. Spins a testcontainers PG, runs migrate, freezes timestamps, runs in-container pg_dump, scrubs, writes meta. Flags: --image, --baseline-dir. |
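
A per-app delta is, conceptually, the set of migrations on disk that are newer than the highest one recorded in the baseline. The sketch below shows that comparison in a simplified form. It orders migrations by name, which only works for zero-padded numeric prefixes, whereas Django itself resolves order through the migration graph. The function and argument names are hypothetical.

```python
def per_app_delta(baseline_highest, on_disk):
    """Map each app to the migrations newer than the baseline's record.

    baseline_highest: {app_label: highest migration name in the dump}
    on_disk:          {app_label: [migration names found in the code]}
    """
    delta = {}
    for app, names in on_disk.items():
        recorded = baseline_highest.get(app)
        # An app missing from the baseline contributes all its migrations.
        pending = sorted(n for n in names if recorded is None or n > recorded)
        if pending:
            delta[app] = pending
    return delta


delta = per_app_delta(
    {"auth": "0011_update_proxy_permissions"},
    {
        "auth": [
            "0010_alter_group_name_max_length",
            "0011_update_proxy_permissions",
            "0012_alter_user_first_name_max_length",
        ],
        "sites": ["0001_initial"],
    },
)
```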

Pytest plugin (alternative to INSTALLED_APPS)

If you'd rather not add the app to INSTALLED_APPS, the package ships a pytest plugin that installs the same monkey patch via pytest_configure:

# pyproject.toml — pytest auto-discovers the plugin via the
# pytest11 entry point. Nothing else needed.

Behaviour matches the INSTALLED_APPS route exactly:

  • no-op when DJANGO_SETTINGS_MODULE is unset,
  • no-op when PG_BASELINE is unset,
  • raises pytest.UsageError when BASELINE_DIR is configured but baseline.sql is missing (matching AppConfig.ready() policy — loud failure beats silent slowness in CI).
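
The three rules above amount to a small decision function. The sketch below mirrors them with a stand-in exception so that it runs without pytest installed; the function name and return values are invented for illustration and do not reflect the plugin's internals.

```python
class BaselineMissing(Exception):
    """Stand-in for pytest.UsageError so the sketch runs without pytest."""


def plugin_action(settings_module, pg_baseline, sql_exists):
    """Return 'noop' or 'install', or raise when the dump is missing."""
    if not settings_module:
        return "noop"  # DJANGO_SETTINGS_MODULE is unset
    if not pg_baseline:
        return "noop"  # PG_BASELINE is unset
    if not sql_exists:
        # Fail loudly instead of silently falling back to a full migrate.
        raise BaselineMissing("BASELINE_DIR is configured but baseline.sql is missing")
    return "install"
```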

Use one route or the other, not both. (Both are idempotent; double install is safe but pointless.)

Public API

Stable from v0.1 (the contract surface for downstream tooling such as pytest-testcontainers-django):

from django_pg_baseline import get_baseline_path  # Path | None

Reachable via submodules but not yet locked under semver (to be stabilised at v1.0):

from django_pg_baseline.conf import get_config, BaselineConfig
from django_pg_baseline.patches import install_test_db_patch
from django_pg_baseline.loader import load_baseline, baseline_needed
from django_pg_baseline.freshness import check_freshness, FreshnessReport

Environment variables

| Variable | Effect |
| --- | --- |
| DJANGO_PG_BASELINE_SQL_PATH | Override get_baseline_path() resolution. Points at a dump file directly, bypassing settings.PG_BASELINE['BASELINE_DIR']. Useful for CI pinning a specific baseline. |
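
The precedence described here (environment variable first, settings second) can be sketched as follows. This is an illustration of the rule rather than the package's get_baseline_path(); the function name is hypothetical, and the behaviour when the file does not exist is left out.

```python
from pathlib import Path

ENV_VAR = "DJANGO_PG_BASELINE_SQL_PATH"


def resolve_baseline_path(environ, baseline_dir, sql_filename="baseline.sql"):
    """Return the dump path, preferring the environment override."""
    override = environ.get(ENV_VAR)
    if override:
        return Path(override)
    if baseline_dir is None:
        return None
    return Path(baseline_dir) / sql_filename
```

In real use `environ` would be `os.environ`; passing it in as an argument keeps the function easy to test.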

Security note

The dump captures all data present in the testcontainer after migrate(). If your data migrations seed users, fixtures, or any other content that ends up in the dump, that data lands in version control. Review the dump before committing, especially on the first rebuild. Use PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA to skip tables whose row data should not ship (e.g. auth_user when you have real test passwords). The package does not exclude auth_user by default — projects that intentionally seed admin fixtures rely on that data being in the baseline.
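
One way to support that review is to list which tables actually carry row data in the dump. The sketch below assumes a plain-format pg_dump that writes data as `COPY ... FROM stdin;` blocks terminated by `\.` (it would miss dumps produced with --inserts). The function name is hypothetical.

```python
import re

_COPY_RE = re.compile(r"^COPY (\S+) \(.*\) FROM stdin;$")


def tables_with_rows(dump_sql: str) -> list[str]:
    """Return tables whose COPY block contains at least one data row."""
    lines = dump_sql.splitlines()
    found = []
    for index, line in enumerate(lines[:-1]):
        match = _COPY_RE.match(line)
        # An empty table's COPY line is followed directly by the
        # end-of-data marker "\.".
        if match and lines[index + 1] != "\\.":
            found.append(match.group(1))
    return found


sample = (
    "COPY public.auth_user (id, username) FROM stdin;\n"
    "1\tadmin\n"
    "\\.\n"
    "COPY public.django_session (session_key) FROM stdin;\n"
    "\\.\n"
)
tables = tables_with_rows(sample)
```

Any table in the result that should not ship its rows is a candidate for PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA.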

Supported versions

Python

Python 3.10, 3.11, 3.12, and 3.13.

Django × Python

| Django | Status |
| --- | --- |
| 5.0 | EOL Apr 2025, supported on a best-effort basis |
| 5.1 | EOL Dec 2025, supported on a best-effort basis |
| 5.2 LTS | Active LTS (extended support to Apr 2028) |

Django 4.2 is out of scope (LTS goes EOL in April 2026 — the project targets current Django).

PostgreSQL

PostgreSQL 16 and 17. Older PG versions (14, 15) are out of scope: they sit further down the support ladder, and keeping them alive would complicate _scrub_dump (the list of cross-major incompatibilities to scrub grows with every PG release we support).

psycopg

psycopg2, psycopg2-binary, and psycopg[binary]>=3 all work — the package uses whichever your Django project already pulled in for its PG backend. CI tests both psycopg2-binary and psycopg[binary] in separate matrix cells.

Operating system

Linux is the supported CI target. macOS works in practice for local development. Windows is not supported — the package shells out to psql/pg_dump and assumes POSIX path conventions and a Linux-style Docker daemon for the rebuild path.

How it fits with related packages

django-pg-baseline is package #3 of the testcontainers-for-Django family:

  1. pytest-testcontainers — generic pytest plugin, session-scoped Docker container lifecycle. Framework-agnostic.
  2. pytest-testcontainers-django — Django bridge on top of #1. Injects env vars before Django imports settings; supports init-script mounts and DATABASES['default']['TEST']['TEMPLATE'] for fast test-DB clone.
  3. django-pg-baseline (this package) — manages the baseline.sql artifact and provides the patch / get_baseline_path() contract that #2 reads.

Each package can be used standalone. Pair #3 with #2 for the fastest test-DB creation; use #3 alone with a host psql if you prefer no Docker dependency.

Contributing

Issues and PRs welcome at https://github.com/iplweb/django-pg-baseline.

Local development:

git clone https://github.com/iplweb/django-pg-baseline
cd django-pg-baseline
uv sync --extra test
pre-commit install
pytest

License

MIT — see LICENSE.
