PostgreSQL schema differ and DDL generator
Project description
pgdelta
A PostgreSQL schema differ and DDL generator that produces high-fidelity schema migrations.
Development Status
pgdelta is currently in early development (v0.1.0).
Feature Support Matrix
Schemas
- ✅ CREATE SCHEMA - Basic schema creation
- ✅ DROP SCHEMA - Schema deletion
- ❌ ALTER SCHEMA - Schema modifications (planned)
- ❌ Owner to (planned)
- ✅ Rename (not applicable)
Tables
- ✅ CREATE TABLE - Basic table creation
- ✅ Column definitions with data types
- ✅ NOT NULL constraints
- ✅ DEFAULT expressions
- ✅ Generated columns (GENERATED ALWAYS AS)
- ✅ Table inheritance (INHERITS)
- ✅ Storage parameters (WITH clause)
- ❌ Column STORAGE/COMPRESSION settings (not planned)
- ❌ Column COLLATE settings (not planned)
- ❌ LIKE clause (not planned)
- ❌ PARTITION BY clause (not planned)
- ❌ TABLESPACE clause (not planned)
- ❌ TEMPORARY/UNLOGGED tables (not applicable)
- ✅ DROP TABLE - Table deletion
- ✅ ALTER TABLE - Table modifications (partial)
- ✅ ADD COLUMN (with NOT NULL, DEFAULT)
- ✅ DROP COLUMN
- ✅ ALTER COLUMN TYPE (with USING expression)
- ✅ ALTER COLUMN SET/DROP DEFAULT
- ✅ ALTER COLUMN SET/DROP NOT NULL
- ❌ Table/column renaming (not planned - uses drop/recreate)
- ❌ RENAME TO (not planned - uses drop/recreate)
- ❌ SET SCHEMA (not planned - uses drop/recreate)
Constraints
- ✅ Primary Keys - CREATE constraint
- ✅ Unique Constraints - CREATE constraint
- ✅ Multi-column unique constraints
- ✅ Partial unique constraints (WHERE clause)
- ✅ Check Constraints - CREATE constraint
- ✅ Foreign Keys - CREATE constraint
- ✅ Multi-column foreign keys
- ✅ ON DELETE/UPDATE actions (CASCADE, RESTRICT, SET NULL, SET DEFAULT)
- ✅ Constraint deferrability options
- ✅ Exclusion Constraints - CREATE constraint (basic)
- ✅ DROP CONSTRAINT - Constraint deletion
- ❌ ALTER CONSTRAINT - Constraint modifications (planned)
- ❌ VALIDATE CONSTRAINT - Constraint validation (planned)
Indexes
- ✅ CREATE INDEX - Complete index creation
- ✅ Unique indexes
- ✅ Partial indexes (WHERE clause)
- ✅ Functional indexes (expressions)
- ✅ Multi-column indexes
- ✅ All index methods (btree, hash, gin, gist, etc.)
- ✅ Custom operator classes
- ✅ ASC/DESC ordering
- ✅ NULLS FIRST/LAST
- ❌ CONCURRENTLY option (not applicable)
- ✅ DROP INDEX - Index deletion
- ❌ ALTER INDEX - Index modifications (planned)
- ❌ REINDEX - Index rebuilding (not applicable)
Views
- ✅ CREATE VIEW - Basic view creation
- ✅ Schema-qualified names
- ✅ View definition (AS query)
- ❌ RECURSIVE views (planned)
- ❌ Explicit column names (planned)
- ❌ WITH CHECK OPTION (planned)
- ✅ DROP VIEW - View deletion
- ✅ CREATE OR REPLACE VIEW - View replacement
- ❌ ALTER VIEW - View modifications (planned)
Materialized Views
- ✅ CREATE MATERIALIZED VIEW - Materialized view creation
- ✅ DROP MATERIALIZED VIEW - Materialized view deletion
- ❌ ALTER MATERIALIZED VIEW - Materialized view modifications (planned)
- ❌ REFRESH MATERIALIZED VIEW - Not applicable for DDL
Functions & Procedures
- ✅ CREATE FUNCTION - Function creation
- ✅ CREATE PROCEDURE - Procedure creation
- ✅ DROP FUNCTION - Function deletion
- ✅ DROP PROCEDURE - Procedure deletion
- ✅ CREATE OR REPLACE FUNCTION - Function replacement
- ❌ ALTER FUNCTION - Function modifications (planned)
- ❌ ALTER PROCEDURE - Procedure modifications (planned)
Triggers
- ✅ CREATE TRIGGER - Trigger creation
- ✅ DROP TRIGGER - Trigger deletion
- ❌ ALTER TRIGGER - Trigger modifications (planned)
- ❌ ENABLE/DISABLE TRIGGER - Not applicable for DDL
Sequences
- ✅ CREATE SEQUENCE - Sequence creation
- ✅ DROP SEQUENCE - Sequence deletion
- ✅ ALTER SEQUENCE OWNED BY - Sequence ownership
- ❌ ALTER SEQUENCE - Sequence modifications (planned)
Types & Domains
- ✅ CREATE TYPE - Custom type creation (enums, composites)
- ✅ DROP TYPE - Type deletion
- ❌ CREATE DOMAIN - Domain creation (planned)
- ❌ DROP DOMAIN - Domain deletion (planned)
- ❌ ALTER TYPE - Type modifications (planned)
- ❌ ALTER DOMAIN - Domain modifications (planned)
Security & Access Control
- ✅ Row Level Security - RLS policies
- ✅ CREATE POLICY - Policy creation
- ✅ DROP POLICY - Policy deletion
- ✅ ALTER POLICY - Policy modifications
- 🚫 CREATE ROLE - Role creation (environment-specific)
- 🚫 GRANT/REVOKE - Privilege management (environment-specific)
- 🚫 ALTER DEFAULT PRIVILEGES - Default privilege management (environment-specific)
Other Features
- ❌ Comments - Object comments (not planned)
- ❌ Event Triggers - Event trigger support (not planned)
- ❌ Extensions - Extension management (not planned)
- ✅ Dependency Resolution - Automatic DDL ordering
- ✅ Roundtrip Fidelity - Extract → Diff → Generate → Apply cycles
Note: Extensions are not supported because they are environment-specific and require installation on the target database. pgdelta focuses on portable schema definitions that can be applied across different PostgreSQL environments.
The project focuses on schema structure diffing and DDL generation with comprehensive support for PostgreSQL objects including tables, constraints, indexes, views, functions, triggers, sequences, types, and RLS policies.
Architecture
pgdelta uses a three-phase approach designed for correctness and testability:
Phase 1: Extract
- SQL-only access: Database connections used exclusively during extraction
- Immutable snapshots: One-time catalog extraction into frozen dataclasses
- Field metadata: Distinguishes identity, data, and internal fields for semantic comparison
Phase 2: Diff
- Semantic comparison: Uses field metadata to compare objects based on identity and data fields
- Change detection: Identifies create, drop, and alter operations
- Pure comparison: No database access, operates on immutable snapshots
Phase 3: Generate
- Pure functions: SQL generation from change objects with no side effects
- Deterministic output: Same input always produces identical DDL
- Type-safe: Complete mypy coverage with structural pattern matching
- Dependency resolution: Constraint-based dependency ordering using NetworkX
Testing Strategy
- Roundtrip fidelity: Generic integration tests that verify
Extract(DB) → Diff → Generate(SQL) → Apply(SQL) → Extract(DB)produces semantically identical catalogs - Real PostgreSQL: All tests use actual PostgreSQL instances via testcontainers
Technical Decisions
- Pure Functions: All core logic uses pure functions with no side effects
- Immutable Data: Extract once, operate on immutable snapshots
- Dependency Resolution: Constraint-based dependency ordering using NetworkX
- Type Safety: Complete type safety with mypy and structural pattern matching
- Roundtrip Fidelity: Generates DDL that recreates schemas exactly
Installation
Note: pgdelta is not yet published to PyPI. Install from source:
git clone https://github.com/olirice/pgdelta.git
cd pgdelta
pip install -e ".[dev]"
Usage
Python API
from pgdelta import PgCatalog, generate_sql
from pgdelta.catalog import extract_catalog
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
# Connect to databases
source_engine = create_engine("postgresql://user:pass@localhost/source_db")
target_engine = create_engine("postgresql://user:pass@localhost/target_db")
with Session(source_engine) as source_session, Session(target_engine) as target_session:
# Extract schemas
source_catalog = extract_catalog(source_session)
target_catalog = extract_catalog(target_session)
# Generate migration from target to source
changes = target_catalog.diff(source_catalog)
# Generate SQL statements
sql_statements = [generate_sql(change) for change in changes]
for sql in sql_statements:
print(sql)
Example Output
CREATE SCHEMA "analytics";
CREATE TABLE "analytics"."user_stats" (
"user_id" integer,
"post_count" integer DEFAULT 0,
"last_login" timestamp without time zone
);
Development Setup
Prerequisites
- Python 3.13+
- Docker (for running PostgreSQL test containers)
Setup Instructions
-
Clone the repository
git clone https://github.com/olirice/pgdelta.git cd pgdelta
-
Create and activate a virtual environment
python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
-
Install in editable mode with development dependencies
pip install -e ".[dev]"
-
Install pre-commit hooks
pre-commit install
Running Tests
The project uses pytest with real PostgreSQL databases via testcontainers:
# Run all tests
pytest
# Run tests in parallel (faster)
pytest -n auto
# Run specific test categories
pytest tests/unit/ # Unit tests only
pytest tests/integration/ # Integration tests only
# Run tests with coverage
pytest --cov=src/pgdelta --cov-report=html
# Run a specific test
pytest tests/unit/test_sql_generation.py::test_create_schema_basic
Development Tools
# Type checking
mypy src/pgdelta
# Linting and formatting
ruff check
ruff format
# Run all pre-commit hooks
pre-commit run --all-files
Test Requirements
- Docker: Required for PostgreSQL test containers
- PostgreSQL 17: Automatically managed via testcontainers
- Real Database Testing: All tests use real PostgreSQL instances, not mocks
CI/CD
The project includes comprehensive GitHub Actions workflows:
- CI Pipeline (
ci.yml): Runs pre-commit checks and tests on every push/PR - Security Scanning (
security.yml): Dependency and security analysis - Automated Releases (
release.yml): Builds and publishes to PyPI on tag push
All workflows use the latest action versions and follow security best practices with minimal permissions.
Architecture Details
Model Design
PostgreSQL catalog models are simplified and optimized for DDL generation:
- Immutable dataclasses: All models use
@dataclass(frozen=True)for immutability - Essential fields only: Only fields necessary for DDL generation are included
- Stable identifiers: Cross-database portable identifiers using stable_id (no pg_depend_id required)
- Type safety: Complete type annotations with mypy compliance
Field Metadata System
Uses dataclass field metadata to categorize fields with wrapper functions:
identity(): Fields that identify the object (used in semantic comparison)data(): Fields that represent object data (used in semantic comparison)internal(): Fields needed for dependency resolution (ignored in semantic comparison)
The wrapper functions generate the appropriate metadata dictionaries, making field categorization cleaner and more maintainable.
Roadmap
Phase 1 (Current - v0.1.x)
- ✅ Comprehensive schema and table DDL generation
- ✅ All constraint types (primary keys, foreign keys, unique, check, exclusion)
- ✅ Complete index support (all types, partial, functional)
- ✅ Views and materialized views
- ✅ Functions and triggers
- ✅ Sequences with ownership tracking
- ✅ Custom types (enums, composites)
- ✅ RLS policies
- ✅ Advanced dependency resolution
- ✅ Roundtrip fidelity
- ✅ CLI interface
Phase 2 (v0.2.x)
- 🔄 ALTER operations for constraints and indexes
- 🔄 Domain types
- 🔄 Enhanced materialized view support
- 🔄 Advanced function features
Phase 3 (v0.3.x)
- 🔄 Partitioning support
- 🔄 Performance optimizations
- 🔄 Streaming processing for large schemas
License
Apache 2.0 - see LICENSE file for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pgdelta-0.1.0.tar.gz.
File metadata
- Download URL: pgdelta-0.1.0.tar.gz
- Upload date:
- Size: 115.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f94a44b1ca0dedeb595182edea0b48fbf4db08fc6c96987cadc4b416d6f862b2
|
|
| MD5 |
25966f677ce73ddeb0484f17a740c6bd
|
|
| BLAKE2b-256 |
689208e8560440054c1558bf623c0719622aa92ba2fff22cf871a05b98c1e312
|
File details
Details for the file pgdelta-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pgdelta-0.1.0-py3-none-any.whl
- Upload date:
- Size: 96.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
74175f0ec26a5818a735ef8648cc9ea932d8774d4ff7bfb242f674d0ad39d065
|
|
| MD5 |
e5b985d9700c95c0d8cb5a720ffbd9c7
|
|
| BLAKE2b-256 |
03ed1e20e8e21ff6b45e179800cae34981aae00e2a4ca52f39c364e7ad4d4d18
|