Skip to main content

Python CLI tool for ClickHouse schema migrations with distributed locking and cluster support.

Project description

PyClickHouseMigrator

PyClickHouseMigrator

CI PyPI Python codecov Downloads

Lightweight Python tool for managing ClickHouse schema migrations. Minimal dependencies, no ORM.

Features: checksum validation, dry-run mode, cluster support (ON CLUSTER DDL, replicated service tables), rollback, migration status dashboard, auto-retry on connection failure, concurrent execution protection.

Install

pip install py-clickhouse-migrator

Quick Start

export CLICKHOUSE_MIGRATE_URL=clickhouse://default@localhost:9000/mydb

migrator init                    # create migrations directory
migrator new create_users_table  # create migration file
migrator up                      # apply pending migrations
migrator show                    # check status

init and new work offline — no ClickHouse connection required.

Migration Format

def up() -> str:
    return """
    CREATE TABLE IF NOT EXISTS users (
        id UInt64,
        name String,
        created_at DateTime DEFAULT now()
    ) ENGINE = MergeTree()
    ORDER BY id
    """


def rollback() -> str:
    return """
    DROP TABLE IF EXISTS users
    """

Each migration is a Python file with up() and rollback() functions that return SQL strings. Multiple statements can be separated by ;.

Commands

init

Create the migrations directory (default ./db/migrations).

migrator init
migrator --path ./my/migrations init

new

Create a new timestamped migration file.

migrator new create_users_table

Name is optional. Only letters, digits, and underscores allowed.

up

Apply pending migrations.

migrator up          # apply all pending
migrator up 3        # apply next 3
Option Default Description
N all Number of migrations to apply
--lock / --no-lock --lock Enable/disable distributed lock
--lock-ttl 600 Lock TTL in seconds
--lock-retry 3 Lock acquire retry attempts
--dry-run off Print SQL without executing
--allow-dirty off Skip checksum validation

Example output:

20250318090000_create_users.py applied [✔]
20250319120000_create_events.py applied [✔]

rollback

Rollback applied migrations in reverse order.

migrator rollback        # rollback last 1
migrator rollback 3      # rollback last 3
Option Default Description
N 1 Number of migrations to rollback
--lock / --no-lock --lock Enable/disable distributed lock
--lock-ttl 600 Lock TTL in seconds
--lock-retry 3 Lock acquire retry attempts
--dry-run off Print SQL without executing

Example output:

20250319120000_create_events.py rolled back [✔].

show

Display migration status, integrity information, and HEAD pointer.

migrator show        # last 5 applied + all pending
migrator show --all  # all applied + all pending
Option Default Description
--all off Show all applied migrations (default: last 5)

Example output:

Applied:
  [X] 20250320143022_add_indexes.py (HEAD)
  [X] 20250319120000_create_events.py
  [X] 20250318090000_create_users.py

Pending:
  [ ] 20250321100000_add_status_column.py

Applied: 3 | Pending: 1

Modified or missing migration files are flagged with (modified) or (missing) next to the name.

repair

Update stored checksums in db_migrations to match current migration files. Use after intentionally editing an already-applied migration.

migrator repair

force-unlock

Manually release a stuck migration lock. Use when a deployment crashed mid-migration and the lock wasn't released.

migrator force-unlock

lock-info

Show current lock holder and expiration time.

migrator lock-info

Configuration

All global options can be set via environment variables:

Option Env Variable Default Description
--url CLICKHOUSE_MIGRATE_URL ClickHouse connection URL
--path CLICKHOUSE_MIGRATE_DIR ./db/migrations Migrations directory
--cluster CLICKHOUSE_MIGRATE_CLUSTER Cluster name for ON CLUSTER DDL
--connect-retries CLICKHOUSE_MIGRATE_CONNECT_RETRIES 0 Connection retry attempts
--connect-retries-interval CLICKHOUSE_MIGRATE_CONNECT_RETRIES_INTERVAL 1 Seconds between retries
--send-receive-timeout CLICKHOUSE_MIGRATE_SEND_RECEIVE_TIMEOUT 600 Query timeout in seconds
-v, --verbose off Enable DEBUG logging
-q, --quiet off Suppress all output except errors

Docker

docker pull maksimburtsev/py-clickhouse-migrator
docker run --rm \
  -v ./migrations:/migrations \
  -e CLICKHOUSE_MIGRATE_URL=clickhouse://default@clickhouse:9000/mydb \
  maksimburtsev/py-clickhouse-migrator:1 \
  up

Mount your migrations directory to /migrations inside the container.

Pin to a major version tag (:1) or an exact version (:1.0.0).

Locking

When multiple processes run migrator up simultaneously, the advisory lock prevents double-applying migrations. Enabled by default on up and rollback.

The lock uses a dedicated table with TTL-based expiration (default 600 seconds) and automatic retry (default 3 attempts). If you increase --send-receive-timeout, increase --lock-ttl accordingly.

If a deployment crashes mid-migration and the lock isn't released, use lock-info to inspect and force-unlock to release it manually. Locks also expire automatically after the TTL.

migrator up --no-lock              # disable locking
migrator up --lock-ttl 600         # 10 minute TTL
migrator up --lock-retry 5         # 5 acquire attempts

Checksum Validation

After a migration is applied, its SHA-256 file hash is stored in db_migrations. On subsequent up runs, stored hashes are compared with current files. If someone edited an already-applied migration, the tool fails — because the database state no longer matches what the migration file describes.

--allow-dirty skips the check for a single run (e.g. you fixed a typo in a comment). repair updates all stored hashes to match current files. show displays integrity status per migration — ok, modified, or missing.

Cluster Support

When --cluster is set, the migrator creates its own service tables (db_migrations, _migrations_lock) with ON CLUSTER and replicated engines. Your migration SQL is used as-is — if you need ON CLUSTER in your DDL, include it in the migration yourself.

export CLICKHOUSE_MIGRATE_CLUSTER=my_cluster
migrator up

Python API

from py_clickhouse_migrator import Migrator

migrator = Migrator(
    database_url="clickhouse://default@localhost:9000/mydb",
    migrations_dir="./db/migrations",
)
migrator.up()

Known Limitations

SQL splitting by ; — migration SQL is split into statements by ;. Semicolons inside string literals will break parsing. If you need a literal ; in a value, use a separate migration or encode the value differently.

No DDL transactions — if a migration with multiple statements fails halfway, some statements will have been applied. Always use IF NOT EXISTS / IF EXISTS to make migrations idempotent and safe to re-run.

Advisory lock — the locking mechanism is best-effort, not a true distributed mutex. There is a race condition window of a few milliseconds between INSERT and verification where two processes could both acquire the lock. If you need stronger guarantees, run migrations from a single process (e.g. a Kubernetes Job or a CI/CD pipeline step).

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

py_clickhouse_migrator-1.1.0.tar.gz (1.7 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

py_clickhouse_migrator-1.1.0-py3-none-any.whl (16.2 kB view details)

Uploaded Python 3

File details

Details for the file py_clickhouse_migrator-1.1.0.tar.gz.

File metadata

  • Download URL: py_clickhouse_migrator-1.1.0.tar.gz
  • Upload date:
  • Size: 1.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for py_clickhouse_migrator-1.1.0.tar.gz
Algorithm Hash digest
SHA256 328be99da9151ad9d39816dcd5b6b75bfad78ba6069e01489fee2ff5643b5df7
MD5 e5263d4fd66c9cbbffe443c3c7f9ee3d
BLAKE2b-256 f2b5e4f41d1213ed3aa88716757dc84f6c79e3c5a0a9c2198f27b855eafbd10d

See more details on using hashes here.

Provenance

The following attestation bundles were made for py_clickhouse_migrator-1.1.0.tar.gz:

Publisher: release.yml on Maksim-Burtsev/PyClickHouseMigrator

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file py_clickhouse_migrator-1.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for py_clickhouse_migrator-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f7e5207035e921ed2e2718c9d6d4cc5965983e3b63e30f12ba3742653464e17b
MD5 068ea466d34d5a912ad375f708079f78
BLAKE2b-256 1d9e667fb7f6e1e5302db1160ddf84341334eaeaa26bfd9df659d30ce432742f

See more details on using hashes here.

Provenance

The following attestation bundles were made for py_clickhouse_migrator-1.1.0-py3-none-any.whl:

Publisher: release.yml on Maksim-Burtsev/PyClickHouseMigrator

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page