Python CLI tool for ClickHouse schema migrations with distributed locking and cluster support.
Project description
PyClickHouseMigrator
SQL-first ClickHouse migrations for Python teams.
PyClickHouseMigrator is a small, predictable migration runner for ClickHouse. It keeps schema changes in plain .sql files, applies them in order, stores migration state inside ClickHouse, validates checksums, supports rollback SQL, and fits naturally into CI/CD.
No ORM. No schema diff engine. No framework. Just a clean migration flow for teams that want their ClickHouse DDL to live in Git.
Highlights
- Plain SQL migrations — migration files are
.sql, not Python modules. - Explicit statement blocks — each
-- @stmtblock is executed as one ClickHouse query; the migrator does not split SQL by semicolons. - Rollback SQL —
-- migrator:downstores the rollback SQL used bymigrator rollback. - Checksum validation — detects edited or missing applied migration files.
- Dry-run mode — preview pending migrations without writing migration state.
- Baseline — adopt an existing database without executing historical migrations.
- Advisory locking — protects common CI/CD concurrency cases.
- Cluster-aware service tables — migration metadata and lock tables can be created with
ON CLUSTERand replicated engines. - Small dependency set — depends on
clickandclickhouse-driver.
Install
pip install py-clickhouse-migrator
With uv:
uv add py-clickhouse-migrator
The CLI command is:
migrator --help
Quick start
Set the ClickHouse connection URL. The database in the URL must already exist.
export CLICKHOUSE_MIGRATE_URL=clickhouse://default@localhost:9000/mydb
Create a migrations directory and a first migration:
migrator init
migrator new create_users_table
Edit the generated file in ./db/migrations:
-- migrator:up
-- @stmt
CREATE TABLE IF NOT EXISTS users
(
id UInt64,
name String,
created_at DateTime DEFAULT now()
)
ENGINE = MergeTree
ORDER BY id
-- migrator:down
-- @stmt
DROP TABLE IF EXISTS users
Preview what will run:
migrator up --dry-run
Apply pending migrations:
migrator up
Check status:
migrator show
Rollback the last applied migration:
migrator rollback
init and new work offline. Commands that read or write migration state require a ClickHouse connection.
Migration format
A migration is a .sql file with two required sections:
-- migrator:up
-- @stmt
-- SQL to apply the migration
-- migrator:down
-- @stmt
-- SQL to roll it back
Rules:
- each file must contain exactly one
-- migrator:upsection and exactly one-- migrator:downsection; -- migrator:upmust appear before-- migrator:down;- SQL must be placed inside
-- @stmtblocks; - the
upsection must contain at least one non-empty statement block; - the
downsection may be empty; - each
-- @stmtblock is sent to ClickHouse as one query; - the migrator does not split SQL by
;.
This is intentional. It avoids fragile semicolon splitting and makes multi-statement migrations explicit.
Good:
-- migrator:up
-- @stmt
CREATE TABLE IF NOT EXISTS events
(
id UInt64,
message String
)
ENGINE = MergeTree
ORDER BY id
-- @stmt
ALTER TABLE events ADD COLUMN IF NOT EXISTS created_at DateTime DEFAULT now()
-- migrator:down
-- @stmt
ALTER TABLE events DROP COLUMN IF EXISTS created_at
-- @stmt
DROP TABLE IF EXISTS events
Avoid putting multiple ClickHouse queries into one block:
-- bad
-- @stmt
CREATE TABLE a (id UInt64) ENGINE = MergeTree ORDER BY id;
CREATE TABLE b (id UInt64) ENGINE = MergeTree ORDER BY id;
Use separate blocks instead:
-- good
-- @stmt
CREATE TABLE a (id UInt64) ENGINE = MergeTree ORDER BY id
-- @stmt
CREATE TABLE b (id UInt64) ENGINE = MergeTree ORDER BY id
See Migration format for more examples.
Commands
init
Create the migrations directory. Default: ./db/migrations.
migrator init
migrator --path ./migrations init
Works offline.
new
Create a timestamped .sql migration file.
migrator new create_users_table
Generated filename format:
YYYYMMDDHHmmss_create_users_table.sql
The name suffix is optional, but recommended. Only letters, digits, and underscores are allowed.
Works offline.
up
Apply pending migrations in filename order.
migrator up # apply all pending migrations
migrator up 3 # apply next 3 pending migrations
migrator up --dry-run
| Option | Default | Description |
|---|---|---|
N |
all | Optional positional limit: number of pending migrations to apply. |
--lock / --no-lock |
--lock |
Enable or disable the migration lock. |
--lock-ttl |
600 |
Lock TTL in seconds. |
--lock-retry |
3 |
Lock acquire retry attempts. |
--dry-run |
off | Print pending migration SQL without executing it. |
--validate / --no-validate |
--validate |
Enable or disable preflight validation with EXPLAIN AST. |
--allow-dirty |
off | Skip checksum mismatch failures for this run. |
Example output:
20260421140000_create_users_table.sql applied [✔]
20260421143000_add_events_table.sql applied [✔]
rollback
Rollback applied migrations in reverse order.
migrator rollback # rollback last applied migration
migrator rollback 3 # rollback last 3 applied migrations
migrator rollback --dry-run
| Option | Default | Description |
|---|---|---|
N |
1 |
Optional positional limit: number of migrations to rollback. |
--lock / --no-lock |
--lock |
Enable or disable the migration lock. |
--lock-ttl |
600 |
Lock TTL in seconds. |
--lock-retry |
3 |
Lock acquire retry attempts. |
--dry-run |
off | Print rollback SQL without executing it. |
--validate / --no-validate |
--validate |
Enable or disable preflight validation with EXPLAIN AST. |
Rollback uses the down SQL stored in db_migrations at the time the migration was applied, not the current file content.
show
Display applied migrations, pending migrations, HEAD, baseline markers, and integrity warnings.
migrator show
migrator show --all
| Option | Default | Description |
|---|---|---|
--all |
off | Show all applied migrations. By default, only the latest 5 applied migrations are shown. |
Example:
Applied:
[X] 20260421143000_add_events_table.sql (HEAD)
[X] 20260421140000_create_users_table.sql
Pending:
[ ] 20260421150000_add_status_column.sql
Applied: 2 | Pending: 1
Possible applied migration markers:
| Marker | Meaning |
|---|---|
HEAD |
Latest applied migration row. |
baseline |
Migration was recorded by baseline, not executed. |
modified |
Applied migration file exists but its checksum no longer matches. |
missing |
Applied migration file is missing locally. |
baseline
Adopt an existing ClickHouse database without replaying historical DDL.
migrator baseline
Use this when your database schema already exists and you have .sql migration files that represent that existing schema. Point the migrator at that directory with --path or CLICKHOUSE_MIGRATE_DIR, or put the files in the default ./db/migrations directory. Then run migrator baseline once and use migrator up for future migrations.
Baseline behavior:
- creates the
db_migrationsservice table if it does not exist; - requires
db_migrationsto have no rows; - marks all current
.sqlfiles in the migrations directory as already applied (baselinerows); - does not execute SQL;
- does not validate that the ClickHouse schema matches the files;
- baseline rows are not selected by
rollback; - baseline rows are excluded from checksum validation.
See Baseline existing databases.
repair
Update stored checksums to match current migration files.
migrator repair
Use this only after intentionally editing already-applied migration file(s) and confirming that the database state is still consistent with those edits. migrator show and migrator up report that applied migration SQL changed; repair updates the stored checksum to accept the current file content in future checks. It does not execute SQL, does not modify your application schema, and skips missing files.
lock-info
Show active migration lock information.
migrator lock-info
Example:
Locked by: runner-01:1234:abc123ef
Locked at: 2026-04-21 14:30:00
Expires at: 2026-04-21 14:40:00
force-unlock
Manually release a stuck lock.
migrator force-unlock
Use this when a deployment process crashed and the lock did not get released. Locks also expire automatically after their TTL.
Configuration
Global options can be provided through CLI flags or environment variables.
| CLI option | Environment variable | Default | Description |
|---|---|---|---|
--url |
CLICKHOUSE_MIGRATE_URL |
— | ClickHouse connection URL. Required for DB commands. |
--path |
CLICKHOUSE_MIGRATE_DIR |
./db/migrations |
Migrations directory. |
--cluster |
CLICKHOUSE_MIGRATE_CLUSTER |
— | ClickHouse cluster name for service table DDL. |
--connect-retries |
CLICKHOUSE_MIGRATE_CONNECT_RETRIES |
0 |
Connection retry attempts. |
--connect-retries-interval |
CLICKHOUSE_MIGRATE_CONNECT_RETRIES_INTERVAL |
1 |
Seconds between connection retries. |
--send-receive-timeout |
CLICKHOUSE_MIGRATE_SEND_RECEIVE_TIMEOUT |
600 |
ClickHouse client send/receive timeout in seconds. |
-v, --verbose |
— | off | Enable DEBUG logging. |
-q, --quiet |
— | off | Suppress INFO/WARNING logs; command output such as dry-run SQL is still printed. |
Connection URL example:
clickhouse://user:password@host:9000/database
For TLS/secure connections, use connection parameters supported by clickhouse-driver URLs, for example:
clickhouse://user:password@host:9440/database?secure=True
Docker
docker pull maksimburtsev/py-clickhouse-migrator:latest
Run migrations:
docker run --rm \
-v "$PWD/db/migrations:/migrations" \
-e CLICKHOUSE_MIGRATE_URL=clickhouse://default@clickhouse:9000/mydb \
maksimburtsev/py-clickhouse-migrator:latest \
up
Inside the Docker image, the default migrations directory is /migrations.
See Docker usage.
CI/CD
In CI/CD, the usual deployment step is intentionally simple:
migrator up
GitHub Actions example:
- name: Run ClickHouse migrations
run: migrator up
env:
CLICKHOUSE_MIGRATE_URL: ${{ secrets.CLICKHOUSE_MIGRATE_URL }}
Recommended deployment pattern:
- build and test application code;
- run
migrator uponce per deployment; - deploy application code that depends on the new schema.
The CLI exits with 0 on success and 1 for handled migration errors such as invalid migrations, connection failures, checksum mismatches, missing databases, and lock errors.
See CI/CD usage.
Checksum validation
When a migration is applied, PyClickHouseMigrator stores a SHA-256 checksum in db_migrations.
The checksum is computed from the up and down statement blocks extracted by the migrator, not from the migration file as a whole. This makes the checksum tied to the SQL blocks the tool will send to ClickHouse.
On migrator up, applied migration checksums are compared with the current local files. If an applied migration was modified or deleted, the command fails unless --allow-dirty is used.
Useful commands:
migrator show # display modified/missing markers
migrator up # fail fast on checksum mismatch
migrator up --allow-dirty
migrator repair # update stored checksums after intentional edits
Preflight validation
By default, up, rollback, and their dry-run variants validate statements with EXPLAIN AST before execution.
migrator up --no-validate
migrator rollback --no-validate
Validation is best-effort. It catches many syntax and parse problems early, but it is not a guarantee that execution will succeed and it is not a production-safety analyzer.
Locking
up, rollback, and baseline use an advisory migration lock by default.
The lock is stored in a ClickHouse service table named _migrations_lock. It has a TTL, a lock owner identity, retry behavior, and manual recovery commands.
migrator up --lock-ttl 900
migrator up --lock-retry 10
migrator up --no-lock
migrator lock-info
migrator force-unlock
The lock is meant to protect common deployment races, for example two CI jobs starting at the same time. It is still best practice to run migrations from a single deployment job or Kubernetes Job.
Cluster mode
Set a cluster name when you want the migrator's own service tables to be created across a ClickHouse cluster:
export CLICKHOUSE_MIGRATE_CLUSTER=my_cluster
migrator up
When cluster mode is enabled:
db_migrationsis created withON CLUSTERand a replicated engine;_migrations_lockis created withON CLUSTERand a replicated replacing engine;- service table writes use cluster consistency settings;
- your migration SQL is executed exactly as written.
PyClickHouseMigrator does not inject ON CLUSTER into user migrations. If your ClickHouse DDL must run on the whole cluster, write ON CLUSTER in the migration yourself.
See Cluster mode.
Python API
from py_clickhouse_migrator import Migrator
migrator = Migrator(
database_url="clickhouse://default@localhost:9000/mydb",
migrations_dir="./db/migrations",
)
migrator.up()
See Python API.
Known limitations
- ClickHouse DDL is not transactional. A multi-statement migration can partially apply if a later statement fails.
- The advisory lock is best-effort. It reduces common concurrency problems, but it is not a substitute for a single well-defined migration job.
- The target database must exist before the migrator runs.
- Baseline does not compare migration files with the existing database schema.
- Preflight validation is best-effort and can be disabled with
--no-validate. - Each
-- @stmtblock must contain one ClickHouse query.
See Known limitations.
Documentation
- Migration format
- Baseline existing databases
- Cluster mode
- CI/CD usage
- Docker usage
- Python API
- Troubleshooting
- Known limitations
- 2.0 release notes
License
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file py_clickhouse_migrator-2.0.0.tar.gz.
File metadata
- Download URL: py_clickhouse_migrator-2.0.0.tar.gz
- Upload date:
- Size: 1.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ca11eb5d657146b0574e7889ce4b879e450de87c1ae56e7c6dd29584cea7da6e
|
|
| MD5 |
f2bf10c9ea03e0c8e476137eec7efbdc
|
|
| BLAKE2b-256 |
2f973d189b1e0e56ccf98e66483e4ac33e80e886a6f1dc952db72a6fb677f4b9
|
Provenance
The following attestation bundles were made for py_clickhouse_migrator-2.0.0.tar.gz:
Publisher:
release.yml on Maksim-Burtsev/PyClickHouseMigrator
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
py_clickhouse_migrator-2.0.0.tar.gz -
Subject digest:
ca11eb5d657146b0574e7889ce4b879e450de87c1ae56e7c6dd29584cea7da6e - Sigstore transparency entry: 1391617188
- Sigstore integration time:
-
Permalink:
Maksim-Burtsev/PyClickHouseMigrator@cd78ee585d8fdb141811bd59ab2aa88edfe6ab53 -
Branch / Tag:
refs/tags/v2.0.0 - Owner: https://github.com/Maksim-Burtsev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@cd78ee585d8fdb141811bd59ab2aa88edfe6ab53 -
Trigger Event:
push
-
Statement type:
File details
Details for the file py_clickhouse_migrator-2.0.0-py3-none-any.whl.
File metadata
- Download URL: py_clickhouse_migrator-2.0.0-py3-none-any.whl
- Upload date:
- Size: 21.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
47d5cc5eb9cd0b7c7760de5fceab48c6d2ada0fea3c3d454444948f8395f0ed7
|
|
| MD5 |
bee3254db151edde7fcbf294fb9b9479
|
|
| BLAKE2b-256 |
87b408bde17bac4d8e437851bd4ce7b7b13af0cdf980e87bfdf8f52437b01b1e
|
Provenance
The following attestation bundles were made for py_clickhouse_migrator-2.0.0-py3-none-any.whl:
Publisher:
release.yml on Maksim-Burtsev/PyClickHouseMigrator
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
py_clickhouse_migrator-2.0.0-py3-none-any.whl -
Subject digest:
47d5cc5eb9cd0b7c7760de5fceab48c6d2ada0fea3c3d454444948f8395f0ed7 - Sigstore transparency entry: 1391617245
- Sigstore integration time:
-
Permalink:
Maksim-Burtsev/PyClickHouseMigrator@cd78ee585d8fdb141811bd59ab2aa88edfe6ab53 -
Branch / Tag:
refs/tags/v2.0.0 - Owner: https://github.com/Maksim-Burtsev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@cd78ee585d8fdb141811bd59ab2aa88edfe6ab53 -
Trigger Event:
push
-
Statement type: