Skip to main content

Convert PostgreSQL DDL to ClickHouse DDL

Project description

pg2ch - PostgreSQL to ClickHouse DDL Converter

PyPI version License

Effortlessly migrate your PostgreSQL schemas to ClickHouse 🔄

Convert PostgreSQL DDL statements to ClickHouse format with intelligent type mapping, constraint handling, and schema optimization. Perfect for data migrations, analytics workflows, and multi-database architectures.

Features

  • 🎯 Smart Type Mapping - Automatic PostgreSQL → ClickHouse type conversion
  • 🔑 Primary Key Detection - Generates optimal ORDER BY clauses
  • 🛡️ Constraint Handling - Preserves NOT NULL, UNIQUE, DEFAULT values
  • 📝 Schema Support - Handles public.table notation seamlessly
  • 🔧 CLI Interface - Batch process multiple DDL files
  • DDL Validation - Optional syntax validation with ClickHouse Local
  • 📊 Metadata Extraction - Detailed table and column information

🚀 Quick Start

Installation

pip install pg2ch

Basic Usage

from pg2ch import convert_ddl

# Define the PostgreSQL DDL as a string
postgres_ddl = """
CREATE TABLE IF NOT EXISTS public.transactions (
    transaction_id    BIGSERIAL PRIMARY KEY,
    user_id           INTEGER NOT NULL,
    account_id        INTEGER NOT NULL,
    transaction_type  VARCHAR(20) NOT NULL,
    amount            DECIMAL(15,2) NOT NULL,
    currency          CHAR(3) DEFAULT 'USD',
    description       TEXT,
    created_at        TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
"""

# Convert the PostgreSQL DDL to ClickHouse DDL
clickhouse_ddl = convert_ddl(postgres_ddl)

# Print the resulting ClickHouse DDL
print("--- PostgreSQL DDL ---")
print(postgres_ddl)
print("\n--- Converted ClickHouse DDL ---")
print(clickhouse_ddl)

Output:

--- PostgreSQL DDL ---
CREATE TABLE IF NOT EXISTS public.transactions
(
    transaction_id   BIGSERIAL PRIMARY KEY,
    user_id          INTEGER NOT NULL,
    account_id       INTEGER NOT NULL,
    transaction_type VARCHAR(20) NOT NULL,
    amount           DECIMAL(15, 2) NOT NULL,
    currency         CHAR(3) DEFAULT 'USD',
    description      TEXT,
    created_at       TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

--- Converted ClickHouse DDL ---
CREATE TABLE IF NOT EXISTS transactions
(
    transaction_id   Int64,
    user_id          Int32,
    account_id       Int32,
    transaction_type String,
    amount           Decimal64(4),
    currency         Nullable(String)   DEFAULT 'USD',
    description      Nullable(String),
    created_at       Nullable(DateTime) DEFAULT NOW()
)
    ENGINE = MergeTree()
        ORDER BY (transaction_id);

🎯 Advanced Usage

Parse and Inspect Metadata

...

parser = PostgreSQLParser()
tables = parser.parse_ddl(postgres_ddl)

for table in tables:
    # table.to_json()
    # table.to_dict()
    # table.print_json()
    # table.get_column()

    # Access individual table properties
    print(f"Table: {table.name}")
    print(f"Columns: {len(table.columns)}")
    print(f"Primary Keys: {table.primary_keys}")

CLI Usage

# Convert a single file
pg2ch schema.sql

# Save to output file
pg2ch schema.sql --output clickhouse_schema.sql

# With validation (requires clickhouse-local)
pg2ch schema.sql --validate

DDL Validation

from pg2ch import convert_ddl, validate_clickhouse_ddl_with_local

postgres_ddl = """
CREATE TABLE IF NOT EXISTS public.transactions (
    transaction_id    BIGSERIAL PRIMARY KEY,
    user_id           INTEGER NOT NULL,
    account_id        INTEGER NOT NULL,
    transaction_type  VARCHAR(20) NOT NULL,
    amount            DECIMAL(15,2) NOT NULL,
    currency          CHAR(3) DEFAULT 'USD',
    description       TEXT,
    created_at        TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

"""

""" Run with 'validate=True' """

clickhouse_ddl = convert_ddl(postgres_ddl, validate=True)

""" or, use function """

is_valid, message = validate_clickhouse_ddl_with_local(clickhouse_ddl)
print(message)  # ✅ DDL is valid

📋 Type Mapping Reference

PostgreSQL ClickHouse Notes
SERIAL Int32 Auto-increment
BIGSERIAL Int64 Large auto-increment
VARCHAR(n) String Variable length text
TEXT String Unlimited text
INTEGER Int32 32-bit integer
BIGINT Int64 64-bit integer
BOOLEAN Bool True/false values
DECIMAL(p,s) Decimal64(4) Fixed precision
TIMESTAMP DateTime Date and time
TIMESTAMP WITH TIME ZONE DateTime Timezone-aware timestamps
JSONB String JSON data as string
UUID String UUID as string

🧪 Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=pg2ch

# Run specific test
pytest tests/test_transactions_table.py -v

📚 Documentation

📈 Roadmap

  • Foreign Key Support - Convert foreign key constraints
  • Index Migration - Transform PostgreSQL indexes
  • Partitioning Support - Migrate table partitioning schemes
  • GUI Interface - Web-based conversion tool

🐛 Known Limitations

  • Complex constraints (CHECK with subqueries) are simplified
  • PostgreSQL-specific functions may need manual adjustment
  • Some advanced PostgreSQL types require custom mapping

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


Made with ❤️ for the data engineering community

⭐ Star on GitHub🐛 Report Bug

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pg2ch-0.1.1.tar.gz (22.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pg2ch-0.1.1-py3-none-any.whl (19.2 kB view details)

Uploaded Python 3

File details

Details for the file pg2ch-0.1.1.tar.gz.

File metadata

  • Download URL: pg2ch-0.1.1.tar.gz
  • Upload date:
  • Size: 22.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for pg2ch-0.1.1.tar.gz
Algorithm Hash digest
SHA256 9e3208c74c9a6939be4d7a3f0327eb64c79a025069a6f60a3ba560a6e235a497
MD5 a4d12f44aa36cf1f5d10ff8ca83ade61
BLAKE2b-256 9ed88ec2642ef4c82143951905c42643c11410c6faaed4fd599cedcf77df747e

See more details on using hashes here.

File details

Details for the file pg2ch-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pg2ch-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 19.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for pg2ch-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 83a62db4b6106bdc2e305c61d5ea51bcbdd3562bd9b9cb13021a01db5b23c2fd
MD5 c4525f1a4e2e6fad8a9def5a99cba403
BLAKE2b-256 8c3f4c0d63785660c52bfd234e7713e28f3fb1bcbc8cc95bd1ac15f1cd7a5f3f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page