
datus-clickhouse

ClickHouse database adapter for Datus.

Installation

pip install datus-clickhouse

This will automatically install the required dependencies:

  • datus-agent
  • datus-sqlalchemy
  • clickhouse-sqlalchemy

Usage

The adapter is automatically registered with Datus when installed. Configure your database connection in your Datus configuration:

database:
  type: clickhouse
  host: localhost
  port: 8123
  username: default
  password: your_password
  database: your_database

Or use programmatically:

from datus_clickhouse import ClickHouseConfig, ClickHouseConnector

# Using config object
config = ClickHouseConfig(
    host="localhost",
    port=8123,
    username="default",
    password="your_password",
    database="mydb",
)
connector = ClickHouseConnector(config)

# Or using dict
connector = ClickHouseConnector({
    "host": "localhost",
    "port": 8123,
    "username": "default",
    "password": "your_password",
    "database": "mydb",
})

# Test connection
connector.test_connection()

# Execute query
result = connector.execute({"sql_query": "SELECT * FROM users LIMIT 10"})
print(result.sql_return)

# Get table list
tables = connector.get_tables()
print(f"Tables: {tables}")

# Get table schema
schema = connector.get_schema(table_name="users")
for column in schema:
    print(f"{column['name']}: {column['type']}")

Configuration Options

Option           Type  Default      Description
host             str   "localhost"  ClickHouse server host
port             int   8123         ClickHouse HTTP port
username         str   (required)   Username
password         str   ""           Password
database         str   None         Default database
timeout_seconds  int   30           Connection timeout
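The defaults in the table above can be captured in a plain dataclass. This is an illustrative mirror of the table only, not the actual ClickHouseConfig class shipped in datus-clickhouse:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative mirror of the configuration table -- not the real
# ClickHouseConfig class from datus-clickhouse.
@dataclass
class ClickHouseSettings:
    username: str                      # required, no default
    host: str = "localhost"
    port: int = 8123
    password: str = ""
    database: Optional[str] = None
    timeout_seconds: int = 30

settings = ClickHouseSettings(username="default")
print(settings.host, settings.port, settings.timeout_seconds)
```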

Features

  • Query execution via ClickHouse SQL (SELECT)
  • DDL execution (CREATE, ALTER, DROP)
  • DML operations (INSERT, ALTER TABLE UPDATE, DELETE)
  • Metadata retrieval (databases, tables, views, columns)
  • Sample data extraction
  • Multiple result formats (pandas, arrow, csv, list)
  • Connection pooling and management
  • Comprehensive error handling
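The adapter's API for selecting a result format is not shown above; as a rough sketch of what converting query rows between the listed formats (list, csv) involves, using only the standard library and hypothetical row data:

```python
import csv
import io

# Hypothetical row data, shaped like a query result in "list" format.
columns = ["id", "name"]
rows = [(1, "alice"), (2, "bob")]

# list-of-dicts view of the same result
records = [dict(zip(columns, row)) for row in rows]

# CSV view, built with the standard library
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(columns)
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```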

Testing

Quick Start

cd datus-clickhouse

# Unit tests (no database required)
uv run pytest tests/ -m "not integration" -v

# All tests with coverage
uv run pytest tests/ -v --cov=datus_clickhouse --cov-report=term-missing

Integration Tests (Requires ClickHouse Server)

# Start ClickHouse container
docker compose up -d

# Wait for container to become healthy (~15s)
docker compose ps

# Run integration tests
uv run pytest tests/integration/ -v

# Run only TPC-H tests
uv run pytest tests/integration/test_tpch.py -v

# Run acceptance tests (core functionality)
uv run pytest tests/ -m acceptance -v

# Stop ClickHouse
docker compose down
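The repository's own docker-compose.yml is not reproduced here; a minimal sketch of the kind of service the commands above expect (the image tag, port mapping, and healthcheck are assumptions, not the project's actual file):

```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"   # HTTP interface used by the adapter
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:8123/ping"]
      interval: 5s
      timeout: 3s
      retries: 10
```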

TPC-H Test Data

Integration tests include TPC-H benchmark data for realistic query testing. The tpch_setup fixture (session-scoped) automatically creates 5 tables with sample data:

Table          Rows  Description
tpch_region    5     Standard TPC-H regions
tpch_nation    25    Standard TPC-H nations
tpch_customer  10    Simplified customer data
tpch_orders    15    Simplified order data
tpch_supplier  5     Simplified supplier data

Tables are created at the start of the test session and dropped after all tests complete.

Initialize TPC-H Data Manually

To create TPC-H data for use with Datus (outside of tests):

# Basic usage
uv run python scripts/init_tpch_data.py

# Drop existing tables and re-create
uv run python scripts/init_tpch_data.py --drop

# Custom connection
uv run python scripts/init_tpch_data.py --host 192.168.1.100 --port 8123 --username admin --password secret

Test Statistics

  • Unit Tests: 45 tests (config validation, connector logic, identifiers)
  • Integration Tests: 20 tests (connection, metadata, SQL execution)
  • TPC-H Tests: 9 tests (metadata queries, joins, aggregations, CSV format)
  • Total: 74 tests

Test Markers

Marker       Description
integration  Requires running ClickHouse server
acceptance   Core functionality validation for CI/CD

Development

Setup

# From workspace root
uv sync --all-packages

# Or install in editable mode
uv pip install -e .

Code Quality

black datus_clickhouse tests
isort datus_clickhouse tests
ruff check datus_clickhouse tests

ClickHouse SQL Notes

ClickHouse has some syntax differences from standard SQL:

  • UPDATE: Use ALTER TABLE <table> UPDATE ... WHERE ... instead of UPDATE <table> SET ...
  • DELETE: Supports lightweight deletes with DELETE FROM <table> WHERE ...
  • Identifiers: Use backticks for quoting: `database`.`table`
  • No schema layer: Databases serve as schemas; there is no separate schema concept

Requirements

  • Python >= 3.12
  • ClickHouse >= 20.1
  • datus-agent > 0.2.1
  • datus-sqlalchemy >= 0.1.0
  • clickhouse-sqlalchemy >= 0.3.2

License

Apache License 2.0
