

ch-analytics-mcp

An MCP server for ClickHouse analytics — a general-purpose "DBA-lite" toolkit that gives any MCP client (Claude Code, Cursor, etc.) instant visibility into schema structure, data quality, storage engines, performance, multi-environment comparison, and data modeling.

What it does

Exposes 30 read-only tools organised into six categories:

Schema Discovery

  • database_summary — high-level overview: database/table/view counts, engine distribution, compression stats
  • scan_databases — row counts for every table (instant from metadata, no COUNT(*))
  • describe_table — column details: name, type, nullable, default, compression codec, comment
  • table_sizes — disk usage (compressed + uncompressed) per table with compression ratio
  • find_tables — search tables by name pattern (ILIKE)
  • find_columns — find tables that have a column matching a pattern
  • list_empty_tables — quickly find tables with 0 rows
  • list_environments — list configured environments
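The "instant from metadata, no COUNT(*)" behaviour of scan_databases presumably relies on ClickHouse's system.tables table, which stores a total_rows estimate per table. A minimal sketch of how such a query could be built (the function name is hypothetical; the server's actual query may differ):

```python
def build_row_count_query(databases: list[str]) -> str:
    """Build a query that reads row counts from system.tables metadata
    instead of running a COUNT(*) scan per table."""
    db_list = ", ".join(f"'{db}'" for db in databases)
    return (
        "SELECT database, name, total_rows "
        "FROM system.tables "
        f"WHERE database IN ({db_list}) "
        "ORDER BY database, name"
    )
```

Because system.tables is an in-memory catalog, this returns in milliseconds regardless of table size (at the cost of total_rows being an estimate for some engines).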

Data Exploration

  • recent_rows — peek at the most recent rows (auto-detects timestamp or sorting key)
  • column_value_counts — distinct values and frequencies for a column
  • column_stats — min, max, avg, null count, distinct count for a column

Storage & Engines

  • table_engines — list tables by engine (MergeTree, Distributed, etc.) with sorting/partition/primary keys
  • partition_info — partition details: part count, rows, compressed/uncompressed size, date range
  • compare_envs — compare row counts across DEV / STG / PROD

Performance

  • data_skipping_indices — list data skipping indices (minmax, set, bloom_filter, etc.)
  • slow_query_candidates — find recent slow queries from system.query_log
  • parts_analysis — analyse part counts and compression (merge pressure detection)

Data Quality

  • table_health — row count + last inserted_at/updated_at for a table
  • null_report — null percentage for every column in a table
  • duplicate_check — find duplicate rows based on a set of columns

Data Modeling & Discovery

  • infer_relationships — scan *_id columns and infer foreign-key relationships by matching table names
  • column_overlap — find column names appearing in multiple tables (join keys, shared dimensions)
  • data_profile — one-shot full profile per column: type, cardinality, null%, uniqueness, min/max, samples
  • sample_values — distinct values with frequency, data type, avg length, and uniqueness flag
  • cardinality_check — determine relationship type (1:1, 1:N, N:1, N:M) between two tables via a join column
  • key_candidates — identify potential primary/natural key columns based on uniqueness and null rate
  • naming_conventions — analyze column naming patterns (suffixes like _id, _at, _type, etc.)
  • type_census — distribution of base data types across all visible tables
  • temporal_columns — find all Date/DateTime columns with their time ranges
  • table_activity — show table activity based on part modification times (active vs stale)
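The classification step behind cardinality_check can be sketched as a pure function over the maximum number of rows per join-key value on each side. The function name and the left/right orientation convention here are assumptions for illustration, not the server's actual code:

```python
def classify_relationship(max_left_per_key: int, max_right_per_key: int) -> str:
    """Classify a join relationship from the maximum row count per join-key
    value on each side: 1 on both sides means 1:1, >1 on exactly one side
    means 1:N or N:1, >1 on both means N:M."""
    left_many = max_left_per_key > 1
    right_many = max_right_per_key > 1
    if left_many and right_many:
        return "N:M"
    if left_many:
        return "N:1"
    if right_many:
        return "1:N"
    return "1:1"
```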

Quick start

Prerequisites

  • Python 3.12+
  • uv (recommended) or pip
  • One or more ClickHouse instances

Install

Option A — run directly with uvx (no clone needed):

uvx ch-analytics-mcp

Option B — clone and run:

git clone https://github.com/fabdendev/ch-analytics-mcp.git
cd ch-analytics-mcp
uv sync

Configure

Set environment variables for each ClickHouse environment (at least one is required):

| Variable | Description | Required |
|---|---|---|
| CH_LOCAL_URL | ClickHouse URL for LOCAL | At least one URL |
| CH_DEV_URL | ClickHouse URL for DEV | At least one URL |
| CH_STG_URL | ClickHouse URL for STG | Optional |
| CH_PROD_URL | ClickHouse URL for PROD | Optional |
| CH_INCLUDE_DATABASES | Comma-separated allowlist of databases to scan | Optional |
| CH_IGNORE_DATABASES | Comma-separated databases to skip (added to internal exclusions) | Optional |
| CH_READ_ONLY | Reserved for future write tools (not yet used) | Optional |

export CH_DEV_URL="clickhouse://user:pass@host:8123/default"
export CH_STG_URL="clickhouse://user:pass@host:8123/default"   # optional
export CH_PROD_URL="clickhouse://user:pass@host:8123/default"  # optional

# Database filtering (optional — pick one, not both)
export CH_INCLUDE_DATABASES="core,trading,pipeline"  # only scan these
export CH_IGNORE_DATABASES="tmp,scratch"              # skip these

URL format: clickhouse://user:pass@host:port/database (HTTP interface, port 8123). For HTTPS: https://user:pass@host:port/database (port 8443).
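The documented URL format maps cleanly onto Python's standard urllib.parse. A sketch of how such a URL could be decomposed, including the default-port rule above (this mirrors the documented format; the server's own parser may differ):

```python
from urllib.parse import urlparse

def parse_ch_url(url: str) -> dict:
    """Split a clickhouse:// or https:// connection URL into its parts,
    applying default ports 8123 (HTTP) and 8443 (HTTPS)."""
    p = urlparse(url)
    return {
        "scheme": p.scheme,
        "user": p.username,
        "password": p.password,
        "host": p.hostname,
        "port": p.port or (8443 if p.scheme == "https" else 8123),
        "database": (p.path or "/").lstrip("/") or "default",
    }
```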

If both CH_INCLUDE_DATABASES and CH_IGNORE_DATABASES are set, the include list takes precedence.
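The precedence rule can be sketched as follows. The function name and the set of built-in internal exclusions are assumptions; only the precedence behaviour is taken from the documentation:

```python
def visible_databases(all_dbs, include=None, ignore=None,
                      internal=("system", "INFORMATION_SCHEMA", "information_schema")):
    """Apply the documented filtering rules: an include allowlist, when set,
    takes precedence; otherwise the ignore list is applied on top of the
    built-in internal exclusions."""
    if include:
        allowed = set(include)
        return [db for db in all_dbs if db in allowed]
    skip = set(internal) | set(ignore or ())
    return [db for db in all_dbs if db not in skip]
```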

Add to Claude Code

Add to your Claude Code MCP settings (~/.claude/settings.json or .mcp.json):

If using uvx:

{
  "mcpServers": {
    "ch-analytics": {
      "command": "uvx",
      "args": ["ch-analytics-mcp"],
      "env": {
        "CH_DEV_URL": "clickhouse://user:pass@host:8123/default",
        "CH_STG_URL": "clickhouse://user:pass@host:8123/default"
      }
    }
  }
}

If installed from clone:

{
  "mcpServers": {
    "ch-analytics": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/ch-analytics-mcp", "ch-analytics-mcp"],
      "env": {
        "CH_DEV_URL": "clickhouse://user:pass@host:8123/default"
      }
    }
  }
}

Security

All tools are read-only. No data is ever modified. Additional safeguards:

  • Identifier validation — all user-provided database/table/column names are validated against ^[a-zA-Z_][a-zA-Z0-9_]*$ and backtick-quoted
  • Row limits — row-level queries capped at 100, aggregation queries at 200
  • Query timeout — potentially expensive queries (null_report, column_stats, duplicate_check, column_value_counts, data_profile, sample_values, cardinality_check, key_candidates, temporal_columns) use a 30s max_execution_time
  • Direction validation — order_dir restricted to ASC/DESC only
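The identifier-validation safeguard above can be sketched with the documented pattern. The helper name and error handling are illustrative, not the server's actual code:

```python
import re

# Documented identifier pattern: letter or underscore, then letters/digits/underscores.
_IDENT = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")

def quote_identifier(name: str) -> str:
    """Validate a user-supplied database/table/column name against the
    documented pattern and backtick-quote it; reject anything else, which
    blocks injection via identifier positions."""
    if not _IDENT.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return f"`{name}`"
```

Validating identifiers separately matters because identifiers cannot be passed as bound query parameters the way values can.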

Multi-environment support

Configure up to 4 environments (LOCAL, DEV, STG, PROD). The first configured environment becomes the default. Use compare_envs to quickly spot row count differences across environments.
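The "first configured environment becomes the default" rule can be sketched as a scan over the CH_*_URL variables. The LOCAL → DEV → STG → PROD precedence order is assumed from the order the environments are listed, not confirmed by the source:

```python
import os

ENV_ORDER = ("LOCAL", "DEV", "STG", "PROD")  # assumed precedence order

def default_environment(environ=os.environ):
    """Return the first environment whose CH_<ENV>_URL variable is set,
    or None if no environment is configured."""
    for env in ENV_ORDER:
        if environ.get(f"CH_{env}_URL"):
            return env
    return None
```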

Development

uv sync --extra dev
uv run ruff check ch_analytics_mcp/    # lint
uv run python -m ch_analytics_mcp      # start server locally

License

MIT
