ch-analytics-mcp

MCP server for ClickHouse analytics — schema discovery, data exploration, storage & engines, performance, data quality, multi-env support

An MCP server for ClickHouse analytics — a general-purpose "DBA-lite" toolkit that gives any MCP client (Claude Code, Cursor, etc.) instant visibility into schema structure, data quality, storage engines, performance, and multi-environment comparison.

What it does

Exposes 20 read-only tools (+ 4 optional) organised in 6 categories:

Schema Discovery

  • database_summary — high-level overview: database/table/view counts, engine distribution, compression stats
  • scan_databases — row counts for every table (instant from metadata, no COUNT(*))
  • describe_table — column details: name, type, nullable, default, compression codec, comment
  • table_sizes — disk usage (compressed + uncompressed) per table with compression ratio
  • find_tables — search tables by name pattern (ILIKE)
  • find_columns — find tables that have a column matching a pattern
  • list_empty_tables — quickly find tables with 0 rows
  • list_environments — list configured environments
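The "no COUNT(*)" trick behind scan_databases relies on ClickHouse keeping approximate row and byte counts in system.tables metadata, so a full scan is never needed. A minimal sketch of the kind of query such a tool might issue (the helper name is hypothetical, not the server's actual code):

```python
def scan_databases_sql(databases: list[str]) -> str:
    """Build a metadata-only row-count query: system.tables already
    stores total_rows/total_bytes, so no COUNT(*) scan is required."""
    db_list = ", ".join(f"'{db}'" for db in databases)
    return (
        "SELECT database, name, total_rows, total_bytes "
        "FROM system.tables "
        f"WHERE database IN ({db_list}) "
        "ORDER BY total_rows DESC"
    )
```

Because the counts come from table metadata, results are instant even on multi-billion-row tables (and may be approximate for some engines).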

Data Exploration

  • recent_rows — peek at the most recent rows (auto-detects timestamp or sorting key)
  • column_value_counts — distinct values and frequencies for a column
  • column_stats — min, max, avg, null count, distinct count for a column
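A tool like column_stats can gather all of its aggregates in a single pass over the column. A sketch of the query it might build (helper name hypothetical; avg() assumes a numeric column):

```python
def column_stats_sql(table: str, column: str) -> str:
    """Single-pass aggregate query for one column: min, max, avg,
    null count, and approximate distinct count (uniq)."""
    return (
        f"SELECT min(`{column}`) AS min_val, max(`{column}`) AS max_val, "
        f"avg(`{column}`) AS avg_val, countIf(`{column}` IS NULL) AS nulls, "
        f"uniq(`{column}`) AS distinct_vals "
        f"FROM `{table}`"
    )
```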

Storage & Engines

  • table_engines — list tables by engine (MergeTree, Distributed, etc.) with sorting/partition/primary keys
  • partition_info — partition details: part count, rows, compressed/uncompressed size, date range
  • compare_envs — compare row counts across DEV / STG / PROD
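Conceptually, compare_envs only has to merge per-environment row counts and keep the tables where they disagree. A simplified sketch of that merge step (data shapes and function name are assumptions, not the server's actual implementation):

```python
def compare_row_counts(envs: dict[str, dict[str, int]]) -> list[tuple[str, dict[str, int]]]:
    """Given {env: {table: row_count}}, return (table, per-env counts)
    for every table whose counts differ between environments."""
    tables = set().union(*(counts.keys() for counts in envs.values()))
    mismatches = []
    for table in sorted(tables):
        counts = {env: envs[env].get(table, 0) for env in envs}
        if len(set(counts.values())) > 1:  # at least two envs disagree
            mismatches.append((table, counts))
    return mismatches
```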

Performance

  • data_skipping_indices — list data skipping indices (minmax, set, bloom_filter, etc.)
  • slow_query_candidates — find recent slow queries from system.query_log
  • parts_analysis — analyse part counts and compression (merge pressure detection)
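Merge pressure detection boils down to counting active parts per table in system.parts and flagging outliers: a table accumulating many small parts is a sign that background merges are falling behind inserts. A sketch under those assumptions (the threshold and helper names are illustrative):

```python
def parts_analysis_sql(database: str) -> str:
    """Aggregate active parts per table from system.parts."""
    return (
        "SELECT table, count() AS part_count, sum(rows) AS rows, "
        "sum(data_compressed_bytes) AS compressed "
        "FROM system.parts "
        f"WHERE active AND database = '{database}' "
        "GROUP BY table ORDER BY part_count DESC"
    )

def merge_pressure(parts_per_table: dict[str, int], threshold: int = 300) -> list[str]:
    """Flag tables whose active part count exceeds the threshold."""
    return sorted(t for t, n in parts_per_table.items() if n > threshold)
```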

Data Quality

  • table_health — row count + last inserted_at/updated_at for a table
  • null_report — null percentage for every column in a table
  • duplicate_check — find duplicate rows based on a set of columns
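The usual SQL shape for a duplicate check is a GROUP BY over the chosen key columns with a HAVING filter on the group size. A sketch of the query a tool like duplicate_check might build (function name hypothetical; the 100-row cap mirrors the limits described under Security):

```python
def duplicate_check_sql(table: str, key_columns: list[str], limit: int = 100) -> str:
    """Group by the key columns and keep groups with more than one
    row — each surviving group is a duplicated key combination."""
    cols = ", ".join(f"`{c}`" for c in key_columns)
    return (
        f"SELECT {cols}, count() AS dup_count "
        f"FROM `{table}` GROUP BY {cols} "
        f"HAVING dup_count > 1 ORDER BY dup_count DESC LIMIT {limit}"
    )
```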

Fail Table Tracking (opt-in via CH_FAIL_DATABASE)

These four tools are registered only when CH_FAIL_DATABASE is set. They discover *_fails tables in the specified database that expose run_id, stage, comment, and failed_at columns.

  • pipeline_fail_tables — discover all *_fails tables with row counts and stats
  • pipeline_fail_summary — cross-entity failure summary grouped by entity, stage, or both
  • pipeline_fail_details — drill into a specific entity's fail table with optional filters
  • pipeline_fail_runs — analyse which runs generated the most failures
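The discovery step reduces to filtering table names by the *_fails suffix and checking for the four required columns. A simplified sketch of that filter (data shape and name are illustrative, not the server's actual code):

```python
REQUIRED_FAIL_COLUMNS = {"run_id", "stage", "comment", "failed_at"}

def discover_fail_tables(tables: dict[str, set[str]]) -> list[str]:
    """Given {table_name: column_names}, keep tables that follow the
    *_fails naming convention and expose the required columns."""
    return sorted(
        name for name, columns in tables.items()
        if name.endswith("_fails") and REQUIRED_FAIL_COLUMNS <= columns
    )
```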

Quick start

Prerequisites

  • Python 3.12+
  • uv (recommended) or pip
  • One or more ClickHouse instances

Install

Option A — run directly with uvx (no clone needed):

uvx ch-analytics-mcp

Option B — clone and run:

git clone https://github.com/fabdendev/ch-analytics-mcp.git
cd ch-analytics-mcp
uv sync

Configure

Set environment variables for each ClickHouse environment (at least one is required):

Variable               Description                                                          Required
CH_LOCAL_URL           ClickHouse URL for LOCAL                                             At least one URL
CH_DEV_URL             ClickHouse URL for DEV                                               At least one URL
CH_STG_URL             ClickHouse URL for STG                                               Optional
CH_PROD_URL            ClickHouse URL for PROD                                              Optional
CH_INCLUDE_DATABASES   Comma-separated allowlist of databases to scan                       Optional
CH_IGNORE_DATABASES    Comma-separated databases to skip (added to internal exclusions)     Optional
CH_FAIL_DATABASE       Database containing *_fails tables — enables pipeline_fail_* tools   Optional
CH_READ_ONLY           Reserved for future write tools (not yet used)                       Optional

Example:

export CH_DEV_URL="clickhouse://user:pass@host:8123/default"
export CH_STG_URL="clickhouse://user:pass@host:8123/default"   # optional
export CH_PROD_URL="clickhouse://user:pass@host:8123/default"  # optional

# Database filtering (optional — pick one, not both)
export CH_INCLUDE_DATABASES="core,trading,pipeline"  # only scan these
export CH_IGNORE_DATABASES="tmp,scratch"             # skip these

URL format: clickhouse://user:pass@host:port/database (HTTP interface, port 8123). For HTTPS: https://user:pass@host:port/database (port 8443).

If both CH_INCLUDE_DATABASES and CH_IGNORE_DATABASES are set, the include list takes precedence.

Add to Claude Code

Add to your Claude Code MCP settings (~/.claude/settings.json or .mcp.json):

If using uvx:

{
  "mcpServers": {
    "ch-analytics": {
      "command": "uvx",
      "args": ["ch-analytics-mcp"],
      "env": {
        "CH_DEV_URL": "clickhouse://user:pass@host:8123/default",
        "CH_STG_URL": "clickhouse://user:pass@host:8123/default"
      }
    }
  }
}

If installed from clone:

{
  "mcpServers": {
    "ch-analytics": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/ch-analytics-mcp", "ch-analytics-mcp"],
      "env": {
        "CH_DEV_URL": "clickhouse://user:pass@host:8123/default"
      }
    }
  }
}

Security

All tools are read-only. No data is ever modified. Additional safeguards:

  • Identifier validation — all user-provided database/table/column names are validated against ^[a-zA-Z_][a-zA-Z0-9_]*$ and backtick-quoted
  • Row limits — row-level queries capped at 100, aggregation queries at 200
  • Query timeout — potentially expensive queries (null_report, column_stats, duplicate_check, column_value_counts) use a 30s max_execution_time
  • Direction validation — order_dir restricted to ASC/DESC only
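The identifier safeguard described above amounts to a validate-then-quote step: anything failing the allowlist pattern is rejected before it ever reaches a query. A minimal sketch of that pattern (the function name is hypothetical):

```python
import re

# Allowlist pattern from the Security notes above.
_IDENT = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")

def quote_identifier(name: str) -> str:
    """Validate a user-supplied database/table/column name against
    the allowlist, then backtick-quote it; reject anything else."""
    if not _IDENT.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return f"`{name}`"
```

Validating before quoting means injection attempts like `events; DROP TABLE x` fail fast with an error instead of being interpolated into SQL.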

Multi-environment support

Configure up to 4 environments (LOCAL, DEV, STG, PROD). The first configured environment becomes the default. Use compare_envs to quickly spot row count differences across environments.

Development

uv sync --extra dev
uv run ruff check ch_analytics_mcp/    # lint
uv run python -m ch_analytics_mcp      # start server locally

License

MIT
