Validate a semql Catalog against a live database — catches missing tables, dropped columns, and broken join predicates before a deploy.

semql-validate-db

Pre-deploy drift checker for semql catalogs. Runs cheap probe queries against a live database and surfaces the class of bugs the compiler can't see — missing tables, dropped columns, broken join predicates, base-predicate drift.

semql is intentionally pure (PHILOSOPHY: "the compiler has no I/O"). That keeps the compiler simple, but it also means a catalog can pass every compile-time check and still blow up at query time because upstream renamed a column. semql-validate-db is the out-of-band gate that catches it.

Use this for ongoing drift detection on a catalog you already authored. For greenfield scaffolding from a database's existing schema, see semql-introspect — it generates Cube stubs from the information schema, which is the opposite direction: introspect goes DB → catalog, validate-db goes catalog → DB.

Install

pip install semql-validate-db

The package is driver-agnostic. Bring your own DB-API 2.0 connection:

pip install psycopg              # Postgres
pip install clickhouse-connect   # ClickHouse
pip install duckdb               # DuckDB

Quick start

import duckdb
from semql import Dialect, Catalog, Cube, Dimension, Measure, TimeDimension
from semql_validate_db import validate_against_db

orders = Cube(
    name="orders",
    dialect=Dialect.DUCKDB,
    table="orders",
    alias="o",
    measures=[Measure(name="revenue", sql="{o}.amount", agg="sum")],
    dimensions=[Dimension(name="region", sql="{o}.region", type="string")],
    time_dimensions=[TimeDimension(name="created_at", sql="{o}.created_at")],
)
catalog = Catalog([orders])

conn = duckdb.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (amount DOUBLE, region TEXT, created_at TIMESTAMP)"
)

errors = validate_against_db(catalog, connection=conn)
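for e in errors:
    # e.field is empty for cube-level findings (e.g. a missing table).
    where = f"{e.cube}.{e.field}" if e.field else e.cube
    print(f"{e.code}: {where}: {e.message}")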

A clean run returns an empty list. Drift (a missing column, a renamed table) yields one DbValidationError per finding so a single run gives the full picture instead of bailing on the first failure.

What it catches

  • missing_table: cube.table doesn't exist or the connection's role can't see it.
  • missing_column: a measure / dimension / time-dimension SQL fragment references a column that no longer exists.
  • base_predicate_invalid: cube.base_predicate doesn't execute.
  • join_predicate_invalid: a Join.on predicate references columns that aren't there, or compares incompatible types.

(A required_filters entry that names no real dimension is now rejected at catalog construction — it can't reach this pre-deploy stage, so there's no DB-level check for it.)
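To make the four finding codes concrete, here is a minimal sketch of what probe-by-fragment checking could look like. It is not the package's implementation: `Finding`, `probe`, `check_cube`, and `check_join` are hypothetical names, fragments are passed as plain SQL rather than `{o}` templates, and the standard library's sqlite3 stands in for a real warehouse. SQLite is loosely typed, so the "incompatible types" half of join_predicate_invalid would not show up on this engine.

```python
import sqlite3
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Finding:
    # Hypothetical stand-in for DbValidationError; attribute names follow the quick start.
    code: str
    cube: str
    field: Optional[str]
    message: str


def probe(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Run one probe; return the driver's error text, or None if it executed."""
    try:
        conn.execute(sql).fetchall()
        return None
    except sqlite3.Error as exc:
        return str(exc)


def check_cube(conn, cube: str, table: str, alias: str,
               fragments: Dict[str, str],
               base_predicate: Optional[str] = None) -> List[Finding]:
    """Collect every finding for one cube instead of stopping at the first."""
    source = f"{table} AS {alias}"
    err = probe(conn, f"SELECT 1 FROM {source} LIMIT 0")
    if err:
        # Design choice of this sketch: skip column probes when the table is
        # missing, since they would all fail for the same reason.
        return [Finding("missing_table", cube, None, err)]
    findings = []
    for name, expr in fragments.items():
        err = probe(conn, f"SELECT {expr} FROM {source} LIMIT 0")
        if err:
            findings.append(Finding("missing_column", cube, name, err))
    if base_predicate:
        err = probe(conn, f"SELECT 1 FROM {source} WHERE {base_predicate} LIMIT 0")
        if err:
            findings.append(Finding("base_predicate_invalid", cube, None, err))
    return findings


def check_join(conn, cube: str, left: str, right: str, on: str) -> List[Finding]:
    """Probe a join predicate between two already-aliased sources."""
    err = probe(conn, f"SELECT 1 FROM {left} JOIN {right} ON {on} LIMIT 0")
    return [Finding("join_predicate_invalid", cube, None, err)] if err else []


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL, region TEXT)")  # no created_at
conn.execute("CREATE TABLE customers (id INTEGER)")

findings = check_cube(
    conn, "orders", "orders", "o",
    {"revenue": "o.amount", "region": "o.region", "created_at": "o.created_at"},
    base_predicate="o.deleted_at IS NULL",
)
findings += check_join(conn, "orders", "orders AS o", "customers AS c",
                       "o.customer_id = c.id")
codes = [(f.code, f.field) for f in findings]
print(codes)
```

Each probe is independent, which is what lets a single run report the dropped column, the stale base predicate, and the broken join together.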

What it doesn't catch

  • Semantic drift (a column exists but means something different now). Schema is necessary, not sufficient.
  • Cross-table referential integrity. The probes are LIMIT 0; they parse, they don't sample.
  • Dialect-specific feature drift (a function got dialect-renamed). Use the compiler's snapshot tests for that.

Why LIMIT 0?

Every probe runs SELECT … LIMIT 0. The query planner type-checks identifiers and predicates but does no row work, so the cost is microseconds per probe — fine for a per-cube fan-out in CI. The trade-off is that purely runtime drift (e.g. an enum value that got dropped from a check constraint) won't surface here.
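The trade-off can be seen in a small experiment. This is an illustrative sketch using the standard library's sqlite3 as a stand-in engine, with a hypothetical `ledger` table; other databases may behave differently. It relies on SQLite's documented behaviour that abs() of the smallest 64-bit integer raises an overflow error, which is a failure that depends on the data rather than the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (delta INTEGER)")
# One row holds the smallest 64-bit integer; abs() of it overflows in SQLite.
conn.execute("INSERT INTO ledger VALUES (?)", (-(2**63),))


def runs_cleanly(sql: str) -> bool:
    """Return True if the statement executes and drains without a driver error."""
    try:
        conn.execute(sql).fetchall()
        return True
    except sqlite3.Error:
        return False


# The probe resolves the identifiers but never evaluates abs() on a row.
probe_ok = runs_cleanly("SELECT abs(delta) FROM ledger LIMIT 0")
# The unbounded query has to evaluate the expression and hits the bad row.
full_ok = runs_cleanly("SELECT abs(delta) FROM ledger")
print(probe_ok, full_ok)
```

The probe passes while the unbounded query fails, which is the class of runtime-only problem a LIMIT 0 check leaves to other tests.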

CLI

The package is library-first and ships no CLI of its own. Any command-line wrapper belongs in the caller's deploy script, where the connection, DSN, and role are already known.
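A deploy script usually needs little more than an exit code. The sketch below shows one way to turn a list of findings into one; `Finding` and `gate` are hypothetical names, and `Finding` only mirrors the attributes used in the quick start (code, cube, field, message). The wiring to validate_against_db is shown as a comment because it depends on how your script obtains its catalog and connection.

```python
import sys
from dataclasses import dataclass
from typing import Iterable, Optional, TextIO


@dataclass
class Finding:
    # Hypothetical stand-in for DbValidationError, for illustration only.
    code: str
    cube: str
    field: Optional[str]
    message: str


def gate(findings: Iterable[Finding], out: TextIO = sys.stderr) -> int:
    """Print every finding and return an exit code: 0 if clean, 1 on drift."""
    findings = list(findings)
    for f in findings:
        where = f"{f.cube}.{f.field}" if f.field else f.cube
        print(f"{f.code}: {where}: {f.message}", file=out)
    return 1 if findings else 0


# In a real deploy script (assumed wiring):
#     errors = validate_against_db(catalog, connection=conn)
#     sys.exit(gate(errors))

clean_exit = gate([])
drift_exit = gate([
    Finding("missing_column", "orders", "created_at", "column does not exist"),
])
```

Returning the code instead of calling sys.exit inside `gate` keeps the function easy to test and leaves the decision to abort with the caller.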

Status

Phase A: probe-by-fragment shape. Drift findings are accurate; performance is "fine for CI, not for runtime gates."
