Validate a semql Catalog against a live database — catches missing tables, dropped columns, and broken join predicates before a deploy.

semql-validate-db

Pre-deploy drift checker for semql catalogs. Runs cheap probe queries against a live database and surfaces the class of bugs the compiler can't see — missing tables, dropped columns, broken join predicates, base-predicate drift.

semql is intentionally pure (PHILOSOPHY: "the compiler has no I/O"). That keeps the compiler simple, but it also means a catalog can pass every compile-time check and still blow up at query time because upstream renamed a column. semql-validate-db is the out-of-band gate that catches it.

Use this for ongoing drift detection on a catalog you already authored. For greenfield scaffolding from a database's existing schema, see semql-introspect — it generates Cube stubs from the information schema, which is the opposite direction: introspect goes DB → catalog, validate-db goes catalog → DB.

Install

pip install semql-validate-db

The package is driver-agnostic. Bring your own DB-API 2.0 connection:

pip install psycopg              # Postgres
pip install clickhouse-connect   # ClickHouse
pip install duckdb               # DuckDB

Quick start

import duckdb
from semql import Dialect, Catalog, Cube, Dimension, Measure, TimeDimension
from semql_validate_db import validate_against_db

orders = Cube(
    name="orders",
    dialect=Dialect.DUCKDB,
    table="orders",
    alias="o",
    measures=[Measure(name="revenue", sql="{o}.amount", agg="sum")],
    dimensions=[Dimension(name="region", sql="{o}.region", type="string")],
    time_dimensions=[TimeDimension(name="created_at", sql="{o}.created_at")],
)
catalog = Catalog([orders])

conn = duckdb.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (amount DOUBLE, region TEXT, created_at TIMESTAMP)"
)

errors = validate_against_db(catalog, connection=conn)
for e in errors:
    print(f"{e.code}: {e.cube}.{e.field or ''}: {e.message}")

A clean run returns an empty list. Drift (a missing column, a renamed table) yields one DbValidationError per finding so a single run gives the full picture instead of bailing on the first failure.
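Because every finding comes back in one list, a caller can group them by cube for a readable report. The sketch below uses a stand-in `Finding` dataclass that only mirrors the four attributes the quick start reads (`code`, `cube`, `field`, `message`). It is not the package's `DbValidationError` class, and `group_by_cube` is a hypothetical helper, not part of the library.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """Stand-in with the attributes the quick start reads off each error."""
    code: str
    cube: str
    field: Optional[str]
    message: str


def group_by_cube(findings):
    """Return {cube name: [formatted lines]}, preserving input order."""
    grouped = defaultdict(list)
    for f in findings:
        # Cube-level findings (e.g. a missing table) carry no field name.
        target = f"{f.cube}.{f.field}" if f.field else f.cube
        grouped[f.cube].append(f"{f.code}: {target}: {f.message}")
    return dict(grouped)


# Made-up findings, for illustration only.
findings = [
    Finding("missing_column", "orders", "revenue", "column amount not found"),
    Finding("missing_table", "customers", None, "table customers not found"),
    Finding("missing_column", "orders", "region", "column region not found"),
]
report = group_by_cube(findings)
for cube, lines in report.items():
    print(cube)
    for line in lines:
        print(f"  {line}")
```

In a real script the list would come from `validate_against_db(catalog, connection=conn)` rather than hand-built stand-ins.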

What it catches

  • missing_table: cube.table doesn't exist or the connection's role can't see it.
  • missing_column: a measure / dimension / time-dimension SQL fragment references a column that no longer exists.
  • base_predicate_invalid: cube.base_predicate doesn't execute.
  • join_predicate_invalid: a Join.on predicate references columns that aren't there, or compares incompatible types.

(A required_filters entry that names no real dimension is now rejected at catalog construction — it can't reach this pre-deploy stage, so there's no DB-level check for it.)
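To make the first two checks concrete, here is a rough sketch of how a table probe followed by one probe per SQL fragment could work. It uses the standard-library `sqlite3` driver so it runs anywhere. `probe` and `check_cube` are hypothetical names, the `{o}` placeholder is assumed to resolve to the cube's alias, and labelling every failed fragment `missing_column` is a simplification. This is not the package's actual implementation.

```python
import sqlite3


def probe(conn, sql):
    """Run a zero-row probe; return None on success, else the driver's message."""
    try:
        conn.execute(sql).fetchall()
        return None
    except sqlite3.Error as exc:
        return str(exc)


def check_cube(conn, table, alias, fragments):
    """Return (code, field, message) findings for one cube.

    `fragments` maps a field name to a SQL fragment whose `{<alias>}`
    placeholder stands for the table alias, as in the quick start.
    """
    err = probe(conn, f"SELECT 1 FROM {table} LIMIT 0")
    if err is not None:
        # No point probing columns of a table we cannot see.
        return [("missing_table", None, err)]
    findings = []
    for field, fragment in fragments.items():
        expr = fragment.format(**{alias: alias})  # "{o}.amount" -> "o.amount"
        err = probe(conn, f"SELECT {expr} FROM {table} AS {alias} LIMIT 0")
        if err is not None:
            findings.append(("missing_column", field, err))
    return findings


conn = sqlite3.connect(":memory:")
# Simulate drift: the table no longer has a `region` column.
conn.execute("CREATE TABLE orders (amount REAL, created_at TEXT)")

findings = check_cube(
    conn,
    table="orders",
    alias="o",
    fragments={"revenue": "{o}.amount", "region": "{o}.region"},
)
missing = check_cube(conn, table="customers", alias="c", fragments={})
for code, field, message in findings + missing:
    print(code, field, message)
```

Collecting findings in a list instead of raising on the first failure is what lets one run report every drifted field at once.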

What it doesn't catch

  • Semantic drift (a column exists but means something different now). Schema is necessary, not sufficient.
  • Cross-table referential integrity. The probes are LIMIT 0; they parse, they don't sample.
  • Dialect-specific feature drift (a function got dialect-renamed). Use the compiler's snapshot tests for that.

Why LIMIT 0?

Every probe runs SELECT … LIMIT 0. The query planner type-checks identifiers and predicates but does no row work, so the cost is microseconds per probe — fine for a per-cube fan-out in CI. The trade-off is that purely runtime drift (e.g. an enum value that got dropped from a check constraint) won't surface here.
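The trade-off can be seen with any SQL engine. The sketch below uses the standard-library `sqlite3` module purely as a convenient stand-in (the table, column, and status values are made up). It shows a valid probe returning zero rows from a populated table, a renamed column being rejected, and a value that a CHECK constraint no longer allows passing unnoticed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " amount REAL,"
    " status TEXT CHECK (status IN ('open', 'closed'))"
    ")"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(10.0, "open"), (25.5, "closed")]
)

# 1. A valid probe returns no rows even though the table is populated.
rows = conn.execute("SELECT o.amount FROM orders AS o LIMIT 0").fetchall()

# 2. A dropped or renamed column is rejected while the statement is prepared.
try:
    conn.execute("SELECT o.total FROM orders AS o LIMIT 0")
    column_error = None
except sqlite3.OperationalError as exc:
    column_error = str(exc)

# 3. A value the CHECK constraint does not allow is invisible to the probe:
#    the predicate is well-formed, so the statement still prepares cleanly.
try:
    conn.execute("SELECT 1 FROM orders AS o WHERE o.status = 'archived' LIMIT 0")
    enum_error = None
except sqlite3.OperationalError as exc:
    enum_error = str(exc)

print(rows, column_error, enum_error)
```

Only the second probe reports a problem. The third is the kind of runtime drift that needs a different check, such as sampling real rows.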

CLI

The package is library-first and does not ship a CLI of its own. The CLI belongs in the caller's deploy script, where the connection, DSN, and role are already known.
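A deploy-script wrapper can be very small. The sketch below is one possible shape, not something the package provides: `main`, the `--dsn` flag, and the injected `connect` / `validate` callables are all hypothetical, and the demo uses stand-ins so it runs without a database.

```python
import argparse
import sys


def main(argv, connect, validate):
    """Parse a DSN, run the validator, print findings, return an exit code."""
    parser = argparse.ArgumentParser(description="Catalog drift gate (sketch)")
    parser.add_argument("--dsn", required=True, help="database connection string")
    args = parser.parse_args(argv)

    conn = connect(args.dsn)
    errors = validate(conn)
    for e in errors:
        print(f"{e.code}: {e.cube}: {e.message}", file=sys.stderr)
    # A non-zero exit fails the deploy step when any drift was found.
    return 1 if errors else 0


class FakeError:
    """Stand-in finding with the attributes the loop above reads."""
    code, cube, message = "missing_table", "orders", "table not found"


clean_exit = main(
    ["--dsn", "demo"],
    connect=lambda dsn: object(),
    validate=lambda conn: [],
)
drift_exit = main(
    ["--dsn", "demo"],
    connect=lambda dsn: object(),
    validate=lambda conn: [FakeError()],
)
print(clean_exit, drift_exit)
```

In a real script, `connect` would be the driver's own connect function and `validate` would be something like `lambda conn: validate_against_db(catalog, connection=conn)`, with the result passed to `sys.exit`.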

Status

Phase A: probe-by-fragment shape. Drift findings are accurate; performance is "fine for CI, not for runtime gates."
