Skip to main content

Validate a semql Catalog against a live database — catches missing tables, dropped columns, and broken join predicates before a deploy.

Project description

semql-validate-db

Pre-deploy drift checker for semql catalogs. Runs cheap probe queries against a live database and surfaces the class of bugs the compiler can't see — missing tables, dropped columns, broken join predicates, base-predicate drift.

semql is intentionally pure (PHILOSOPHY: "the compiler has no I/O"). That keeps the compiler simple, but it also means a catalog can pass every compile-time check and still blow up at query time because upstream renamed a column. semql-validate-db is the out-of-band gate that catches it.

Use this for ongoing drift detection on a catalog you already authored. For greenfield scaffolding from a database's existing schema, see semql-introspect — it generates Cube stubs from the information schema, which is the opposite direction: introspect goes DB → catalog, validate-db goes catalog → DB.

Install

pip install semql-validate-db

The package is driver-agnostic. Bring your own DB-API 2.0 connection:

pip install psycopg              # Postgres
pip install clickhouse-connect   # ClickHouse
pip install duckdb               # DuckDB

Quick start

import duckdb
from semql import Dialect, Catalog, Cube, Dimension, Measure, TimeDimension
from semql_validate_db import validate_against_db

orders = Cube(
    name="orders",
    dialect=Dialect.DUCKDB,
    table="orders",
    alias="o",
    measures=[Measure(name="revenue", sql="{o}.amount", agg="sum")],
    dimensions=[Dimension(name="region", sql="{o}.region", type="string")],
    time_dimensions=[TimeDimension(name="created_at", sql="{o}.created_at")],
)
catalog = Catalog([orders])

conn = duckdb.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (amount DOUBLE, region TEXT, created_at TIMESTAMP)"
)

errors = validate_against_db(catalog, connection=conn)
for e in errors:
    print(f"{e.code}: {e.cube}.{e.field or ''}{e.message}")

A clean run returns an empty list. Drift (a missing column, a renamed table) yields one DbValidationError per finding so a single run gives the full picture instead of bailing on the first failure.

What it catches

  • missing_tablecube.table doesn't exist or the connection's role can't see it.
  • missing_column — a measure / dimension / time-dimension SQL fragment references a column that no longer exists.
  • base_predicate_invalidcube.base_predicate doesn't execute.
  • join_predicate_invalid — a Join.on predicate references columns that aren't there, or compares incompatible types.

(A required_filters entry that names no real dimension is now rejected at catalog construction — it can't reach this pre-deploy stage, so there's no DB-level check for it.)

What it doesn't catch

  • Semantic drift (a column exists but means something different now). Schema is necessary, not sufficient.
  • Cross-table referential integrity. The probes are LIMIT 0; they parse, they don't sample.
  • Dialect-specific feature drift (a function got dialect-renamed). Use the compiler's snapshot tests for that.

Why LIMIT 0?

Every probe runs SELECT … LIMIT 0. The query planner type-checks identifiers and predicates but does no row work, so the cost is microseconds per probe — fine for a per-cube fan-out in CI. The trade-off is that purely runtime drift (e.g. an enum value that got dropped from a check constraint) won't surface here.

CLI

The package is library-first; a CLI lives in callers' deploy scripts where the connection / DSN / role are already known.

Status

Phase A: probe-by-fragment shape. Drift findings are accurate; performance is "fine for CI, not for runtime gates."

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

semql_validate_db-0.3.0.tar.gz (7.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

semql_validate_db-0.3.0-py3-none-any.whl (9.1 kB view details)

Uploaded Python 3

File details

Details for the file semql_validate_db-0.3.0.tar.gz.

File metadata

  • Download URL: semql_validate_db-0.3.0.tar.gz
  • Upload date:
  • Size: 7.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for semql_validate_db-0.3.0.tar.gz
Algorithm Hash digest
SHA256 350fe29b5b7dc01f9472751f81ae627e9e2b3c13a4348e87f5d0b0d651c72150
MD5 b4b694a118d105ed7eac3366218cc959
BLAKE2b-256 a5f680e0c62ccf2680c416c0ac0a2253bc0670737850f0920e24c2a5bcf10685

See more details on using hashes here.

File details

Details for the file semql_validate_db-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: semql_validate_db-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 9.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for semql_validate_db-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3c298c4f67b2c05951829cbf431bc78f4cfd021d57c1bfa9fadb39dc3e49e018
MD5 b37461329c4de3ef6c80ba6580cc76a7
BLAKE2b-256 64f2cc8a122368ecd4c448034b7c8dd778fd95051abe08c381c64a5e87b5ec5a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page