Validate a semql Catalog against a live database — catches missing tables, dropped columns, and broken join predicates before a deploy.
semql-validate-db
Pre-deploy drift checker for semql catalogs. Runs cheap
probe queries against a live database and surfaces the class of bugs
the compiler can't see — missing tables, dropped columns, broken join
predicates, base-predicate drift.
semql is intentionally pure (PHILOSOPHY: "the compiler has no I/O").
That keeps the compiler simple, but it also means a catalog can pass
every compile-time check and still blow up at query time because
upstream renamed a column. semql-validate-db is the out-of-band
gate that catches it.
Install
pip install semql-validate-db
The package is driver-agnostic. Bring your own DB-API 2.0 connection:
pip install psycopg # Postgres
pip install clickhouse-connect # ClickHouse
pip install duckdb # DuckDB
Quick start
import duckdb

from semql import Backend, Catalog, Cube, Dimension, Measure, TimeDimension
from semql_validate_db import validate_against_db

# One cube over the orders table; {o} refers to the cube's alias.
orders = Cube(
    name="orders",
    backend=Backend.DUCKDB,
    table="orders",
    alias="o",
    measures=[Measure(name="revenue", sql="{o}.amount", agg="sum")],
    dimensions=[Dimension(name="region", sql="{o}.region", type="string")],
    time_dimensions=[TimeDimension(name="created_at", sql="{o}.created_at")],
)
catalog = Catalog([orders])

# In-memory DuckDB with a table that matches the cube definition.
conn = duckdb.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (amount DOUBLE, region TEXT, created_at TIMESTAMP)"
)

# Collect every finding in one pass and print them.
errors = validate_against_db(catalog, connection=conn)
for e in errors:
    print(f"{e.code}: {e.cube}.{e.field or ''} — {e.message}")
A clean run returns an empty list. Drift (a missing column, a renamed
table) yields one DbValidationError per finding so a single run
gives the full picture instead of bailing on the first failure.
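Because every finding comes back in one list, a caller can group the findings before printing them. The sketch below is only an illustration: `Finding` is a hypothetical stand-in that carries the same four attributes the loop above reads (`code`, `cube`, `field`, `message`), not the package's DbValidationError class, and the sample findings are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """Hypothetical stand-in for DbValidationError (same attribute names)."""
    code: str
    cube: str
    field: Optional[str]
    message: str


def group_by_cube(findings):
    """Return {cube name: [findings]}, keeping the order findings arrived in."""
    grouped = defaultdict(list)
    for f in findings:
        grouped[f.cube].append(f)
    return dict(grouped)


# Made-up findings, for illustration only.
findings = [
    Finding("missing_column", "orders", "revenue", "column amount not found"),
    Finding("missing_table", "customers", None, "table customers not found"),
    Finding("missing_column", "orders", "region", "column region not found"),
]

for cube, items in group_by_cube(findings).items():
    print(f"{cube}: {len(items)} finding(s)")
    for f in items:
        print(f"  {f.code} {f.field or '-'}: {f.message}")
```

Grouping by cube keeps a long report readable when one renamed table produces several findings at once.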
What it catches
- missing_table: cube.table doesn't exist or the connection's role can't see it.
- missing_column: a measure / dimension / time-dimension SQL fragment references a column that no longer exists.
- base_predicate_invalid: cube.base_predicate doesn't execute.
- join_predicate_invalid: a Join.on predicate references columns that aren't there, or compares incompatible types.
- required_filter_dimension_missing: static catalog check; the named required_filters entry has no matching Dimension.
What it doesn't catch
- Semantic drift (a column exists but means something different now). Schema is necessary, not sufficient.
- Cross-table referential integrity. The probes are LIMIT 0; they parse, they don't sample.
- Backend-specific feature drift (a function got dialect-renamed). Use the compiler's snapshot tests for that.
Why LIMIT 0?
Every probe runs SELECT … LIMIT 0. The query planner type-checks
identifiers and predicates but does no row work, so the cost is
microseconds per probe — fine for a per-cube fan-out in CI. The
trade-off is that purely runtime drift (e.g. an enum value that
got dropped from a check constraint) won't surface here.
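The idea can be tried with nothing but the standard library. The sketch below is not the package's implementation: `probe` is a hypothetical helper, and SQLite stands in for the real backends only because it ships with Python. SQLite is loosely typed, so this shows identifier resolution (missing tables and columns) and not type checking of predicates.

```python
import sqlite3


def probe(conn, sql_fragment, table):
    """Run SELECT <fragment> FROM <table> LIMIT 0.

    Returns None if the statement is accepted, else the driver's error text.
    The fragment is interpolated as-is, so it must come from a trusted
    catalog, never from user input.
    """
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT {sql_fragment} FROM {table} LIMIT 0")
        cur.fetchall()  # always empty: LIMIT 0 returns no rows
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        cur.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL, region TEXT, created_at TEXT)")
conn.execute("INSERT INTO orders VALUES (10.0, 'emea', '2024-01-01')")

print(probe(conn, "o.amount", "orders AS o"))       # column exists
print(probe(conn, "o.discount", "orders AS o"))     # column was dropped
print(probe(conn, "o.amount", "order_items AS o"))  # table is missing
```

The first probe is accepted even though the table holds a row, because no rows are read; the other two fail when the statement is prepared, which is the behaviour the probes rely on.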
CLI
The package is library-first and does not ship a CLI of its own. Call it from your deploy scripts, where the connection, DSN, and role are already known.
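A deploy script mostly needs to turn the list of findings into an exit code. The following is a minimal sketch under stated assumptions: `gate` is a hypothetical helper, not part of the package, and the sample finding is a stand-in object with the attributes used in the Quick start loop.

```python
import sys
from types import SimpleNamespace


def gate(errors, out=sys.stderr):
    """Print each finding and return an exit code: 0 if clean, 1 on drift."""
    for e in errors:
        print(f"{e.code}: {e.cube}.{e.field or ''}: {e.message}", file=out)
    return 1 if errors else 0


# Stand-in finding for illustration. In a real script, `errors` would be the
# list returned by validate_against_db(catalog, connection=conn).
sample = SimpleNamespace(
    code="missing_column",
    cube="orders",
    field="revenue",
    message="column amount not found",
)

print(gate([]))        # clean run
print(gate([sample]))  # one finding
```

A deploy script would typically end with `sys.exit(gate(errors))` so that CI fails the step when any drift is found.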
Status
Phase A: probe-by-fragment shape. Drift findings are accurate; performance is "fine for CI, not for runtime gates."