Semantic data layer: SemanticQuery → backend SQL with authorisation, row-level scoping, time-spine fill, and a typed four-role LLM prompt pipeline.

semql

Pure-Python compiler from a semantic spec to backend SQL. Define cubes (dimensions, measures, time-dimensions, joins) once; emit correct, parameterised SQL for Postgres, ClickHouse, DuckDB, Snowflake, BigQuery, and the analytics engines Redshift, Trino and Databricks.

SQL Server, MySQL and Oracle ship as experimental, opt-in dialects. sqlglot transpiles their date_trunc / percentile to best-effort forms that aren't exercised in CI, so enable them deliberately by passing experimental_dialects() through the compiler's dialects= override, and verify the SQL on a live instance. (Gap-filling time-spines, fill_nulls, are not yet implemented for any of the six new dialects.)

semql does no I/O: catalogs are Python data; the compiler returns SQL + bound params; running the SQL is the caller's job. Sibling packages add LLM-planner prompt fragments (semql-prompt), MCP exposure (semql-mcp) and ER diagrams (semql-erd).
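Because the compiler hands back only SQL text and bound parameters, execution can be sketched with any DB-API driver. The example below is a minimal sketch using the standard library's sqlite3. CompiledStandIn is a hypothetical stand-in for semql's compiled object, and the hand-written SQL and the named :min_amount parameter style are assumptions for illustration, not semql output.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CompiledStandIn:
    """Hypothetical stand-in for a compiled query: SQL text plus bound params."""

    sql: str
    params: dict = field(default_factory=dict)


def run(conn: sqlite3.Connection, compiled: CompiledStandIn) -> list[tuple]:
    # The caller owns the connection; the compiled object is plain data.
    return conn.execute(compiled.sql, compiled.params).fetchall()


# Set up a throwaway in-memory database to execute against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("emea", 10.0), ("emea", 5.0), ("apac", 7.0)],
)

# Hand-written SQL standing in for compiler output (assumed shape).
compiled = CompiledStandIn(
    sql=(
        "SELECT o.region AS region, SUM(o.amount) AS revenue "
        "FROM orders AS o WHERE o.amount >= :min_amount "
        "GROUP BY o.region ORDER BY o.region"
    ),
    params={"min_amount": 6.0},
)
rows = run(conn, compiled)
```

Keeping values in params rather than in the SQL string means the driver, not string formatting, is responsible for quoting them.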

Install

pip install semql

Quick start

from semql import (
    Dialect,
    Catalog,
    Cube,
    Dimension,
    Measure,
    SemanticQuery,
)

orders = Cube(
    name="orders",
    dialect=Dialect.POSTGRES,
    table="orders",
    alias="o",
    measures=[
        Measure(name="revenue", sql="{o}.amount", agg="sum", unit="currency"),
    ],
    dimensions=[
        Dimension(name="region", sql="{o}.region", type="string"),
    ],
)

catalog = Catalog([orders])
compiled = catalog.compile(
    SemanticQuery(measures=["orders.revenue"], dimensions=["orders.region"]),
)
# compiled.sql, compiled.params, compiled.columns, compiled.dialect

The {o} placeholder in a cube's sql is its alias; the compiler resolves it (along with {schema}-style context placeholders and {ctx.X} row-level-security placeholders) at compile time.
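The split between inlined placeholders and bound ones can be illustrated with a small standalone resolver. This is a sketch of the idea only, not semql's implementation: the resolve function, the names / ctx arguments and the :ctx_X parameter naming are all hypothetical.

```python
import re

# Matches {name} or {ctx.name}; group 1 is "ctx." when present.
_PLACEHOLDER = re.compile(r"\{(ctx\.)?([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(
    fragment: str,
    names: dict[str, str],
    ctx: dict[str, object],
) -> tuple[str, dict[str, object]]:
    """Inline {name} from `names`; turn each {ctx.X} into a bound parameter."""
    params: dict[str, object] = {}

    def substitute(match: re.Match) -> str:
        is_ctx, key = match.group(1), match.group(2)
        if is_ctx:
            if key not in ctx:
                raise KeyError(f"missing context value: {key}")
            # The value travels in params; only a marker enters the SQL text.
            params[f"ctx_{key}"] = ctx[key]
            return f":ctx_{key}"
        if key not in names:
            raise KeyError(f"unknown placeholder: {key}")
        # Aliases and schema names are identifiers, so they are inlined.
        return names[key]

    return _PLACEHOLDER.sub(substitute, fragment), params


sql, params = resolve(
    "{o}.amount > 0 AND {o}.owner_id = {ctx.user_id}",
    names={"o": "o"},
    ctx={"user_id": 42},
)
```

Here sql contains o.owner_id = :ctx_user_id and the value 42 appears only in params. A production resolver would also validate or quote the inlined identifiers.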

What lives in the box

Surface | Module
Cube / Measure / Dimension / TimeDimension / Join | semql.model
SemanticQuery / Filter / TimeWindow / CompareWindow | semql.spec
Catalog wrapper (validation, compile entry) | semql.catalog
Compiler (sqlglot AST → dialect SQL) | semql.compile
Collect-all static validator | semql.validate
Reflection cubes (catalog_cubes, ...) | semql.introspect
Planner / router prompt fragments | semql-prompt (sibling package)
Dialect strategies + sqlglot dialect adapter | semql.backend, semql.dialect
Visualisation decision (chart type, axes, formats) | semql.visualize
is_read_only_statement post-hoc SQL guard | semql.safe
Structured error hierarchy | semql.errors

Features

  • Compare windows: CompareWindow(mode="previous_period") wraps the inner query in current / prior CTEs joined via FULL OUTER JOIN and emits {m}_current / {m}_prior / {m}_delta / {m}_pct_change columns per measure.
  • Temporal model: time dimensions group by second / minute / hour / day / week / month / quarter / year; a type="date" time dimension drops sub-day grain and timezone shifts; per-cube timezone makes date_trunc tenant-correct and transpiles per dialect (AT TIME ZONE, CONVERT_TIMEZONE, ClickHouse's native arg, …); per-cube week_start (monday default, or sunday) sets the week bucket boundary consistently across dialects.
  • Explicit raw SQL: every hand-written fragment (Measure.sql, Join.on, Cube.base_predicate, …) is wrapped in a RawSQL marker at validation, so when raw SQL is used, the model says so.
  • Tenancy: per-cube NONE (default; honestly unscoped), SCHEMA ({tenant_schema} substituted from the identity's tenant, required when declared) or DISCRIMINATOR (compiler wraps the source in a subquery with one bound WHERE predicate per tenancy_columns entry, so composite tenant keys need no workaround). tenant is first-class on AuthContext; Catalog(strict_tenancy=True) rejects any cube left with no isolation, scope, or required_roles.
  • Row-level security: Cube.security_sql AND-composes with tenancy inside the isolation subquery; {ctx.X} placeholders bind as parameters, never inline as literals.
  • MCP-ready: build_planner_prompt_fragment(catalog.as_dict()) (in semql-prompt) produces the planner system-prompt fragment; semql-mcp wraps it as a server.
  • Pluggable backends: DialectStrategy Protocol lets out-of-tree Snowflake / BigQuery adapters slot in without forking the compiler.
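To make the compare-window columns concrete, the following sketch reproduces the shape of the result in plain Python: a full outer join of current and prior rows on the grouping key, then a delta and a relative change per row. It is an illustration only. The compare helper is hypothetical, and two details are assumptions rather than documented semql behaviour: pct_change is expressed as a fraction, and it is left as None when the prior value is missing or zero.

```python
from typing import Optional


def compare(
    current: dict[str, float],
    prior: dict[str, float],
    measure: str,
) -> list[dict[str, object]]:
    """Full-outer-join two period results on their key and derive delta columns."""
    rows: list[dict[str, object]] = []
    # Union of keys mirrors FULL OUTER JOIN: a key in either period yields a row.
    for key in sorted(current.keys() | prior.keys()):
        cur: Optional[float] = current.get(key)
        pri: Optional[float] = prior.get(key)
        delta = cur - pri if cur is not None and pri is not None else None
        # Assumption: no relative change when prior is missing or zero.
        pct = delta / pri if delta is not None and pri != 0 else None
        rows.append(
            {
                "key": key,
                f"{measure}_current": cur,
                f"{measure}_prior": pri,
                f"{measure}_delta": delta,
                f"{measure}_pct_change": pct,
            }
        )
    return rows


result = compare(
    current={"emea": 120.0, "apac": 50.0},
    prior={"emea": 100.0, "latam": 30.0},
    measure="revenue",
)
```

A region present in only one period still appears in result, with the missing side and both derived columns set to None.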

Philosophy

See PHILOSOPHY.md at the repo root. Highlights:

  • The emitted SQL must be readable by the engineer debugging a production incident at 2am.
  • Compile errors beat runtime errors; runtime errors beat wrong results.
  • compile() fails at the first problem; validate() collects them all.
  • Catalogs are data; the META cubes expose the catalog through the same compiler path a normal query takes.
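The fail-fast versus collect-all split can be sketched with a single lazy checker shared by both entry points. Everything here is hypothetical scaffolding (MiniQuery, SpecError, KNOWN, compile_ and its placeholder return value); it shows the pattern, not semql's validator.

```python
from dataclasses import dataclass
from typing import Iterator


class SpecError(Exception):
    """Hypothetical error type for this sketch."""


@dataclass(frozen=True)
class MiniQuery:
    measures: list[str]
    dimensions: list[str]


# Members the toy catalog knows about.
KNOWN = {"orders.revenue", "orders.region"}


def problems(query: MiniQuery) -> Iterator[str]:
    """Yield one message per unknown member, lazily and in query order."""
    for ref in [*query.measures, *query.dimensions]:
        if ref not in KNOWN:
            yield f"unknown member: {ref}"


def validate(query: MiniQuery) -> list[str]:
    # Collect-all: drain the generator so the caller sees every problem.
    return list(problems(query))


def compile_(query: MiniQuery) -> str:
    # Fail-fast: stop at the first problem and raise it.
    first = next(problems(query), None)
    if first is not None:
        raise SpecError(first)
    return "compiled"  # placeholder; a real compiler would emit SQL here


bad = MiniQuery(measures=["orders.revenue", "orders.margin"], dimensions=["orders.city"])
all_problems = validate(bad)
```

Sharing one generator keeps the two paths from drifting apart: compile_ reports the first entry of what validate would list in full.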

Status

Pre-v1. The shape is stable, but minor names / fields may move before the v1 contract locks. Tests pin every public behaviour the README documents.
