
Semantic data layer: SemanticQuery → backend SQL with authorisation, row-level scoping, time-spine fill, and a typed four-role LLM prompt pipeline.


semql

Pure-Python compiler from a semantic spec to backend SQL. Define cubes (dimensions, measures, time-dimensions, joins) once; emit correct, parameterised SQL for Postgres, ClickHouse, DuckDB, Snowflake, BigQuery, and the analytics engines Redshift, Trino and Databricks.

SQL Server, MySQL and Oracle ship as experimental, opt-in dialects: sqlglot transpiles their date_trunc / percentile to best-effort forms that aren't exercised in CI. Enable them deliberately by passing experimental_dialects() through the compiler's dialects= override, and verify the SQL on a live instance. (Gap-filling time-spines, i.e. fill_nulls, are not yet implemented for any of the six new dialects: Redshift, Trino, Databricks, SQL Server, MySQL and Oracle.)

semql does no I/O: catalogs are Python data; the compiler returns SQL + bound params; running the SQL is the caller's job. Sibling packages add LLM-planner prompt fragments (semql-prompt), MCP exposure (semql-mcp) and ER diagrams (semql-erd).
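Since the compiler hands back only SQL text and bound parameters, execution can be wired to whichever driver the caller already uses. The following is a minimal sketch of that split, not semql code: `CompiledStub` is a hypothetical stand-in for a compiled result, and the `?` placeholder style, the hand-written SQL and the in-memory SQLite database are assumptions chosen only to keep the example self-contained.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CompiledStub:
    """Hypothetical stand-in for a compiled query: SQL text plus bound params."""
    sql: str
    params: tuple = ()


def run(conn: sqlite3.Connection, compiled: CompiledStub) -> list:
    """The caller owns the connection; the compile step never touches it."""
    return conn.execute(compiled.sql, compiled.params).fetchall()


# Caller-side setup: an in-memory database with a tiny orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("emea", 10.0), ("emea", 5.0), ("apac", 7.5)],
)

# Hand-written here for illustration; in practice this pair would come
# from the compile step rather than being typed out.
compiled = CompiledStub(
    sql=(
        "SELECT o.region AS region, SUM(o.amount) AS revenue "
        "FROM orders AS o WHERE o.amount >= ? "
        "GROUP BY o.region ORDER BY o.region"
    ),
    params=(5.0,),
)
rows = run(conn, compiled)
```

Keeping the value `5.0` in `params` rather than in the SQL string is what lets the driver, not string formatting, handle quoting.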

Install

pip install semql

Quick start

from semql import (
    Dialect,
    Catalog,
    Cube,
    Dimension,
    Measure,
    SemanticQuery,
)

orders = Cube(
    name="orders",
    dialect=Dialect.POSTGRES,
    table="orders",
    alias="o",
    measures=[
        Measure(name="revenue", sql="{o}.amount", agg="sum", unit="currency"),
    ],
    dimensions=[
        Dimension(name="region", sql="{o}.region", type="string"),
    ],
)

catalog = Catalog([orders])
compiled = catalog.compile(
    SemanticQuery(measures=["orders.revenue"], dimensions=["orders.region"]),
)
# compiled.sql, compiled.params, compiled.columns, compiled.dialect

The {o} placeholder in a cube's sql is its alias; the compiler resolves it (along with {schema}-style context placeholders and {ctx.X} row-level-security placeholders) at compile time.
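To make the three placeholder kinds concrete, here is a rough sketch of how such a resolution step could work. It is an illustration under assumptions, not semql's implementation: the `resolve` function, the `?` parameter marker and the sample fragment are all hypothetical. The one behaviour taken from this README is that `{ctx.X}` values are bound as parameters instead of being spliced into the SQL text.

```python
import re

# Matches {name} and {dotted.name} placeholders.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_.]*)\}")


def resolve(fragment: str, alias: str, context: dict, ctx: dict):
    """Return (sql, params) for one raw SQL fragment.

    The alias and context names are substituted as identifiers;
    {ctx.X} values are appended to params and replaced by a marker.
    """
    params = []

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name == alias:
            return alias
        if name.startswith("ctx."):
            params.append(ctx[name[len("ctx."):]])
            return "?"
        return context[name]  # raises KeyError for an unknown placeholder

    return _PLACEHOLDER.sub(substitute, fragment), params


sql, params = resolve(
    "{schema}.orders AS {o} WHERE {o}.owner_id = {ctx.user_id}",
    alias="o",
    context={"schema": "analytics"},
    ctx={"user_id": 42},
)
```

An unknown placeholder fails loudly here instead of passing through untouched, which matches the README's stated preference for compile errors over wrong results.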

What lives in the box

| Surface | Module |
| --- | --- |
| Cube / Measure / Dimension / TimeDimension / Join | semql.model |
| SemanticQuery / Filter / TimeWindow / CompareWindow | semql.spec |
| Catalog wrapper (validation, compile entry) | semql.catalog |
| Compiler (sqlglot AST → dialect SQL) | semql.compile |
| Collect-all static validator | semql.validate |
| Reflection cubes (catalog_cubes, ...) | semql.introspect |
| Planner / router prompt fragments | semql-prompt (sibling package) |
| Dialect strategies + sqlglot dialect adapter | semql.backend, semql.dialect |
| Visualisation decision (chart type, axes, formats) | semql.visualize |
| is_read_only_statement post-hoc SQL guard | semql.safe |
| Structured error hierarchy | semql.errors |

Features

  • Compare windows: CompareWindow(mode="previous_period") wraps the inner query in current / prior CTEs joined via FULL OUTER JOIN and emits {m}_current / {m}_prior / {m}_delta / {m}_pct_change columns per measure.
  • Temporal model: time dimensions group by second / minute / hour / day / week / month / quarter / year; a type="date" time dimension drops sub-day grain and timezone shifts; per-cube timezone makes date_trunc tenant-correct and transpiles per dialect (AT TIME ZONE, CONVERT_TIMEZONE, ClickHouse's native arg, …); per-cube week_start (monday default, or sunday) sets the week bucket boundary consistently across dialects.
  • Explicit raw SQL: every hand-written fragment (Measure.sql, Join.on, Cube.base_predicate, …) is wrapped in a RawSQL marker at validation: when raw SQL is used, the model says so.
  • Tenancy: per-cube NONE (default; honestly unscoped), SCHEMA ({tenant_schema} substituted from the identity's tenant, required when declared) or DISCRIMINATOR (compiler wraps the source in a subquery with one bound WHERE predicate per tenancy_columns entry, so composite tenant keys need no workaround). tenant is first-class on AuthContext; Catalog(strict_tenancy=True) rejects any cube left with no isolation, scope, or required_roles.
  • Row-level security: Cube.security_sql AND-composes with tenancy inside the isolation subquery; {ctx.X} placeholders bind as parameters, never inline as literals.
  • MCP-ready: build_planner_prompt_fragment(catalog.as_dict()) (in semql-prompt) produces the planner system-prompt fragment; semql-mcp wraps it as a server.
  • Pluggable backends: the DialectStrategy Protocol lets out-of-tree Snowflake / BigQuery adapters slot in without forking the compiler.
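The DISCRIMINATOR tenancy and row-level security items describe the same shape: the source table is wrapped in a subquery whose WHERE clause carries one bound predicate per tenant column, AND-composed with the security predicate. The sketch below imitates that shape by hand so the effect can be checked against SQLite. The `isolate` helper, the column names and the `?` marker are hypothetical, and the SQL semql actually emits may look different.

```python
import sqlite3


def isolate(table, alias, tenancy_columns, tenant,
            security_sql=None, security_params=()):
    """Build an isolation subquery plus its bound parameters.

    One `col = ?` predicate is emitted per tenancy column (so composite
    keys work), then the optional security predicate is AND-ed on.
    """
    predicates = [f"{col} = ?" for col in tenancy_columns]
    params = [tenant[col] for col in tenancy_columns]
    if security_sql:
        predicates.append(f"({security_sql})")
        params.extend(security_params)
    where = " AND ".join(predicates)
    return f"(SELECT * FROM {table} WHERE {where}) AS {alias}", params


source, params = isolate(
    "orders", "o",
    tenancy_columns=["org_id", "workspace_id"],   # composite tenant key
    tenant={"org_id": 1, "workspace_id": 10},
    security_sql="owner_id = ?",                  # row-level predicate
    security_params=(7,),
)
sql = f"SELECT SUM(o.amount) FROM {source}"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (org_id INT, workspace_id INT, owner_id INT, amount REAL)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, 10, 7, 100.0),   # visible: right tenant, right owner
    (1, 10, 8, 50.0),    # same tenant, different owner
    (1, 11, 7, 25.0),    # same org, different workspace
    (2, 10, 7, 999.0),   # different org
])
total = conn.execute(sql, params).fetchone()[0]
```

Because the filtering happens inside the subquery, any outer aggregate or join only ever sees rows that already passed both the tenant and the security predicates.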

Philosophy

See PHILOSOPHY.md at the repo root. Highlights:

  • The emitted SQL must be readable by the engineer debugging a production incident at 2am.
  • Compile errors beat runtime errors; runtime errors beat wrong results.
  • compile() fails at the first problem; validate() collects them all.
  • Catalogs are data; the META cubes expose the catalog through the same compiler path a normal query takes.
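The fail-fast versus collect-all distinction in the third highlight can be shown with a small sketch. Everything in it is hypothetical (`SpecError`, `check_measures`, `compile_strict`, `validate_all`); it only demonstrates the pattern of driving both behaviours from one shared check and does not reflect semql's internals.

```python
class SpecError(ValueError):
    """Hypothetical error type for this sketch."""


def check_measures(known, requested):
    """Yield one SpecError per unknown measure, in request order."""
    for name in requested:
        if name not in known:
            yield SpecError(f"unknown measure: {name}")


def compile_strict(known, requested):
    """Fail fast: raise the first problem found and stop."""
    for err in check_measures(known, requested):
        raise err


def validate_all(known, requested):
    """Collect-all: return every problem so they can be fixed in one pass."""
    return list(check_measures(known, requested))


known = {"orders.revenue"}
requested = ["orders.revenue", "orders.margin", "orders.cost"]

errors = validate_all(known, requested)      # both problems reported
try:
    compile_strict(known, requested)
    first = None
except SpecError as exc:
    first = str(exc)                         # only the first one surfaces
```

Sharing one generator between the two entry points keeps them from drifting apart: anything the strict path would reject also shows up in the collected list.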

Status

Pre-v1. The shape is stable, but minor names / fields may move before the v1 contract locks. Tests pin every public behaviour the README documents.
