Semantic data layer: SemanticQuery → backend SQL with authorisation, row-level scoping, time-spine fill, and a typed four-role LLM prompt pipeline.

semql

Pure-Python compiler from a semantic spec to backend SQL. Define cubes (dimensions, measures, time-dimensions, joins) once; emit correct, parameterised SQL for Postgres, ClickHouse, DuckDB, Snowflake, BigQuery, and the analytics engines Redshift, Trino and Databricks.

SQL Server, MySQL and Oracle ship as experimental, opt-in dialects. sqlglot transpiles their date_trunc / percentile to best-effort forms that aren't exercised in CI, so enable them deliberately by passing experimental_dialects() through the compiler's dialects= override, and verify the SQL on a live instance. Gap-filling time-spines (fill_nulls) are not yet implemented for any of the six newer dialects: Redshift, Trino, Databricks, SQL Server, MySQL and Oracle.

semql does no I/O: catalogs are Python data; the compiler returns SQL + bound params; running the SQL is the caller's job. Sibling packages add LLM-planner prompt fragments (semql-prompt), MCP exposure (semql-mcp) and ER diagrams (semql-erd).
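Because the compiler stops at SQL plus bound parameters, executing the statement is ordinary DB-API work on the caller's side. The sketch below illustrates that hand-off with the standard-library sqlite3 driver. The CompiledStub class, its SQL text, and the :name parameter style are assumptions made for illustration; they are not semql's actual output or API.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class CompiledStub:
    """Hypothetical stand-in for a compiled query: SQL text plus bound params."""
    sql: str
    params: dict = field(default_factory=dict)


def run(conn: sqlite3.Connection, compiled: CompiledStub) -> list:
    # The driver binds the params; values are never spliced into the SQL string.
    return conn.execute(compiled.sql, compiled.params).fetchall()


# Throwaway in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("emea", 10.0), ("emea", 5.0), ("apac", 7.0)],
)

# Hand-written SQL in the shape a semantic compiler might emit (assumed).
compiled = CompiledStub(
    sql=(
        "SELECT o.region AS region, SUM(o.amount) AS revenue "
        "FROM orders AS o WHERE o.region <> :excluded "
        "GROUP BY o.region ORDER BY o.region"
    ),
    params={"excluded": "latam"},
)
rows = run(conn, compiled)
```

Keeping execution outside the compiler means connection pooling, retries and timeouts stay under the caller's control.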

Install

pip install semql

Quick start

from semql import (
    Dialect,
    Catalog,
    Cube,
    Dimension,
    Measure,
    SemanticQuery,
)

orders = Cube(
    name="orders",
    dialect=Dialect.POSTGRES,
    table="orders",
    alias="o",
    measures=[
        Measure(name="revenue", sql="{o}.amount", agg="sum", unit="currency"),
    ],
    dimensions=[
        Dimension(name="region", sql="{o}.region", type="string"),
    ],
)

catalog = Catalog([orders])
compiled = catalog.compile(
    SemanticQuery(measures=["orders.revenue"], dimensions=["orders.region"]),
)
# compiled.sql, compiled.params, compiled.columns, compiled.dialect

The {o} placeholder in a cube's sql is its alias; the compiler resolves it (along with {schema}-style context placeholders and {ctx.X} row-level-security placeholders) at compile time.
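To make the distinction between the placeholder kinds concrete, here is a minimal sketch of one way such resolution could work: alias and context placeholders become SQL text, while {ctx.X} placeholders become bind parameters. The resolve function, its argument names, and the :ctx_X parameter naming are hypothetical and are not taken from semql's source.

```python
import re

# Matches {o}, {schema}, {ctx.user_id} and similar dotted names.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")


def resolve(fragment: str, alias_map: dict, context: dict, ctx_values: dict):
    """Resolve placeholders in one raw SQL fragment.

    Returns (sql, params). Alias and context placeholders are replaced with
    SQL text; {ctx.X} placeholders are replaced with named bind parameters.
    """
    params = {}

    def substitute(match):
        name = match.group(1)
        if name.startswith("ctx."):
            key = name[len("ctx."):]
            if key not in ctx_values:
                raise KeyError(f"missing ctx value: {key}")
            param = f"ctx_{key}"
            params[param] = ctx_values[key]
            return f":{param}"  # bound as a parameter, never inlined
        if name in alias_map:
            return alias_map[name]
        if name in context:
            return context[name]
        raise KeyError(f"unknown placeholder: {name}")

    return PLACEHOLDER.sub(substitute, fragment), params


shared = dict(alias_map={"o": "o"}, context={"schema": "analytics"})
sql, params = resolve(
    "{o}.owner_id = {ctx.user_id}", ctx_values={"user_id": 42}, **shared
)
table, _ = resolve("{schema}.orders", ctx_values={}, **shared)
```

In this sketch the context values are spliced in as text, so they would have to come from trusted configuration; only the {ctx.X} values travel as parameters.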

What lives in the box

Surface                                              Module
---------------------------------------------------  ------------------------------
Cube / Measure / Dimension / TimeDimension / Join    semql.model
SemanticQuery / Filter / TimeWindow / CompareWindow  semql.spec
Catalog wrapper (validation, compile entry)          semql.catalog
Compiler (sqlglot AST → dialect SQL)                 semql.compile
Collect-all static validator                         semql.validate
Reflection cubes (catalog_cubes, ...)                semql.introspect
Planner / router prompt fragments                    semql-prompt (sibling package)
Dialect strategies + sqlglot dialect adapter         semql.backend, semql.dialect
Visualisation decision (chart type, axes, formats)   semql.visualize
is_read_only_statement post-hoc SQL guard            semql.safe
Structured error hierarchy                           semql.errors

Features

  • Compare windows: CompareWindow(mode="previous_period") wraps the inner query in current / prior CTEs joined via FULL OUTER JOIN and emits {m}_current / {m}_prior / {m}_delta / {m}_pct_change columns per measure.
  • Temporal model: time dimensions group by second / minute / hour / day / week / month / quarter / year; a type="date" time dimension drops sub-day grain and timezone shifts; per-cube timezone makes date_trunc tenant-correct and transpiles per dialect (AT TIME ZONE, CONVERT_TIMEZONE, ClickHouse's native arg, …); per-cube week_start (monday default, or sunday) sets the week bucket boundary consistently across dialects.
  • Explicit raw SQL: every hand-written fragment (Measure.sql, Join.on, Cube.base_predicate, …) is wrapped in a RawSQL marker at validation, so when raw SQL is used, the model says so.
  • Tenancy: per-cube NONE (default; honestly unscoped), SCHEMA ({tenant_schema} substituted from the identity's tenant, required when declared) or DISCRIMINATOR (compiler wraps the source in a subquery with one bound WHERE predicate per tenancy_columns entry, so composite tenant keys need no workaround). tenant is first-class on AuthContext; Catalog(strict_tenancy=True) rejects any cube left with no isolation, scope, or required_roles.
  • Row-level security: Cube.security_sql AND-composes with tenancy inside the isolation subquery; {ctx.X} placeholders bind as parameters, never inline as literals.
  • MCP-ready: build_planner_prompt_fragment(catalog.as_dict()) (in semql-prompt) produces the planner system-prompt fragment; semql-mcp wraps it as a server.
  • Pluggable backends: DialectStrategy Protocol lets out-of-tree Snowflake / BigQuery adapters slot in without forking the compiler.
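The compare-window columns are easier to picture with numbers. The sketch below emulates, in plain Python, a FULL OUTER JOIN of a current and a prior result set keyed by group, and derives the four per-measure columns. Two details are assumptions rather than documented semql behaviour: delta is left empty when either side is missing, and pct_change is a fraction that is left empty when the prior value is missing or zero.

```python
def compare_rows(current: dict, prior: dict, measure: str) -> list:
    """Emulate a FULL OUTER JOIN of two {group_key: value} result sets."""
    rows = []
    for key in sorted(current.keys() | prior.keys()):
        cur = current.get(key)  # None when the group exists only in prior
        prv = prior.get(key)    # None when the group exists only in current
        both = cur is not None and prv is not None
        # Assumption: delta is NULL unless both periods have a value.
        delta = cur - prv if both else None
        # Assumption: pct_change is NULL when prior is missing or zero.
        pct = delta / prv if both and prv != 0 else None
        rows.append({
            "group": key,
            f"{measure}_current": cur,
            f"{measure}_prior": prv,
            f"{measure}_delta": delta,
            f"{measure}_pct_change": pct,
        })
    return rows


rows = compare_rows(
    current={"emea": 120.0, "apac": 50.0},
    prior={"emea": 100.0, "latam": 30.0},
    measure="revenue",
)
```

The full outer join matters because a group present in only one period (apac or latam here) still gets a row instead of silently disappearing.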

Philosophy

See PHILOSOPHY.md at the repo root. Highlights:

  • The emitted SQL must be readable by the engineer debugging a production incident at 2am.
  • Compile errors beat runtime errors; runtime errors beat wrong results.
  • compile() fails at the first problem; validate() collects them all.
  • Catalogs are data; the META cubes expose the catalog through the same compiler path a normal query takes.
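The fail-fast versus collect-all split can be sketched as two consumers of one shared check generator. This is a generic illustration of the pattern, not semql's implementation; SpecError, check_spec, and the dict-shaped query are hypothetical names chosen for the example.

```python
class SpecError(Exception):
    """Hypothetical error type for this sketch."""


def check_spec(query: dict, known_members: set):
    """Yield one error message per problem, in a stable order."""
    for kind in ("measures", "dimensions"):
        for ref in query.get(kind, []):
            if ref not in known_members:
                yield f"unknown {kind[:-1]}: {ref}"


def validate(query: dict, known_members: set) -> list:
    # Collect-all: drain the generator and report every problem.
    return list(check_spec(query, known_members))


def compile_or_raise(query: dict, known_members: set) -> None:
    # Fail-fast: raise on the first problem the same checks produce.
    for message in check_spec(query, known_members):
        raise SpecError(message)


known = {"orders.revenue", "orders.region"}
bad = {"measures": ["orders.revnue"], "dimensions": ["orders.regoin"]}
problems = validate(bad, known)
```

Sharing one generator keeps the two entry points from drifting apart: anything the validator would report is also something the fail-fast path would stop on.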

Status

Pre-v1. The shape is stable, but minor names / fields may move before the v1 contract locks. Tests pin every public behaviour the README documents.
