OrionBelt Semantic Layer - Compiles and executes YAML semantic models as analytical SQL

OrionBelt Semantic Layer

Compile and execute YAML semantic models as analytical SQL across multiple database dialects

Version 2.6.0 | Python 3.12+ | License: BSL 1.1 | Built with FastAPI, Pydantic v2, and Gradio

OrionBelt Semantic Layer is an API-first semantic engine and query planner for AI agents that compiles and executes declarative YAML model definitions as optimized SQL for BigQuery, ClickHouse, Databricks, Dremio, DuckDB/MotherDuck, MySQL, Postgres, and Snowflake. Query using business concepts — dimensions, measures, and metrics — instead of raw SQL.

Analytics as Code — Define your analytical semantics in version-controlled YAML, compile to dialect-specific SQL, and execute against live databases, all through a single API. No BI tool in the middle: the full loop from declarative model to query results is programmable, reviewable, and reproducible.

Companion Project: OrionBelt Analytics — an ontology-based MCP server that analyzes database schemas and generates RDF/OWL ontologies. Together they let AI assistants navigate your data landscape through ontologies and compile safe, dialect-aware analytical SQL.


Try it in 30 Seconds

Option A: Live Demo (no install)

Open the Live Demo — Gradio UI with a pre-loaded example model. Paste a query, pick a dialect, see SQL instantly.

API explorer: Swagger UI | ReDoc

Want to try the PostgreSQL wire surface? Cloud Run is HTTPS-only, so the public demo can't expose ports 5432 (pgwire) or 8815 (Flight SQL). Spin the same demo up locally in two commands — it includes the baked-in orionbelt_1_commerce DuckDB dataset and the full OBSQL surface:

docker run --rm -d --name orionbelt-demo \
  -p 8080:8080 -p 5432:5432 -p 8815:8815 \
  -e PGWIRE_ENABLED=true \
  -e FLIGHT_ENABLED=true \
  ralforion/orionbelt-api:latest

# REST + Gradio UI:   http://localhost:8080/ui
# pgwire (any psql / DBeaver / Tableau / Power BI):
psql "host=localhost port=5432 user=obsl dbname=orionbelt_1_commerce sslmode=disable" \
  -c 'SELECT "Client Name", "Total Sales" LIMIT 5'
# Flight SQL smoke test:
uv run python examples/obsql.py 'SELECT "Client Name", "Total Sales" LIMIT 5'

docker stop orionbelt-demo

The container ships with PGWIRE_AUTH_MODE=trust (default), so it's safe for localhost but not safe to expose to the public internet — SCRAM/password auth is on the roadmap before that becomes the recommended pattern.

Option B: Google Colab (no install)

Open In Colab — Interactive notebook with TPC-H data: explore the model, compile queries across dialects, execute against DuckDB, and see results. Requires Python 3.12 runtime.

Option C: Install from PyPI

pip install orionbelt-semantic-layer

Then paste into a Python REPL:

from orionbelt.parser import ReferenceResolver, TrackedLoader
from orionbelt.compiler.pipeline import CompilationPipeline
from orionbelt.models.query import QueryObject, QuerySelect

model_yaml = """
version: "1.0"
dataObjects:
  Orders:
    code: ORDERS
    columns:
      Price: { code: PRICE, abstractType: float }
      Country: { code: COUNTRY, abstractType: string }
dimensions:
  Country:
    dataObject: Orders
    column: Country
    resultType: string
measures:
  Total Revenue:
    resultType: float
    aggregation: sum
    expression: "{[Orders].[Price]}"
"""

loader = TrackedLoader()
raw, source_map = loader.load_string(model_yaml)
resolver = ReferenceResolver()
model, result = resolver.resolve(raw, source_map)

query = QueryObject(select=QuerySelect(dimensions=["Country"], measures=["Total Revenue"]))
pipeline = CompilationPipeline()
output = pipeline.compile(query, model, "postgres")
print(output.sql)

Output:

SELECT
  "Orders"."COUNTRY" AS "Country",
  CAST(SUM("Orders"."PRICE") AS NUMERIC(18, 2)) AS "Total Revenue"
FROM ORDERS AS "Orders"
GROUP BY "Orders"."COUNTRY"

No env file needed — the compilation pipeline is stateless.

Start the servers:

orionbelt-api                              # REST API on :8000 (Swagger UI at /docs, Gradio UI at /ui)
orionbelt-ui                               # standalone Gradio UI on :7860 (connects to API on :8000)
FLIGHT_ENABLED=true orionbelt-api          # API + Arrow Flight SQL on :8815 (DBeaver, Tableau, Power BI)
PGWIRE_ENABLED=true orionbelt-api          # API + PostgreSQL wire on :5432 (Tableau, DBeaver, Superset, psql, Dremio source)

Option C2: Install with uv

uv pip install orionbelt-semantic-layer
uv run orionbelt-api                       # REST API on :8000 (Swagger UI at /docs, Gradio UI at /ui)
uv run orionbelt-ui                        # standalone Gradio UI on :7860 (connects to API on :8000)
FLIGHT_ENABLED=true uv run orionbelt-api   # API + Arrow Flight SQL on :8815 (DBeaver, Tableau, Power BI)
PGWIRE_ENABLED=true uv run orionbelt-api   # API + PostgreSQL wire on :5432 (Tableau, DBeaver, Superset, psql, Dremio source)

Smoke-test the Flight SQL surface without a BI tool:

uv run python examples/obsql.py 'SELECT version()'
uv run python examples/obsql.py 'SHOW TABLES'
uv run python examples/obsql.py 'SELECT "Region", "Total Sales" FROM sales LIMIT 5'

# Multi-model deployment? Pick the model with -m:
uv run python examples/obsql.py -m sales 'SHOW TABLES'
uv run python examples/obsql.py --list   # discover loaded models via REST

Try OBSQL in 30 seconds

OBSQL — OrionBelt Semantic QL — is the SQL surface BI tools and humans actually write. Bare labels, MEASURE() markers, or matching aggregate wrappers; aggregation-match validation; WITH ROLLUP / WITH CUBE; no escape hatch to raw warehouse SQL. Same language over Arrow Flight SQL (v2.4+) and PostgreSQL wire (v2.5+):

PGWIRE_ENABLED=true uv run orionbelt-api &

# Every BI tool already ships a Postgres ODBC/JDBC driver — point yours at :5432
psql "host=localhost port=5432 user=obsl dbname=sales sslmode=disable" \
  -c 'SELECT "Region", "Total Sales" LIMIT 5'

# All three measure forms compile to the same vendor SQL:
psql "..." -c 'SELECT "Region", "Total Sales"        FROM sales LIMIT 5'  # bare
psql "..." -c 'SELECT "Region", MEASURE("Total Sales") FROM sales LIMIT 5'  # explicit marker
psql "..." -c 'SELECT "Region", SUM("Total Sales")   FROM sales LIMIT 5'  # matching aggregate

See the OBSQL reference for the full grammar.
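
The aggregation-match rule can be illustrated with a small sketch. This is not OrionBelt's parser: the `MEASURES` catalog, the regular expressions, and `resolve_measure` are hypothetical names introduced here, and the sketch only handles a single double-quoted label per select item.

```python
import re

# Hypothetical measure catalog: label -> declared aggregation (not OrionBelt's API).
MEASURES = {"Total Sales": "sum"}

# A bare quoted label, e.g. "Total Sales"
_BARE = re.compile(r'^\s*"([^"]+)"\s*$')
# A wrapped label, e.g. MEASURE("Total Sales") or SUM("Total Sales")
_WRAPPED = re.compile(r'^\s*([A-Za-z_]+)\s*\(\s*"([^"]+)"\s*\)\s*$')


def resolve_measure(item: str) -> str:
    """Return the measure label for one select item, or raise ValueError."""
    func = None
    bare = _BARE.match(item)
    if bare:
        label = bare.group(1)
    else:
        wrapped = _WRAPPED.match(item)
        if wrapped is None:
            raise ValueError(f"unsupported select item: {item!r}")
        func, label = wrapped.group(1), wrapped.group(2)

    if label not in MEASURES:
        raise ValueError(f"unknown measure: {label!r}")

    # A bare label and MEASURE() both defer to the declared aggregation.
    if func is None or func.upper() == "MEASURE":
        return label

    # An explicit aggregate wrapper must match the declared aggregation.
    declared = MEASURES[label]
    if func.lower() != declared:
        raise ValueError(
            f"{func.upper()} does not match declared aggregation "
            f"{declared!r} for {label!r}"
        )
    return label


for form in ('"Total Sales"', 'MEASURE("Total Sales")', 'SUM("Total Sales")'):
    print(form, "->", resolve_measure(form))
```

All three forms resolve to the same measure, while a mismatched wrapper such as `AVG("Total Sales")` is rejected instead of silently changing the aggregation.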

Option D: Docker

Stage 1 — Zero-config start (models loaded later via API or UI):

docker run -p 8080:8080 ralforion/orionbelt-api

Open http://localhost:8080/docs to explore the API.

Stage 2 — Realistic setup with docker compose:

# docker-compose.yml
services:
  api:
    image: ralforion/orionbelt-api:2.6.0
    ports: ["8080:8080"]
    env_file: .env
    volumes:
      - ./models:/app/models:ro
    environment:
      MODEL_FILE: /app/models/my-model.obml.yml

  ui:
    image: ralforion/orionbelt-ui:2.6.0
    ports: ["7860:7860"]
    environment:
      API_BASE_URL: http://api:8080

Then start both services:

docker compose up -d

See .env.template for the full environment variable reference.

Docker notes:

  • API_SERVER_HOST is already 0.0.0.0 inside the container — no override needed.
  • MCP via stdio does not work in Docker. Use the MCP HTTP client for containerized deployments.
  • Mount models to /app/models (or any path) and set MODEL_FILE to pre-load on startup.
  • For production, pin a version tag (:2.6.0) rather than :latest.

Claude Desktop / MCP

The MCP server is a separate thin client that delegates to the REST API:

orionbelt-semantic-layer-mcp

Add to your Claude Desktop claude_desktop_config.json:

{
  "mcpServers": {
    "orionbelt": {
      "command": "uvx",
      "args": ["orionbelt-semantic-layer-mcp"]
    }
  }
}

Also works with Copilot, Cursor, and Windsurf. See the MCP repo for full setup options.


Why OrionBelt?

| | OrionBelt | dbt Semantic Layer | Cube | Malloy |
|---|---|---|---|---|
| Model format | YAML-only (OBML) | Python + YAML | JavaScript | Custom DSL |
| SQL generation | AST-based (injection-safe) | String templates | String templates | Compiler |
| Multi-dialect | 8 dialects, no runtime lock-in | dbt Cloud required | Cube Cloud or self-host | BigQuery-focused |
| Multi-fact queries | Star Schema + CFL planner (fan-trap prevention) | Limited | Pre-aggregations | Automatic joins |
| Integration surface | REST API + MCP + Gradio UI | dbt Cloud API | REST + GraphQL | VS Code extension |
| Deployment | Self-host anywhere, single binary | SaaS (Cloud) | SaaS or self-host | Library |
| License | BSL 1.1 (converts to Apache 2.0) | Apache 2.0 | AGPL / proprietary | MIT |

Features

Semantic Modeling

  • OBML Format — YAML-based semantic models with data objects, dimensions, measures, metrics, and joins
  • Cross-Schema Queries — model data objects across multiple databases and schemas in a single model
  • Static Model Filters — mandatory WHERE conditions baked into the model, auto-applied with join extension
  • OBSL Graph & SPARQL — RDF graph export and read-only SPARQL querying for every loaded model
  • OSI Interoperability — bidirectional conversion between OBML and Open Semantic Interchange format

SQL Compilation

  • 8 SQL Dialects — BigQuery, ClickHouse, Databricks, Dremio, DuckDB/MotherDuck, MySQL, Postgres, Snowflake
  • AST-Based Generation — custom SQL AST ensures correct, injection-safe SQL (not string templates)
  • Star Schema & CFL — automatic join resolution with Composite Fact Layer for multi-fact queries
  • Data Types & Precision — automatic CAST wrapping with dialect-specific type rendering and precision clamping
  • Display Formatting — number format patterns (#,##0.00, 0.00%) on measures/metrics with locale-aware rendering
  • Timezone Settings — auto-detect database session timezone with defaultTimezone fallback and ISO 8601 serialization
  • sqlglot Validation — post-generation syntax check across all supported dialects
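
To make the AST-based approach concrete, here is a minimal sketch of why rendering from typed nodes is safer than string templates. It is not OrionBelt's AST: `Column`, `Sum`, `quote`, `render`, and the two-entry `QUOTES` map are hypothetical, and real dialect handling covers far more than quote characters.

```python
from dataclasses import dataclass

# Identifier quote character per dialect (simplified assumption, two dialects only).
QUOTES = {"postgres": '"', "mysql": "`"}


@dataclass(frozen=True)
class Column:
    table: str
    name: str


@dataclass(frozen=True)
class Sum:
    operand: Column


def quote(identifier: str, dialect: str) -> str:
    """Quote an identifier, doubling any embedded quote character."""
    q = QUOTES[dialect]
    return q + identifier.replace(q, q + q) + q


def render(node, dialect: str) -> str:
    """Render an AST node to SQL text for the given dialect."""
    if isinstance(node, Column):
        return f"{quote(node.table, dialect)}.{quote(node.name, dialect)}"
    if isinstance(node, Sum):
        return f"SUM({render(node.operand, dialect)})"
    raise TypeError(f"unknown node: {node!r}")


expr = Sum(Column("Orders", "PRICE"))
print(render(expr, "postgres"))  # SUM("Orders"."PRICE")
print(render(expr, "mysql"))     # SUM(`Orders`.`PRICE`)

# A hostile name stays inside the quoted identifier instead of becoming SQL.
hostile = Column("Orders", 'x" OR 1=1 --')
print(render(hostile, "postgres"))  # "Orders"."x"" OR 1=1 --"
```

Because user-supplied names only ever enter the output through `quote`, there is no code path where raw input is concatenated into the statement.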

Integration Surface

  • REST API — FastAPI endpoints for model management, validation, compilation, and execution
  • MCP Server: separate thin client for Claude, Copilot, Cursor, Windsurf
  • AI Integrations — LangChain, OpenAI Agents SDK, CrewAI, Google ADK, Vercel AI SDK, n8n, ChatGPT
  • Gradio UI — interactive web interface for model editing, query testing, and ER diagrams
  • DB-API 2.0 + Flight SQL — PEP 249 drivers and Arrow Flight SQL server for DBeaver, Tableau, Power BI; ships with examples/obsql.py, a tiny terminal CLI for testing the Flight surface without a BI tool
  • PostgreSQL Wire Protocol (v2.5.0+) — native Postgres-protocol surface on :5432. Every BI tool already ships a Postgres ODBC/JDBC driver, so the user side is "point your existing connection at OBSL and go" — Tableau, DBeaver, Superset, Power BI, plain psql, and Dremio as a federated Postgres source (Dremio → OBSL → optionally back to Dremio's lakehouse, full circle)

Agent-Facing API

  • Model Health on Load — every model load returns a health block with orphan dataObjects, fan-trap risks, and unreachable dimensions — agents skip the defensive second round trip
  • Query Plan Endpoint: POST /query/plan returns the planner's understanding (planner choice, physical tables, join path, would_compile) without compiling SQL or executing; opt-in include_database_explain adds the warehouse's raw EXPLAIN
  • Structured Warnings — every warnings list across the API uses a stable {code, severity, message, path, hint, context} shape with a documented code taxonomy; agents branch on codes instead of parsing messages
  • Fuzzy /find Recovery — when a search produces no exact or synonym hits, deterministic Levenshtein + trigram fallback returns near-miss candidates with scores and reasons
  • Model Examples — optional OBML examples: block of canonical queries; GET /examples (with ?intent= filtering) gives agents one-round-trip discovery of what a model is designed to answer
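
The fuzzy fallback described above can be sketched roughly as follows. This is an illustration of combining Levenshtein distance with trigram similarity, not the actual `/find` implementation: the function names, the thresholds (`max_distance=2`, `min_similarity=0.3`), and the scoring formula are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def trigrams(text: str) -> set[str]:
    """Lower-cased character trigrams, padded so word edges count."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def near_misses(term, labels, max_distance=2, min_similarity=0.3):
    """Return near-miss candidates with a score and the reason they matched."""
    results = []
    for label in labels:
        distance = levenshtein(term.lower(), label.lower())
        if distance <= max_distance:
            longest = max(len(term), len(label), 1)
            results.append(
                {"label": label, "score": 1 - distance / longest, "reason": "levenshtein"}
            )
            continue
        similarity = trigram_similarity(term, label)
        if similarity >= min_similarity:
            results.append({"label": label, "score": similarity, "reason": "trigram"})
    # Deterministic ordering: best score first, ties broken by label.
    return sorted(results, key=lambda r: (-r["score"], r["label"]))


for hit in near_misses("Total Revnue", ["Total Revenue", "Country", "Revenue Total"]):
    print(hit["label"], hit["reason"])
```

Edit distance catches typos, while trigram overlap catches reordered words that edit distance would score poorly; returning the reason lets a caller decide how much to trust each candidate.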

Freshness-Driven Result Cache

  • Source-level freshness contracts — declare refresh: blocks on dataObject entries (interval / heartbeat / static); the cache derives query TTLs from the contracts of the physical tables a query touched, not from caller guesses
  • Heartbeat invalidation — one POST /v1/heartbeat to a physical table invalidates every cached query that depends on it, across every dataObject and session
  • DuckDB metadata + Parquet results — file-backed cache with type-precise serialization, lazy expiration, LRU capacity sweep; opt-in via CACHE_BACKEND=file
  • Inverts the Cube/dbt/Looker pattern — contracts live on the source, not the semantic abstraction; one source of truth across every cube/explore/saved query reading the table

Developer Experience

  • Source-Position Errors — validation errors report exact YAML line and column
  • ER Diagrams — interactive Mermaid diagrams with zoom and download (MD/PNG/Turtle)
  • Session Management — TTL-scoped sessions with thread-safe model isolation
  • JSON Schema — full OBML and query schema for IDE autocompletion (yaml-language-server)

Example

Define a Semantic Model (OBML)

# yaml-language-server: $schema=https://raw.githubusercontent.com/ralfbecher/orionbelt-semantic-layer/main/schema/obml-schema.json
version: "1.0"
dataObjects:
  Customers:
    code: CUSTOMERS
    database: WAREHOUSE
    schema: PUBLIC
    columns:
      Customer ID: { code: CUSTOMER_ID, abstractType: string }
      Country:     { code: COUNTRY, abstractType: string }

  Orders:
    code: ORDERS
    database: WAREHOUSE
    schema: PUBLIC
    columns:
      Order Customer ID: { code: CUSTOMER_ID, abstractType: string }
      Price:             { code: PRICE, abstractType: float }
      Quantity:          { code: QUANTITY, abstractType: int }
    joins:
      - joinType: many-to-one
        joinTo: Customers
        columnsFrom: [Order Customer ID]
        columnsTo: [Customer ID]

dimensions:
  Country:
    dataObject: Customers
    column: Country
    resultType: string

measures:
  Revenue:
    resultType: float
    aggregation: sum
    expression: "{[Orders].[Price]} * {[Orders].[Quantity]}"
    dataType: "decimal(18, 2)"

Compile via REST API

# Create a session
curl -s -X POST http://localhost:8080/v1/sessions | jq .session_id
# -> "a1b2c3d4"

# Load the model
curl -s -X POST http://localhost:8080/v1/sessions/a1b2c3d4/models \
  -H "Content-Type: application/json" \
  -d '{"model_yaml": "..."}' | jq .model_id
# -> "abcd1234"

# Compile a query
curl -s -X POST http://localhost:8080/v1/sessions/a1b2c3d4/query/sql \
  -H "Content-Type: application/json" \
  -d '{"model_id":"abcd1234","query":{"select":{"dimensions":["Country"],"measures":["Revenue"]}},"dialect":"postgres"}' \
  | jq -r .sql

Generated SQL (Postgres):

SELECT
  "Customers"."COUNTRY" AS "Country",
  CAST(SUM("Orders"."PRICE" * "Orders"."QUANTITY") AS NUMERIC(18, 2)) AS "Revenue"
FROM WAREHOUSE.PUBLIC.ORDERS AS "Orders"
LEFT JOIN WAREHOUSE.PUBLIC.CUSTOMERS AS "Customers"
  ON "Orders"."CUSTOMER_ID" = "Customers"."CUSTOMER_ID"
GROUP BY "Customers"."COUNTRY"

Change dialect to bigquery, clickhouse, databricks, dremio, duckdb, mysql, or snowflake for dialect-specific SQL.
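
The same three REST calls can be scripted from Python with only the standard library. This sketch mirrors the curl commands above: the endpoint paths and the `session_id`, `model_id`, and `sql` response fields are taken from them, while the helper names (`compile_payload`, `post_json`, `compile_sql`) are made up here and error handling is omitted.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port as the curl examples


def compile_payload(model_id, dimensions, measures, dialect="postgres"):
    """Build the JSON body for POST /v1/sessions/{session_id}/query/sql."""
    return {
        "model_id": model_id,
        "query": {
            "select": {"dimensions": list(dimensions), "measures": list(measures)}
        },
        "dialect": dialect,
    }


def post_json(path, body=None):
    """POST an optional JSON body and return the decoded JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {} if body is None else {"Content-Type": "application/json"}
    request = urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def compile_sql(model_yaml, dimensions, measures, dialect="postgres"):
    """Create a session, load the model, and compile one query to SQL."""
    session_id = post_json("/v1/sessions")["session_id"]
    model_id = post_json(
        f"/v1/sessions/{session_id}/models", {"model_yaml": model_yaml}
    )["model_id"]
    payload = compile_payload(model_id, dimensions, measures, dialect)
    return post_json(f"/v1/sessions/{session_id}/query/sql", payload)["sql"]


# Building the payload needs no running server:
print(json.dumps(compile_payload("abcd1234", ["Country"], ["Revenue"])))
```

With the API running, `compile_sql(model_yaml, ["Country"], ["Revenue"], "duckdb")` would request the same query in another dialect.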


Gradio UI

[Screenshot: OrionBelt Gradio UI showing side-by-side OBML model editor and compiled SQL output]

  • SQL Compiler — side-by-side OBML model and query editors with syntax highlighting, 8 dialect selector, one-click compilation with formatted SQL output and query explain
  • Query Execution — execute compiled queries against a connected database, view results with locale-aware number formatting, response metadata panel, TSV download and clipboard copy (requires QUERY_EXECUTE=true)
  • ER Diagram — interactive Mermaid ER diagram with zoom, column toggle, and download (MD/PNG/Turtle)
  • Ontology Graph — interactive vis-network visualization of the OBML graph (data objects, dimensions, measures, metrics, joins) with toggleable layers and adjustable node spacing
  • Editor Toolbar — clear, undo, redo, upload, download, and copy buttons on all code editors
  • OSI Import/Export — convert between OBML and OSI formats
  • Dark/Light Mode — toggle via header button, state persisted across sessions

[Screenshot: OrionBelt Ontology Graph tab showing the semantic model as an interactive network of data objects, dimensions, measures, metrics, and join relationships]

Embedded mode — the UI is mounted at /ui on the API server:

pip install orionbelt-semantic-layer && orionbelt-api
# -> UI at http://localhost:8000/ui

Standalone mode — run API and UI as separate processes:

orionbelt-api                                              # API on :8000
orionbelt-ui                                               # UI on :7860 (connects to API on :8000)
API_BASE_URL=http://remote-api:8080 orionbelt-ui           # point UI to a remote API

Documentation

| Topic | Link |
|---|---|
| Full docs site | ralforion.com/orionbelt-semantic-layer |
| Installation | getting-started/installation |
| Quick Start | getting-started/quickstart |
| Docker & Deployment | getting-started/docker |
| Development | getting-started/development |
| OBML Model Format | guide/model-format |
| Query Language | guide/query-language |
| SQL Dialects | guide/dialects |
| Period-over-Period Metrics | guide/period-over-period |
| Trend Analysis (rank / lag / lead / ntile, partitioned MAs, statistical aggregates) | guide/trend-analysis |
| Compilation Pipeline | guide/compilation |
| OBSL Graph & SPARQL | guide/obsl |
| Gradio UI | guide/ui |
| AI Integrations | guide/integrations |
| OSI Interoperability | guide/osi |
| REST API Endpoints | api/endpoints |
| DB-API Drivers & Flight SQL | drivers |
| Architecture | reference/architecture |
| Configuration | reference/configuration |
| Sales Model Walkthrough | examples/sales-model |
| Multi-Dialect Output | examples/multi-dialect |
| Multi-Fact: Sales & Returns | examples/multi-fact |
| TPC-DS Benchmark | examples/tpcds |
| Quickstart Notebook | examples/quickstart.ipynb |
| Comparison: Overview | comparison/ |
| Comparison: vs. dbt Semantic Layer | comparison/dbt |
| Comparison: vs. Malloy | comparison/malloy |
| Comparison: vs. LookML / Looker | comparison/lookml |
| Comparison: vs. Cube | comparison/cube |
| Comparison: vs. AtScale | comparison/atscale |

Status & Roadmap

| Status | Area |
|---|---|
| Shipped | 8 SQL dialects, REST API, MCP server, Gradio UI, DB-API drivers, Flight SQL, PostgreSQL wire protocol (v2.5.0+) for Tableau / DBeaver / Superset / Power BI / psql / Dremio as a federated Postgres source, OBSL/SPARQL, OSI v0.2 interop (v2.6.0+) with bidirectional schema validation, AI integrations (LangChain, CrewAI, ADK, etc.), model inheritance & extends, data types & numerical precision, timezone settings, grain & filter context overrides, Trend Analysis (v2.6.0+): partitioned rolling windows, MetricType.WINDOW for rank/lag/lead/ntile, 9 statistical aggregates (CORR, COVAR_, REGR_, STDDEV_, VAR_) |
| In progress | Additional dialects, CLI tool, pgwire SCRAM/password auth (unified auth subsystem) |
| Planned | Authentication & API tokens, CLI for automation & CI/CD, DDL view generation (CREATE VIEW from queries), additional BI tool integrations, pre-aggregation / materialization layer |

Companion Project

OrionBelt Analytics

An ontology-based MCP server that analyzes relational database schemas and generates RDF/OWL ontologies. Together with OrionBelt Semantic Layer, it enables AI assistants to navigate your data landscape through ontologies and compile safe, dialect-aware analytical SQL.

[Diagram: OrionBelt Analytics generating ontologies from database schemas, feeding into OrionBelt Semantic Layer for SQL compilation]


Development

Contributing to OrionBelt or running from source:

git clone https://github.com/ralfbecher/orionbelt-semantic-layer.git
cd orionbelt-semantic-layer
uv sync                           # install all deps (dev, docs, ui, flight, drivers)
uv run orionbelt-api              # start API on :8000

# Quality
uv run pytest                     # run tests
uv run ruff check src/            # lint
uv run ruff format src/ tests/    # format
uv run mypy src/                  # type check

# Docs
uv sync --extra docs && uv run mkdocs serve  # docs on :8080

License

Copyright 2025 RALFORION d.o.o.

Licensed under the Business Source License 1.1. The Licensed Work will convert to Apache License 2.0 on 2030-03-16.

By contributing to this project, you agree to the Contributor License Agreement.


RALFORION d.o.o.
