SQLGlot-based semantic layer with multi-format adapter support
Project description
Sidemantic
SQLGlot-based semantic layer with multi-format adapter support.
Features
- Simple API: Define metrics once, use them everywhere
- SQL query interface: Write familiar SQL that gets rewritten to use semantic layer
- Automatic joins: Define relationships, joins happen automatically via graph traversal
- Multi-format adapters: Import/export from Cube, MetricFlow (dbt), and native YAML
- Rich metric types: Aggregations, ratios, formulas, cumulative, time comparisons, conversions
- Auto-detected dependencies: No manual dependency declarations needed
- SQLGlot-powered: Dialect-agnostic SQL generation with transpilation support
- Multi-hop joins: Automatic 2+ hop join discovery with intermediate models
- Type-safe: Pydantic models with validation
Quick Start
Define your semantic layer (YAML)
# semantic_layer.yml
# yaml-language-server: $schema=./sidemantic-schema.json
models:
- name: orders
table: orders
primary_key: order_id
relationships:
- name: customer
type: many_to_one
foreign_key: customer_id
dimensions:
- name: status
type: categorical
sql: status
- name: order_date
type: time
sql: created_at
granularity: day
metrics:
- name: revenue
agg: sum
sql: amount
- name: order_count
agg: count
# Graph-level metrics (dependencies auto-detected!)
metrics:
- name: total_revenue
sql: orders.revenue
Query with SQL
from sidemantic import SemanticLayer
# Load semantic layer
layer = SemanticLayer.from_yaml("semantic_layer.yml")
# Query with familiar SQL - automatically rewritten
result = layer.sql("""
SELECT revenue, status
FROM orders
WHERE status = 'completed'
""")
df = result.fetchdf()
Alternative: Python API
from sidemantic import SemanticLayer, Model, Metric, Dimension, Relationship
layer = SemanticLayer()
orders = Model(
name="orders",
table="orders",
primary_key="order_id",
relationships=[
Relationship(name="customer", type="many_to_one", foreign_key="customer_id")
],
dimensions=[
Dimension(name="status", type="categorical", sql="status"),
Dimension(name="order_date", type="time", sql="created_at", granularity="day"),
],
metrics=[
Metric(name="revenue", agg="sum", sql="amount"),
Metric(name="order_count", agg="count"),
]
)
layer.add_model(orders)
# Programmatic query
result = layer.query(
metrics=["orders.revenue"],
dimensions=["orders.status"],
filters=["orders.status = 'completed'"]
)
df = result.fetchdf()
Editor Support
Generate JSON Schema for autocomplete in VS Code, IntelliJ, etc:
uv run python -m sidemantic.schema
Add to your YAML files:
# yaml-language-server: $schema=./sidemantic-schema.json
Adapters
Import
from sidemantic.adapters import CubeAdapter, MetricFlowAdapter, SidemanticAdapter
# From Cube
cube_adapter = CubeAdapter()
graph = cube_adapter.parse("cube_schema.yml")
# From MetricFlow (dbt)
mf_adapter = MetricFlowAdapter()
graph = mf_adapter.parse("semantic_models.yml")
# From native Sidemantic
native_adapter = SidemanticAdapter()
graph = native_adapter.parse("semantic_layer.yml")
Export
# Export to Cube
cube_adapter.export(sl.graph, "output_cube.yml")
# Export to MetricFlow
mf_adapter.export(sl.graph, "output_metricflow.yml")
# Export to native
sl.to_yaml("output_sidemantic.yml")
Full round-trip support: Sidemantic ↔ Cube ↔ MetricFlow
Advanced Features
Complex Metrics
Define ratios, formulas, cumulative metrics with automatic dependency detection:
models:
- name: orders
table: orders
primary_key: order_id
metrics:
# Model-level aggregations
- name: revenue
agg: sum
sql: amount
- name: completed_revenue
agg: sum
sql: amount
filters: ["status = 'completed'"]
# Graph-level metrics
metrics:
# Simple reference (dependencies auto-detected)
- name: total_revenue
sql: orders.revenue
# Ratio
- name: conversion_rate
type: ratio
numerator: orders.completed_revenue
denominator: orders.revenue
# Derived (dependencies auto-detected from formula!)
- name: profit_margin
type: derived
sql: "(revenue - cost) / revenue"
# Cumulative
- name: running_total
type: cumulative
sql: orders.revenue
window: "7 days"
Python alternative
Metric(name="total_revenue", sql="orders.revenue")
Metric(name="conversion_rate", type="ratio",
numerator="orders.completed_revenue",
denominator="orders.revenue")
Metric(name="profit_margin", type="derived",
sql="(revenue - cost) / revenue")
Metric(name="running_total", type="cumulative",
sql="orders.revenue", window="7 days")
Automatic Joins
Define relationships once, query across models:
models:
- name: orders
table: orders
primary_key: order_id
relationships:
- name: customer
type: many_to_one
foreign_key: customer_id
- name: customers
table: customers
primary_key: customer_id
relationships:
- name: region
type: many_to_one
foreign_key: region_id
Query spans 2 hops automatically:
# Automatically joins orders -> customers -> regions
result = layer.sql("""
SELECT orders.revenue, regions.region_name
FROM orders
""")
Relationship Types
Use explicit, readable relationship types:
- many_to_one: Many records in THIS table → one record in OTHER table (e.g., orders → customer)
- one_to_many: One record in THIS table → many records in OTHER table (e.g., customer → orders)
- one_to_one: One record in THIS table → one record in OTHER table (e.g., order → invoice)
Test Coverage
- 117 passing tests
- Real DuckDB integration
- SQL query rewriting
- Round-trip adapter tests
- Multi-hop join verification
- Formula parsing validation
- Automatic dependency detection
Run tests:
uv run pytest -v
Status
See docs/STATUS.md for detailed implementation status.
Completed:
- ✅ SQL query interface with automatic rewriting
- ✅ Core semantic layer with SQLGlot generation
- ✅ Relationship-based automatic joins (many_to_one, one_to_many, one_to_one)
- ✅ Multi-hop join discovery
- ✅ Derived metrics with automatic dependency detection
- ✅ Native YAML format with import/export
- ✅ Cube and MetricFlow adapters (import/export)
- ✅ DuckDB integration
- ✅ Unified metrics terminology (no more measures/metrics confusion!)
In Progress:
- ⚠️ Cumulative metrics (basic structure exists, needs subquery pattern)
Future:
- Query optimization
- Pre-aggregations/caching
- LookML adapter (requires grammar parser)
Examples
See examples/ directory:
sql_query_example.py- SQL query interface demonstrationbasic_example.py- Core usage patternsexport_example.py- Multi-format export demonstrationsidemantic/orders.yml- Native YAML examplecube/orders.yml- Cube format examplemetricflow/semantic_models.yml- MetricFlow format example
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sidemantic-0.2.1.tar.gz.
File metadata
- Download URL: sidemantic-0.2.1.tar.gz
- Upload date:
- Size: 68.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.30
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0c48c92debfc6d32cd7341e2839cab87277e3e8928f4ccd3d6b7b84430e86185
|
|
| MD5 |
9b6b8dd3feec33e74402607fab9e42b4
|
|
| BLAKE2b-256 |
f5b4e834f6afe6f7351cb221a87dea5093f4b2f19c1203eb3079bcbe8a849399
|
File details
Details for the file sidemantic-0.2.1-py3-none-any.whl.
File metadata
- Download URL: sidemantic-0.2.1-py3-none-any.whl
- Upload date:
- Size: 77.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.30
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
053e831a2e877092f75b8769864fb7ef1bfb59add20b8964924004bf19c346ee
|
|
| MD5 |
b04767ae5b398fad83d70a5b51a53d2d
|
|
| BLAKE2b-256 |
f3fbdc32cc49d6c481642ce73b6f10bf49fc4fc9b24b87d32320e856792f8668
|