DuckDB semantic view loader and query compiler
semduck
Portable semantic view runtime for DuckDB, implemented in Python.
Full documentation lives in the GitHub Pages docs site for this repository. This README stays focused on the package-facing quickstart.
Supported baseline:
- Python 3.11 to 3.13
- DuckDB 1.4+
- dbt-duckdb 1.9.x for the optional dbt plugin path
Install
pip install semduck
Install the dbt support extra only if you are registering the DuckDB plugin in a dbt-duckdb project:
pip install "semduck[dbt]"
Supported Patterns
- Standalone Python package and CLI:
  - YAML semantic specs
  - semantic DDL
- dbt interface through dbt-semduck:
  - inline semantic DDL only
YAML is supported for standalone Python and CLI usage. The dbt interface deliberately uses DDL instead of YAML-in-dbt so semduck stays dbt-agnostic.
Quickstart
Start by initializing a registry. This creates a semantic schema and several supporting tables in the DuckDB database you provide.
semduck init --db demo.duckdb
Load a semantic definition from YAML:
semduck load --db demo.duckdb --file orders_semantic.yaml
semduck compile --db demo.duckdb --request "orders_semantic dimensions region metrics total_revenue"
Load a semantic definition from DDL:
semduck load --db demo.duckdb --format ddl --file path/to/orders_semantic.sql
semduck query --db demo.duckdb --request "orders_semantic dimensions region metrics total_revenue"
The CLI accepts --format auto|yaml|ddl for check and load. In auto mode it uses the file extension or the first non-empty line to infer the format.
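The auto mode described above can be pictured with a small sketch. The function name `infer_format`, the extension sets, and the check for a leading `create` keyword are assumptions made for illustration; semduck's actual detection rules may differ.

```python
from pathlib import Path

# Hypothetical extension mappings; semduck's real lists may differ.
YAML_EXTENSIONS = {".yaml", ".yml"}
DDL_EXTENSIONS = {".sql", ".ddl"}


def infer_format(path: str, text: str) -> str:
    """Guess 'yaml' or 'ddl' from the file extension, then from the content."""
    suffix = Path(path).suffix.lower()
    if suffix in YAML_EXTENSIONS:
        return "yaml"
    if suffix in DDL_EXTENSIONS:
        return "ddl"
    # Fall back to the first non-empty line of the file.
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            # Assumption: semantic DDL starts with a CREATE statement.
            return "ddl" if stripped.lower().startswith("create") else "yaml"
    raise ValueError(f"cannot infer format for empty input: {path}")
```

When the guess could be ambiguous, passing --format yaml or --format ddl explicitly avoids relying on inference at all.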
Configure an Ollama-backed ask provider and run the ask CLI:
semduck ask \
--db examples/dbt_example/jaffle_shop.duckdb \
--config packages/semduck/examples/ask_ollama_config.yaml \
--llm-log-dir .semduck/llm-logs \
--question "What is total revenue by customer name?" \
--sql --table
Configure an OpenAI-compatible local server and run the ask CLI:
semduck ask \
--db examples/dbt_example/jaffle_shop.duckdb \
--config packages/semduck/examples/ask_openai_compatible_config.yaml \
--llm-log-dir .semduck/llm-logs \
--question "What is total revenue by customer name?" \
--sql --table
Set llm.log_dir in the config file to enable persistent trace logging by default, or use --llm-log-dir to override it per run. Pass --no-llm-log to disable logging even when the config file specifies a directory.
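The precedence between the config value and the two flags can be summarized in a few lines. This is only a sketch of the rule as stated above; `resolve_llm_log_dir` and its parameter names are hypothetical and not part of semduck's API.

```python
from typing import Optional


def resolve_llm_log_dir(
    config_log_dir: Optional[str],
    cli_log_dir: Optional[str] = None,
    no_llm_log: bool = False,
) -> Optional[str]:
    """Return the directory to log to, or None when logging is off."""
    # --no-llm-log wins over everything else.
    if no_llm_log:
        return None
    # --llm-log-dir overrides llm.log_dir from the config file.
    if cli_log_dir is not None:
        return cli_log_dir
    # Otherwise fall back to the config value, which may itself be unset.
    return config_log_dir
```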
semduck ask uses a planner stage to build the semantic request and, when --summary is requested, a separate summary stage to explain the executed rows. Advanced users can configure different models for those jobs with llm.tasks.ask_plan and llm.tasks.ask_summary. If task-specific config is used, the summary task only needs to be configured when summary output is requested.
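One way to read the rule about task-specific config is as a small validation step. The sketch below assumes the `llm.tasks` mapping has already been parsed into a dict and that the planner task is always needed once task-specific config is in use; the helper name `missing_ask_tasks` is hypothetical.

```python
def missing_ask_tasks(tasks: dict, want_summary: bool) -> list[str]:
    """List the task entries that still need configuring for this run."""
    # No task-specific config at all: nothing task-specific is required.
    if not tasks:
        return []
    required = ["ask_plan"]
    # The summary task matters only when summary output is requested.
    if want_summary:
        required.append("ask_summary")
    return [name for name in required if name not in tasks]
```

With only `ask_plan` configured, a run without --summary has nothing missing, while a run with --summary would report `ask_summary`.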
During semduck ask, the CLI now prints one-line stage updates such as planning, compile, execution, and summarization to stderr. Final results remain on stdout, so JSON output stays machine-readable.
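The split between stderr and stdout is what keeps piped output parseable. The following sketch shows the general pattern rather than semduck's implementation; the `[ask]` prefix, the function names, and the placeholder row are all invented for illustration.

```python
import io
import json
import sys


def report_stage(stage: str, stream=None) -> None:
    """Write a one-line progress update; defaults to stderr."""
    target = stream if stream is not None else sys.stderr
    print(f"[ask] {stage}", file=target)


def emit_result(rows: list, stream=None) -> None:
    """Write the final result as JSON; defaults to stdout."""
    target = stream if stream is not None else sys.stdout
    print(json.dumps(rows), file=target)


def run_demo(out, err) -> None:
    """Emit a few stage updates and one placeholder result row."""
    for stage in ("planning", "compile", "execution"):
        report_stage(stage, err)
    emit_result([{"customer_name": "example", "total_revenue": 0}], out)


if __name__ == "__main__":
    out, err = io.StringIO(), io.StringIO()
    run_demo(out, err)
    # Only the result reached the stdout stand-in, so it parses cleanly.
    print(json.loads(out.getvalue()))
```

Because progress lines never reach stdout, a consumer can pipe the command's output straight into a JSON parser.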
Start the semduck MCP server over stdio:
semduck mcp --db examples/dbt_example/jaffle_shop.duckdb
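Many MCP clients launch stdio servers from a JSON config keyed by `mcpServers`. The sketch below generates such an entry for the command above. The config shape is an assumption about the client, not something this README specifies, and `mcp_client_config` is a hypothetical helper; the repository's examples/mcp_client_config.json is the authoritative example.

```python
import json


def mcp_client_config(db_path: str, server_name: str = "semduck") -> dict:
    """Build a client config entry that starts `semduck mcp --db <db_path>`."""
    return {
        "mcpServers": {
            server_name: {
                "command": "semduck",
                "args": ["mcp", "--db", db_path],
            }
        }
    }


if __name__ == "__main__":
    config = mcp_client_config("examples/dbt_example/jaffle_shop.duckdb")
    print(json.dumps(config, indent=2))
```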
Learn More
The docs site covers:
- installation and quickstart
- YAML vs DDL authoring
- request language semantics
- CLI command reference
- Python API guidance
- dbt interface
- MCP server setup
- ask provider configuration
Python API
The package exposes both YAML and DDL loaders:
import duckdb
from semduck import compile_request_sql, init_registry, load_semantic_ddl, load_semantic_yaml
conn = duckdb.connect("demo.duckdb")
init_registry(conn)
load_semantic_yaml(conn, """
name: sample
tables:
  - name: orders
    base_table:
      table: orders
    dimensions:
      - name: region
        expr: region
    metrics:
      - name: order_count
        expr: count(order_id)
""")
load_semantic_ddl(conn, """
create semantic view replacement_sample as
table main.orders as orders
dimensions (
region as region
)
metrics (
count(order_id) as order_count
);
""")
sql = compile_request_sql(conn, "replacement_sample dimensions region metrics order_count")
print(sql)
Relevant API entry points:
- init_registry(conn)
- load_semantic_yaml(conn, yaml_text)
- load_semantic_ddl(conn, ddl_text)
- load_semantic_yaml_file(conn, path)
- load_semantic_ddl_file(conn, path)
- compile_request_sql(conn, request)
- execute_request(conn, request)
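To build intuition for what a request such as `replacement_sample dimensions region metrics order_count` asks for, it helps to see a hand-written grouped aggregate over the same columns. The sketch below uses the standard-library `sqlite3` module purely as a stand-in so it runs without DuckDB; the sample rows are invented, and the SQL that semduck actually generates may look different.

```python
import sqlite3

# In-memory stand-in for the orders table referenced by the sample view.
conn = sqlite3.connect(":memory:")
conn.execute("create table orders (order_id integer, region text)")
conn.executemany(
    "insert into orders values (?, ?)",
    [(1, "east"), (2, "east"), (3, "west")],
)

# One dimension (region) and one metric (order_count), written by hand.
hand_written_sql = """
select region, count(order_id) as order_count
from orders
group by region
order by region
"""
rows = conn.execute(hand_written_sql).fetchall()
print(rows)  # [('east', 2), ('west', 1)]
```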
dbt Boundary
semduck keeps dbt-specific behavior in the dbt-semduck package.
- Supported in dbt: inline semantic DDL compiled by dbt and then loaded into semduck
- Not supported in dbt: YAML specs containing unresolved ref(...) or source(...)
The design note for that boundary lives in _project/decisions/remove_yaml_in_dbt_support.md.
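A quick pre-flight check can catch YAML specs that still carry dbt references before they are handed to the standalone loader. This is a hypothetical helper, not part of semduck or dbt-semduck, and the regular expression is a rough heuristic rather than a Jinja parser.

```python
import re

# Matches dbt-style ref(...) or source(...) calls, with or without {{ }}.
UNRESOLVED_DBT_CALL = re.compile(r"\b(ref|source)\s*\(")


def find_unresolved_dbt_calls(yaml_text: str) -> list[str]:
    """Return the stripped lines that still contain ref(...) or source(...)."""
    return [
        line.strip()
        for line in yaml_text.splitlines()
        if UNRESOLVED_DBT_CALL.search(line)
    ]
```

An empty result suggests the spec is plain YAML; any hits point at lines that would need to be resolved to concrete table names first.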
Repo Examples
- In-memory Python quickstart: examples/quickstart.py
- Query an existing database from Python: examples/query_existing_db.py
- Query an existing database from the CLI: examples/query_existing_db_cli.sh
- Use ask_question(...) from Python: examples/ask_existing_db.py
- Register an Ollama provider config and use semduck ask: examples/ask_existing_db_cli.sh
- Example Ollama provider config: examples/ask_ollama_config.yaml
- Example OpenAI-compatible local provider config: examples/ask_openai_compatible_config.yaml
- Start the MCP server over stdio: examples/mcp_server_stdio.sh
- Example MCP client config: examples/mcp_client_config.json
- MCP startup and client connection guide: examples/mcp_connection_guide.md
- End-to-end dbt example: examples/dbt_example
Development
uv sync
uv run --group dev python -m pytest
uv run tox
File details
Details for the file semduck-0.1.1.tar.gz.
File metadata
- Download URL: semduck-0.1.1.tar.gz
- Upload date:
- Size: 74.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2d8bc622b72ae82d6f4e253ebf61d8e85e401772229d475713c67d3044ceb7a8 |
| MD5 | f61bd97dcd628064ceedc15cf3a6ca6f |
| BLAKE2b-256 | 9927786cffe3f42d196c298ac3bc8f8f657a052ad3442b3ff1ee02bd73576abe |
Provenance
The following attestation bundles were made for semduck-0.1.1.tar.gz:
Publisher: pypi-publish.yml on carlsonjosh/semduck
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: semduck-0.1.1.tar.gz
  - Subject digest: 2d8bc622b72ae82d6f4e253ebf61d8e85e401772229d475713c67d3044ceb7a8
- Sigstore transparency entry: 1449943159
- Permalink: carlsonjosh/semduck@881c5fb61bdb9370b9fd0cd3c303846ddea22107
- Branch / Tag: refs/heads/main
- Owner: https://github.com/carlsonjosh
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@881c5fb61bdb9370b9fd0cd3c303846ddea22107
- Trigger Event: workflow_dispatch
File details
Details for the file semduck-0.1.1-py3-none-any.whl.
File metadata
- Download URL: semduck-0.1.1-py3-none-any.whl
- Upload date:
- Size: 70.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 66402ca64d8100757c23859ec662dda3f50f6691e021f0c010b1fa49ebbe34eb |
| MD5 | 94721d0cc27da40f9faa242b6c5f83ac |
| BLAKE2b-256 | 01d584f34d25362d7e0c509329f8e9804e10971e243d8598683656d40ce9f5a6 |
Provenance
The following attestation bundles were made for semduck-0.1.1-py3-none-any.whl:
Publisher: pypi-publish.yml on carlsonjosh/semduck
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: semduck-0.1.1-py3-none-any.whl
  - Subject digest: 66402ca64d8100757c23859ec662dda3f50f6691e021f0c010b1fa49ebbe34eb
- Sigstore transparency entry: 1449943881
- Permalink: carlsonjosh/semduck@881c5fb61bdb9370b9fd0cd3c303846ddea22107
- Branch / Tag: refs/heads/main
- Owner: https://github.com/carlsonjosh
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@881c5fb61bdb9370b9fd0cd3c303846ddea22107
- Trigger Event: workflow_dispatch