
DuckDB semantic view loader and query compiler


semduck

Portable semantic view runtime for DuckDB, implemented in Python.

Full documentation lives on the GitHub Pages docs site for this repository. This README stays focused on the package-facing quickstart.

Supported baseline:

  • Python 3.11 to 3.13
  • DuckDB 1.4+
  • dbt-duckdb 1.9.x for the optional dbt plugin path
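
A preflight check against the baseline above could look like the sketch below. The helper names are hypothetical (they are not part of semduck), and the sketch assumes DuckDB reports a dotted version string such as `1.4.0`.

```python
import re
import sys


def python_supported(version_info=sys.version_info):
    """True when the interpreter is within the documented 3.11 to 3.13 range."""
    return (3, 11) <= tuple(version_info[:2]) <= (3, 13)


def duckdb_supported(version):
    """True when a dotted DuckDB version string is 1.4 or newer."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= (1, 4)
```

With DuckDB installed, `duckdb_supported(duckdb.__version__)` performs the second check against the running library.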

Install

pip install semduck

Install the dbt support extra only if you are registering the DuckDB plugin in a dbt-duckdb project:

pip install "semduck[dbt]"

Supported Patterns

  • Standalone Python package and CLI:
    • YAML semantic specs
    • semantic DDL
  • dbt interface through dbt-semduck:
    • inline semantic DDL only

YAML is supported for standalone Python and CLI usage. The dbt interface deliberately uses DDL instead of YAML-in-dbt so semduck stays dbt-agnostic.

Quickstart

Start by initializing a registry. This creates a semantic schema and several supporting tables in the DuckDB database you provide.

semduck init --db demo.duckdb

Load a semantic definition from YAML, then compile a request:

semduck load --db demo.duckdb --file orders_semantic.yaml
semduck compile --db demo.duckdb --request "orders_semantic dimensions region metrics total_revenue"

Load a semantic definition from DDL, then run a request:

semduck load --db demo.duckdb --format ddl --file path/to/orders_semantic.sql
semduck query --db demo.duckdb --request "orders_semantic dimensions region metrics total_revenue"
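
When driving these commands from Python, passing the request as a single list element avoids shell quoting problems. The wrapper below is a minimal sketch, not part of semduck: the function names are hypothetical, it uses only the `--db` and `--request` flags shown above, and it assumes the CLI exits with a non-zero status on failure.

```python
import subprocess


def semduck_argv(command, db, request):
    """Build an argument list for `semduck compile` or `semduck query`."""
    if command not in {"compile", "query"}:
        raise ValueError(f"unsupported command: {command!r}")
    # The request stays one argv element, so its spaces need no quoting.
    return ["semduck", command, "--db", db, "--request", request]


def run_semduck(command, db, request):
    """Run the CLI and return its stdout (raises CalledProcessError on failure)."""
    completed = subprocess.run(
        semduck_argv(command, db, request),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

For example, `run_semduck("compile", "demo.duckdb", "orders_semantic dimensions region metrics total_revenue")` mirrors the compile command from the YAML example.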

The CLI accepts --format auto|yaml|ddl for check and load. In auto mode it uses the file extension or the first non-empty line to infer the format.
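
The auto mode described above can be pictured with the following sketch. It is an illustration of the idea only, not semduck's actual implementation: the suffix sets and the "first line starts with `create`" heuristic are assumptions.

```python
from pathlib import Path

YAML_SUFFIXES = {".yaml", ".yml"}  # assumed mapping
DDL_SUFFIXES = {".sql"}            # assumed mapping


def infer_format(path, text):
    """Guess 'yaml' or 'ddl' from the file extension, then from the content."""
    suffix = Path(path).suffix.lower()
    if suffix in YAML_SUFFIXES:
        return "yaml"
    if suffix in DDL_SUFFIXES:
        return "ddl"
    # Fall back to the first non-empty line of the file.
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            return "ddl" if stripped.lower().startswith("create ") else "yaml"
    raise ValueError("cannot infer a format from an empty file")
```

Passing `--format yaml` or `--format ddl` explicitly sidesteps any inference when a file has an unusual extension.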

Configure an Ollama-backed ask provider and run the ask CLI:

semduck ask \
  --db examples/dbt_example/jaffle_shop.duckdb \
  --config packages/semduck/examples/ask_ollama_config.yaml \
  --llm-log-dir .semduck/llm-logs \
  --question "What is total revenue by customer name?" \
  --sql --table

Configure an OpenAI-compatible local server and run the ask CLI:

semduck ask \
  --db examples/dbt_example/jaffle_shop.duckdb \
  --config packages/semduck/examples/ask_openai_compatible_config.yaml \
  --llm-log-dir .semduck/llm-logs \
  --question "What is total revenue by customer name?" \
  --sql --table

Set llm.log_dir in the config file to enable persistent trace logging by default, or use --llm-log-dir to override it per run. Pass --no-llm-log to disable logging even when the config file specifies a directory.
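
The precedence between the three settings can be summarized as a small function. This is a sketch of the rules as described above; the function and parameter names are hypothetical, and treating `--no-llm-log` as winning over `--llm-log-dir` when both are passed is an assumption.

```python
def resolve_log_dir(config_log_dir=None, cli_log_dir=None, no_llm_log=False):
    """Return the directory to log LLM traces to, or None when logging is off."""
    if no_llm_log:
        # --no-llm-log disables logging even if the config names a directory.
        return None
    if cli_log_dir is not None:
        # --llm-log-dir overrides llm.log_dir for this run.
        return cli_log_dir
    # Otherwise fall back to llm.log_dir from the config file (may be None).
    return config_log_dir
```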

semduck ask uses a planner stage to build the semantic request and, when --summary is requested, a separate summary stage to explain the executed rows. Advanced users can configure different models for those jobs with llm.tasks.ask_plan and llm.tasks.ask_summary. If task-specific config is used, the summary task only needs to be configured when summary output is requested.
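
The rule about which tasks need configuration can be sketched as a check over a parsed config. This is illustrative only: it assumes the config file has been parsed into a nested dict mirroring the dotted paths `llm.tasks.ask_plan` and `llm.tasks.ask_summary`, treats each task entry as opaque, and uses hypothetical function names.

```python
def required_tasks(summary_requested):
    """Tasks that must be configured for one run of `semduck ask`."""
    tasks = ["ask_plan"]
    if summary_requested:
        # The summary stage only runs when --summary is requested.
        tasks.append("ask_summary")
    return tasks


def missing_tasks(config, summary_requested):
    """List required task names that are absent from llm.tasks in the config."""
    configured = config.get("llm", {}).get("tasks", {})
    return [name for name in required_tasks(summary_requested) if name not in configured]
```

A config that defines only `ask_plan` therefore passes the check for a plain run and reports `ask_summary` as missing once a summary is requested.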

During semduck ask, the CLI prints one-line stage updates (planning, compilation, execution, and summarization) to stderr. Final results remain on stdout, so JSON output stays machine-readable.
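
Because progress and results travel on separate streams, a caller can parse stdout directly and treat stderr as a log. The sketch below demonstrates the pattern with a stand-in child process rather than semduck itself, so it runs without a database or model. The stage names and the JSON payload are placeholders, not semduck's real output.

```python
import json
import subprocess
import sys

# Stand-in child: progress lines go to stderr, the JSON result goes to stdout.
CHILD = (
    "import json, sys\n"
    "print('planning', file=sys.stderr)\n"
    "print('execution', file=sys.stderr)\n"
    "print(json.dumps({'rows': [['Ada', 42]]}))\n"
)


def run_and_parse(argv):
    """Run a command, returning (parsed stdout JSON, list of stderr lines)."""
    completed = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(completed.stdout), completed.stderr.splitlines()


result, progress = run_and_parse([sys.executable, "-c", CHILD])
```

The same `run_and_parse` helper applies to a `semduck ask` invocation, provided the run is configured to emit JSON on stdout.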

Start the semduck MCP server over stdio:

semduck mcp --db examples/dbt_example/jaffle_shop.duckdb
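
MCP clients talk to a stdio server by writing newline-delimited JSON-RPC 2.0 messages to its stdin, beginning with an `initialize` request. The sketch below builds that first frame. It reflects the general MCP protocol rather than anything specific to semduck: the helper name is hypothetical, and the protocol revision string is an assumption, since this README does not state which revisions the server accepts.

```python
import json


def initialize_frame(client_name, client_version, request_id=1):
    """Serialize an MCP `initialize` request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed revision
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }
    # json.dumps without indent emits no embedded newlines, as stdio framing requires.
    return json.dumps(message) + "\n"
```

In practice an MCP client library or host application sends this frame for you; the sketch only shows what crosses the pipe.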

Learn More

The docs site covers:

  • installation and quickstart
  • YAML vs DDL authoring
  • request language semantics
  • CLI command reference
  • Python API guidance
  • dbt interface
  • MCP server setup
  • ask provider configuration

Python API

The package exposes both YAML and DDL loaders:

import duckdb
from semduck import compile_request_sql, init_registry, load_semantic_ddl, load_semantic_yaml

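# Assumes demo.duckdb already contains an orders table with region and order_id columns.
conn = duckdb.connect("demo.duckdb")
# Create the registry before loading any definitions.
init_registry(conn)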

load_semantic_yaml(conn, """
name: sample
tables:
  - name: orders
    base_table:
      table: orders
    dimensions:
      - name: region
        expr: region
    metrics:
      - name: order_count
        expr: count(order_id)
""")

load_semantic_ddl(conn, """
create semantic view replacement_sample as
table main.orders as orders
  dimensions (
    region as region
  )
  metrics (
    count(order_id) as order_count
  );
""")

sql = compile_request_sql(conn, "replacement_sample dimensions region metrics order_count")
print(sql)

Relevant API entry points:

  • init_registry(conn)
  • load_semantic_yaml(conn, yaml_text)
  • load_semantic_ddl(conn, ddl_text)
  • load_semantic_yaml_file(conn, path)
  • load_semantic_ddl_file(conn, path)
  • compile_request_sql(conn, request)
  • execute_request(conn, request)

dbt Boundary

semduck keeps dbt-specific behavior in the dbt-semduck package.

  • Supported in dbt: inline semantic DDL compiled by dbt and then loaded into semduck
  • Not supported in dbt: YAML specs containing unresolved ref(...) or source(...)
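
A YAML spec can be screened for leftover dbt calls before it is loaded. The check below is a rough sketch using a regular expression; it is not semduck's validation logic, and the function name is hypothetical.

```python
import re

# Matches ref( or source( as standalone names, e.g. inside {{ ref('orders') }}.
UNRESOLVED_CALL = re.compile(r"\b(ref|source)\s*\(")


def unresolved_dbt_calls(yaml_text):
    """Return (line_number, function_name) for each ref(...) or source(...) found."""
    hits = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        for match in UNRESOLVED_CALL.finditer(line):
            hits.append((number, match.group(1)))
    return hits
```

An empty result means the text contains no `ref(...)` or `source(...)` calls, so it can be treated as a plain standalone spec.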

The design note for that boundary lives in _project/decisions/remove_yaml_in_dbt_support.md.

Repo Examples

Development

uv sync
uv run --group dev python -m pytest
uv run tox
