DuckDB-native runtime for building reproducible warehouse datasets
DBPort
Governance and orchestration for recomputable warehouse datasets.
You build models that produce datasets — and those datasets depend on each other. When external sources update, you need to recompute downstream models in the right order, knowing exactly which input versions went into each output. As the number of models grows, keeping track of dependencies, provenance, and data quality becomes harder than the modeling itself.
DBPort is the orchestration layer on top of your warehouse that builds governance into recomputable workflows. It tracks dependencies between your models and on external inputs, so you can build with confidence that future updates will be picked up correctly and that other models can consume your results.
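To make "recompute in the right order" concrete, here is a minimal sketch using only the Python standard library. It is not DBPort's implementation: the dependency map is written by hand, the downstream model `wifor.emp__regional_forecast` is a made-up name, and `recompute_order` and `downstream_of` are hypothetical helpers.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each dataset maps to the inputs it reads.
# "wifor.emp__regional_forecast" is an invented downstream model.
dependencies = {
    "wifor.emp__regional_trends": {"estat.nama_10r_3empers"},
    "wifor.emp__regional_forecast": {"wifor.emp__regional_trends"},
}


def recompute_order(deps: dict[str, set[str]]) -> list[str]:
    """Return every dataset in an order where inputs precede their consumers."""
    return list(TopologicalSorter(deps).static_order())


def downstream_of(changed: str, deps: dict[str, set[str]]) -> list[str]:
    """Return the models affected by a change to `changed`, in recompute order."""
    affected = {changed}
    ordered = []
    for node in recompute_order(deps):
        # A node is affected if any of its inputs is already affected.
        if deps.get(node, set()) & affected:
            affected.add(node)
            ordered.append(node)
    return ordered


print(downstream_of("estat.nama_10r_3empers", dependencies))
```

When the external source changes, the sketch lists the directly dependent model first and the model that consumes its output second, which is the order a recompute would have to follow.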
Quickstart
pip install dbport
# Initialize a project
dbp init regional_trends --agency wifor --dataset emp__regional_trends
cd regional_trends
# Configure schema, inputs, and columns
dbp config model wifor.emp__regional_trends schema sql/create_output.sql
dbp config model wifor.emp__regional_trends input estat.nama_10r_3empers
# Run the full lifecycle: load inputs → execute model → publish output
dbp model run --version 2026-03-09 --timing
For programmatic control, the same workflow in Python:
from dbport import DBPort

with DBPort(agency="wifor", dataset_id="emp__regional_trends") as port:
    port.schema("sql/create_output.sql")
    port.load("estat.nama_10r_3empers", filters={"wstatus": "EMP"})
    port.execute("sql/transform.sql")
    port.publish(version="2026-03-09", params={"wstatus": "EMP"})
Why DBPort
- Dependency tracking — models produce datasets that feed other models. DBPort tracks these dependencies so you always know what depends on what across your organisation.
- Input provenance — every publish records exactly which input versions and snapshots were used. Trace any output back to the data that produced it.
- Recompute on change — snapshot-cached inputs detect when external sources update. Unchanged tables are skipped — only what's new gets reprocessed.
- Schema drift detection — declare the output shape upfront. Drift is caught before anything is written to the warehouse, not after.
- Versioned, resumable publishes — each publish records version, parameters, and row count. Interrupted runs resume from checkpoint. Re-running a completed version is a safe no-op.
- Committable state — dbport.lock is TOML, credential-free, and safe to commit. It tracks schema, inputs, and version history for code review and CI.
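The "recompute on change" idea from the list above can be illustrated with a small comparison of snapshot identifiers. This is a sketch under assumptions, not how DBPort stores or compares snapshots: `inputs_to_reload` is a hypothetical helper, the snapshot IDs are invented, and `estat.example_table` is a made-up second input.

```python
def inputs_to_reload(recorded: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return inputs whose current snapshot differs from the recorded one.

    Both arguments map a table name to a snapshot identifier. An input with
    no recorded snapshot yet counts as changed.
    """
    return sorted(
        table
        for table, snapshot in current.items()
        if recorded.get(table) != snapshot
    )


# Hypothetical snapshot identifiers, for illustration only.
recorded = {"estat.nama_10r_3empers": "snap-001", "estat.example_table": "snap-010"}
current = {"estat.nama_10r_3empers": "snap-002", "estat.example_table": "snap-010"}

print(inputs_to_reload(recorded, current))
```

Only the table whose snapshot moved is returned; the unchanged one is skipped, which mirrors the behaviour described in the bullet.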
Configuration
DBPort reads credentials from environment variables:
export ICEBERG_REST_URI=https://catalog.example.com
export ICEBERG_CATALOG_TOKEN=your-token
export ICEBERG_WAREHOUSE=your-warehouse
See the credentials guide for all options.
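Before a run, it can help to confirm that the variables are actually set. The check below is a minimal sketch: it assumes all three variables shown above are required, which this README does not state, and `missing_credentials` is a hypothetical helper rather than part of DBPort.

```python
import os

# Variable names taken from the export lines above; treating all three as
# required is an assumption.
REQUIRED_VARS = ("ICEBERG_REST_URI", "ICEBERG_CATALOG_TOKEN", "ICEBERG_WAREHOUSE")


def missing_credentials(env=None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All credential variables are set.")
```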
Documentation
Full docs at knifflig.github.io/dbport
- About DBPort — why it exists and who it's for
- Getting Started — installation, credentials, first run
- Concepts — inputs, outputs, metadata, lock file, hooks, versioning
- CLI Reference — dbp command reference
- Python API — DBPort class reference
- Examples — complete CLI and Python workflows
Contributing
See CONTRIBUTING.md for development setup and guidelines.
License
Apache License 2.0 — see LICENSE.
File details
Details for the file dbport-0.1.0.tar.gz.
File metadata
- Download URL: dbport-0.1.0.tar.gz
- Upload date:
- Size: 70.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1f0557231f44b131991b80d1c62b57d819b3006be51b7c76b51829ee401777bc |
| MD5 | ffebd8c1c8e4a5b8fca7b1afaf082d06 |
| BLAKE2b-256 | 12fc3563738f1fc9d7b1acd28bffae32371173af1cebff3ab779b8ea49e5a9e0 |
Provenance
The following attestation bundles were made for dbport-0.1.0.tar.gz:
Publisher: release.yml on knifflig/dbport
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dbport-0.1.0.tar.gz
- Subject digest: 1f0557231f44b131991b80d1c62b57d819b3006be51b7c76b51829ee401777bc
- Sigstore transparency entry: 1123135220
- Sigstore integration time:
- Permalink: knifflig/dbport@b74a2a36924cd2645b2e39094323f5d4a0d09eac
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/knifflig
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b74a2a36924cd2645b2e39094323f5d4a0d09eac
- Trigger Event: push
File details
Details for the file dbport-0.1.0-py3-none-any.whl.
File metadata
- Download URL: dbport-0.1.0-py3-none-any.whl
- Upload date:
- Size: 92.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 464b68bd0a3112f8f53426a6d7735f305ca2eeffd745867819fd12106f7d7c7f |
| MD5 | d80fd39a8e8a18bc96e3124f9cf8a219 |
| BLAKE2b-256 | 73e15b536c55985ffa8d8b3066e052c887318bf6ae8bef5e519e0e2482b0b4a3 |
Provenance
The following attestation bundles were made for dbport-0.1.0-py3-none-any.whl:
Publisher: release.yml on knifflig/dbport
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dbport-0.1.0-py3-none-any.whl
- Subject digest: 464b68bd0a3112f8f53426a6d7735f305ca2eeffd745867819fd12106f7d7c7f
- Sigstore transparency entry: 1123135241
- Sigstore integration time:
- Permalink: knifflig/dbport@b74a2a36924cd2645b2e39094323f5d4a0d09eac
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/knifflig
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b74a2a36924cd2645b2e39094323f5d4a0d09eac
- Trigger Event: push