DuckDB-native runtime for building reproducible warehouse datasets

Project description

DBPort

Python 3.11–3.12 | License: Apache 2.0

Governance and orchestration for recomputable warehouse datasets.

You build models that produce datasets — and those datasets depend on each other. When external sources update, you need to recompute downstream models in the right order, knowing exactly which input versions went into each output. As the number of models grows, keeping track of dependencies, provenance, and data quality becomes harder than the modeling itself.

DBPort is the orchestration layer on top of your warehouse that builds governance into recomputable workflows. It tracks dependencies between your models and on external inputs, so you can build with confidence that future updates will be picked up correctly and that other models can consume your results.
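
The ordering problem described above can be sketched with Python's standard-library graphlib. This is only an illustration of dependency-ordered recomputation, not DBPort's implementation: the `recompute_order` helper, the `DEPENDENCIES` map, and the dataset `wifor.emp__national_summary` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each dataset lists the inputs it is built from.
DEPENDENCIES = {
    "wifor.emp__regional_trends": {"estat.nama_10r_3empers"},
    "wifor.emp__national_summary": {"wifor.emp__regional_trends"},
}


def recompute_order(dependencies, changed):
    """Return the datasets downstream of `changed`, inputs before dependents."""
    # Grow the affected set until no further dataset depends on it.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for dataset, inputs in dependencies.items():
            if dataset not in affected and inputs & affected:
                affected.add(dataset)
                grew = True
    # static_order() yields each node only after all of its predecessors.
    ordered = TopologicalSorter(dependencies).static_order()
    return [name for name in ordered if name in affected and name != changed]


order = recompute_order(DEPENDENCIES, "estat.nama_10r_3empers")
print(order)
```

When the external source changes, every model that depends on it, directly or indirectly, is listed after the inputs it needs, which is the order in which the models would have to be rerun.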

Quickstart

pip install dbport

# Initialize a project
dbp init regional_trends --agency wifor --dataset emp__regional_trends
cd regional_trends

# Configure schema, inputs, and columns
dbp config model wifor.emp__regional_trends schema sql/create_output.sql
dbp config model wifor.emp__regional_trends input estat.nama_10r_3empers

# Run the full lifecycle: load inputs → execute model → publish output
dbp model run --version 2026-03-09 --timing

For programmatic control, the same workflow in Python:

from dbport import DBPort

with DBPort(agency="wifor", dataset_id="emp__regional_trends") as port:
    port.schema("sql/create_output.sql")
    port.load("estat.nama_10r_3empers", filters={"wstatus": "EMP"})
    port.execute("sql/transform.sql")
    port.publish(version="2026-03-09", params={"wstatus": "EMP"})

Why DBPort

  • Dependency tracking — models produce datasets that feed other models. DBPort tracks these dependencies so you always know what depends on what across your organisation.
  • Input provenance — every publish records exactly which input versions and snapshots were used. Trace any output back to the data that produced it.
  • Recompute on change — snapshot-cached inputs detect when external sources update. Unchanged tables are skipped — only what's new gets reprocessed.
  • Schema drift detection — declare the output shape upfront. Drift is caught before anything is written to the warehouse, not after.
  • Versioned, resumable publishes — each publish records version, parameters, and row count. Interrupted runs resume from checkpoint. Re-running a completed version is a safe no-op.
  • Committable state: dbport.lock is TOML, credential-free, and safe to commit. It tracks schema, inputs, and version history for code review and CI.
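
Two of the ideas in the list above, skipping unchanged inputs and catching schema drift before writing, can be illustrated in plain Python. This is a conceptual sketch under stated assumptions, not how DBPort implements them: the function names, snapshot IDs, the table `estat.other_table`, and the column names are all hypothetical.

```python
def inputs_to_reload(cached_snapshots, current_snapshots):
    """Return input tables whose current snapshot ID differs from the cached one."""
    return sorted(
        table
        for table, snapshot_id in current_snapshots.items()
        if cached_snapshots.get(table) != snapshot_id
    )


def schema_drift(declared, actual):
    """Map each mismatched column to (declared type, actual type); empty means no drift."""
    drift = {}
    for column in declared.keys() | actual.keys():
        expected, found = declared.get(column), actual.get(column)
        if expected != found:
            drift[column] = (expected, found)
    return drift


# Hypothetical snapshot IDs: only the first table has a new snapshot.
cached = {"estat.nama_10r_3empers": "snap-001", "estat.other_table": "snap-007"}
current = {"estat.nama_10r_3empers": "snap-002", "estat.other_table": "snap-007"}
stale = inputs_to_reload(cached, current)

# Hypothetical output shape: the model produced `year` as text instead of an integer.
declared = {"geo": "VARCHAR", "year": "INTEGER", "value": "DOUBLE"}
actual = {"geo": "VARCHAR", "year": "VARCHAR", "value": "DOUBLE"}
drift = schema_drift(declared, actual)

print(stale)
print(drift)
```

Running both checks before anything is written means a mismatch is reported while the warehouse is still untouched.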

Configuration

DBPort reads credentials from environment variables:

export ICEBERG_REST_URI=https://catalog.example.com
export ICEBERG_CATALOG_TOKEN=your-token
export ICEBERG_WAREHOUSE=your-warehouse

See the credentials guide for all options.
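
A quick pre-flight check can catch missing credentials before a run starts. The sketch below is not part of DBPort: the `missing_credentials` helper is hypothetical, and treating all three variables as required is an assumption, since the credentials guide may describe other options.

```python
import os

# Variable names taken from the exports above; assumed here to all be required.
REQUIRED_VARS = ("ICEBERG_REST_URI", "ICEBERG_CATALOG_TOKEN", "ICEBERG_WAREHOUSE")


def missing_credentials(env=None):
    """Return the required variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping: the token is absent and the warehouse is empty.
example_env = {"ICEBERG_REST_URI": "https://catalog.example.com", "ICEBERG_WAREHOUSE": ""}
missing = missing_credentials(example_env)
print(missing)
```

Calling `missing_credentials()` with no argument checks the real process environment instead of the example mapping.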

Documentation

Full docs at knifflig.github.io/dbport

Contributing

See CONTRIBUTING.md for development setup and guidelines.

License

Apache License 2.0 — see LICENSE.

