Declarative ETL framework for YAML-driven data pipelines
dpone
dpone is a Python ETL framework for declarative, YAML-driven data pipelines. It helps data teams describe sources, sinks, load strategies, dependencies, conventions, and operational checks as reusable configuration instead of one-off scripts.
The public package name, import name, GitHub repository name, and CLI name are all intentionally short: dpone.
Repository: https://github.com/PaulKov/dpone
Public install:
python -m pip install dpone
dpone --help
Documentation map
Start here if you are evaluating or operating dpone:
- Documentation index
- CLI reference
- Running pipelines from CLI and Python
- Connector overview
- Source -> sink matrix
- Manual integration matrix
- CI/CD
- Testing runbooks
- Runtime observability
- Object storage staging
- Supply-chain evidence
- Release evidence
- Load strategies
- Nested normalization
- Load lineage
- Type mapping matrix
- Schema evolution
- Production readiness
- Architecture
What dpone gives you
- YAML manifests for single-process and batch ETL definitions.
- Built-in DAG/dependency inspection for pipeline debugging.
- Runtime abstractions for sources, sinks, connectors, state, reconciliation, and safe SQL logging.
- Optional integrations for PostgreSQL, MSSQL/SQL Server, ClickHouse, BigQuery/GCS, Kafka, pandas, Google Ads, and HashiCorp Vault.
- A CLI designed for self-service validation, rendering, explainability, and documentation checks.
- Compatibility shims for older import paths while the canonical package layout continues to stabilize.
Installation
Install the core package from PyPI:
pip install dpone
dpone --version
dpone -v
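The same check can be made from Python, which is convenient in notebooks or CI steps where shelling out is awkward. This is a minimal sketch that only uses the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of dpone.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dpone")
    if found is None:
        print("dpone is not installed in this environment")
    else:
        print(f"dpone {found}")
```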
Install common extras for local ETL development:
pip install "dpone[postgres,mssql,clickhouse,kafka,gcp,s3,azure,pandas,vault]"
Install everything currently published by the project:
pip install "dpone[full]"
With uv:
uv add dpone
uv add "dpone[full]"
Optional extras
| Extra | Purpose |
|---|---|
| postgres | PostgreSQL connectivity via psycopg |
| mssql | Microsoft SQL Server connectivity via pyodbc; production bulk paths use external ODBC Driver 18 and bcp |
| clickhouse | ClickHouse connectivity |
| gcp | Google BigQuery and Google Cloud Storage support |
| s3 | AWS S3 object storage staging support via boto3 |
| azure | Azure Blob Storage staging support via azure-storage-blob |
| object_storage | S3, GCS, and Azure Blob object storage staging support |
| kafka | Kafka batch source/sink support via confluent-kafka, Schema Registry codecs, Avro, JSON Schema, and Protobuf helpers |
| pandas | DataFrame-based extract/load helpers |
| vault | HashiCorp Vault integration via public vault-kv-client |
| google_ads | Google Ads API support |
| full | All public extras above |
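To see which optional dependencies are present in the current environment, one option is to probe for their import names without importing them. The sketch below is not a dpone feature. The mapping from extra to import name is an assumption based on the libraries named in the table, and extras whose underlying package is not named there are left out.

```python
from importlib.util import find_spec

# Assumed import names for the libraries named in the extras table.
EXTRA_MODULES = {
    "postgres": "psycopg",
    "mssql": "pyodbc",
    "s3": "boto3",
    "azure": "azure.storage.blob",
    "kafka": "confluent_kafka",
    "pandas": "pandas",
    "vault": "vault_kv_client",
}


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located without importing it fully."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError for dotted names.
        return False


def available_extras(mapping=None) -> dict:
    """Map each extra name to whether its main dependency is importable."""
    mapping = EXTRA_MODULES if mapping is None else mapping
    return {extra: is_importable(module) for extra, module in mapping.items()}


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra:10s} {'ok' if present else 'missing'}")
```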
Vault support uses the public vault-kv-client package from PyPI. New code should import vault_kv_client; the historical vault_client import path remains supported by that package as a compatibility layer.
Credentials can come from environment variables, Airflow Connections, HashiCorp Vault-compatible KV, or inline params for smoke tests. See Connections and credentials for copy-paste examples for Postgres, MSSQL, ClickHouse, BigQuery, Kafka, and REST API.
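As an illustration of the environment-variable route, the sketch below collects PostgreSQL credentials into a small dataclass and fails fast when something is missing. The `DEMO_PG_` prefix, the variable names, and the `PostgresCredentials` class are all hypothetical; they are not the names dpone reads, which are documented in Connections and credentials.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PostgresCredentials:
    host: str
    port: int
    user: str
    password: str
    database: str


def load_postgres_credentials(env=None, prefix="DEMO_PG_"):
    """Build credentials from environment-style variables (hypothetical names)."""
    env = os.environ if env is None else env
    required = ("HOST", "USER", "PASSWORD", "DATABASE")
    missing = [prefix + key for key in required if not env.get(prefix + key)]
    if missing:
        # Report every missing variable at once rather than one per run.
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return PostgresCredentials(
        host=env[prefix + "HOST"],
        port=int(env.get(prefix + "PORT", "5432")),  # default PostgreSQL port
        user=env[prefix + "USER"],
        password=env[prefix + "PASSWORD"],
        database=env[prefix + "DATABASE"],
    )
```

Passing a plain dict as `env` keeps the function easy to test without touching the real process environment.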
Quick start
Create a batch manifest, for example examples/batch/landing_postgres_to_bq.batch.yaml:
# yaml-language-server: $schema=../../src/dpone/schema/etl-batch-manifest.schema.json
kind: dpone.batch.v1
convention: landing_raw_v1
registry: ../registry/sources.yaml
vars:
  src_system: demo_source
  src_database: demo_db
  owner_team: data-platform
  owner_contact: data-platform@example.com
  sla: daily
defaults:
  source:
    type: postgres
    connection_type: vault
    connection_id: postgres-demo
    vault_path: postgres/demo-source
    options:
      batch_size: 100000
      export_format: csv
  sink:
    type: bigquery
    connection_type: vault
    connection_id: bigquery-demo
    vault_path: gcp/demo-project-prod/bq/service-account
    staging:
      schema: stg
  strategy:
    mode: full_refresh
    overwrite_type: exchange
schemas:
  public:
    tables:
      - core_city
Validate and render it:
dpone manifest validate examples/batch/landing_postgres_to_bq.batch.yaml \
--profile landing_raw_v1 \
--registry examples/registry/sources.yaml
dpone manifest render examples/batch/landing_postgres_to_bq.batch.yaml \
--selector public.core_city \
--registry examples/registry/sources.yaml
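When validation runs from a Python-based CI step, the command above can be assembled as an argument list and passed to `subprocess`. This is a sketch under assumptions: it uses only the `--profile` and `--registry` flags shown above, the helper names are hypothetical, and it assumes the CLI signals a failed validation with a non-zero exit code, which the text above does not state.

```python
import subprocess
from typing import List, Optional


def validate_command(
    manifest: str,
    profile: Optional[str] = None,
    registry: Optional[str] = None,
) -> List[str]:
    """Build the argv for `dpone manifest validate` with optional flags."""
    cmd = ["dpone", "manifest", "validate", manifest]
    if profile:
        cmd += ["--profile", profile]
    if registry:
        cmd += ["--registry", registry]
    return cmd


def run_validation(manifest: str, **flags) -> int:
    """Run the validation and return the process exit code."""
    # An argument list (no shell=True) avoids quoting problems in paths.
    completed = subprocess.run(validate_command(manifest, **flags), check=False)
    return completed.returncode


if __name__ == "__main__":
    code = run_validation(
        "examples/batch/landing_postgres_to_bq.batch.yaml",
        profile="landing_raw_v1",
        registry="examples/registry/sources.yaml",
    )
    raise SystemExit(code)
```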
Inspect pipeline dependencies:
dpone dag report examples/batch/landing_postgres_to_bq.batch.yaml \
--base-path . \
--format json \
--preset ci \
--registry examples/registry/sources.yaml
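For readers new to dependency graphs, the sketch below shows the general idea behind inspecting task dependencies: given a mapping from each task to the tasks it depends on, a topological sort yields a valid execution order and exposes cycles. It uses the standard library's `graphlib` and is not dpone's implementation. Apart from `public.core_city`, the task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Return tasks ordered so that every task follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


# Each key depends on the tasks in its value set (hypothetical example graph).
example = {
    "public.core_city": set(),
    "public.core_street": {"public.core_city"},
    "public.core_address": {"public.core_street"},
}

if __name__ == "__main__":
    print(execution_order(example))
    try:
        execution_order({"a": {"b"}, "b": {"a"}})
    except CycleError as exc:
        # A cycle means no valid order exists; the error names the tasks involved.
        print("cycle detected:", exc.args[1])
```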
CLI overview
dpone --help
dpone manifest --help
dpone dag --help
dpone docs --help
Common commands:
dpone manifest list examples/batch/landing_postgres_to_bq.batch.yaml
dpone manifest validate examples/batch/landing_postgres_to_bq.batch.yaml --recursive
dpone manifest render examples/batch/landing_postgres_to_bq.batch.yaml --selector public.core_city
dpone manifest explain examples/batch/landing_postgres_to_bq.batch.yaml --selector public.core_city --why sink.table.schema
dpone dag list-edges examples/batch/landing_postgres_to_bq.batch.yaml --with-groups --with-refs
dpone dag explain-node examples/batch/landing_postgres_to_bq.batch.yaml --task public.core_city
dpone dag report examples/batch/landing_postgres_to_bq.batch.yaml --preset ci --format md
Repository layout
src/dpone/    Python package source code
docs/         User and developer documentation
examples/     Public example manifests and registries
tests/        Unit and integration tests
tools/        Local smoke and release helper scripts
Canonical imports live under:
- dpone.manifest.*
- dpone.dag.*
- dpone.runtime.*
- dpone.contracts.*
- dpone.ports.*
- dpone.adapters.*
Legacy paths such as dpone.core.*, dpone.lib.*, dpone.source.*, and dpone.sink.* are compatibility shims. Prefer canonical imports for new code.
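To find code that still relies on the legacy paths, a small static scan can help. The sketch below parses Python source with the standard library's `ast` module and reports imports under the four legacy prefixes listed above. It is a hypothetical helper, not a dpone tool, and it only sees static `import` statements, not dynamic imports.

```python
import ast

LEGACY_PREFIXES = ("dpone.core", "dpone.lib", "dpone.source", "dpone.sink")


def _is_legacy(module: str) -> bool:
    """True if the dotted name is a legacy package or lives inside one."""
    return any(module == p or module.startswith(p + ".") for p in LEGACY_PREFIXES)


def find_legacy_imports(source: str):
    """Return sorted (line number, module) pairs for legacy dpone imports."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if _is_legacy(alias.name):
                    hits.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Also catch `from dpone import sink`, where the legacy part is the name.
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
            legacy = next((n for n in names if _is_legacy(n)), None)
            if legacy is not None:
                hits.append((node.lineno, legacy))
    return sorted(hits)
```

Calling `find_legacy_imports(path.read_text())` for each file under a project directory gives a quick list of imports to migrate.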
Local development
uv sync --all-extras
uv run ruff check .
uv run ruff format --check .
uv run mypy --config-file mypy.ini
uv run pytest -m "not integration_live"
Build package artifacts:
uv build
Run the package smoke script from an installed environment:
python tools/package_smoke.py --project-root . --dpone-cmd dpone
CI and releases
The OSS repository uses GitHub Actions as the primary automation path. See CI/CD for the workflow map, detailed runbooks, artifacts, and developer guidance.
Key workflows:
- .github/workflows/ci.yml runs linting, formatting checks, type checks, tests, coverage, package build, and PostgreSQL XMin integration.
- .github/workflows/pages.yml builds and deploys the GitHub Pages documentation site from master.
- .github/workflows/release.yml builds and publishes tagged releases to PyPI.
- .github/workflows/integration-matrix.yml and .github/workflows/connector-certification.yml provide manual/scheduled production-confidence gates.
Release tags use the format vX.Y.Z, for example:
git tag -a vX.Y.Z -m "Release vX.Y.Z"
git push origin vX.Y.Z
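A tag can be checked against the vX.Y.Z format before it is pushed. The sketch below is a hypothetical pre-push helper, not part of the repository's tooling. It assumes three non-negative integer components without leading zeros and no pre-release or build suffixes, which is stricter than the text above spells out.

```python
import re

# v<major>.<minor>.<patch>, each component a non-negative integer without leading zeros.
TAG_PATTERN = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_release_tag(tag: str):
    """Return (major, minor, patch) for a valid tag, or raise ValueError."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a vX.Y.Z release tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    for candidate in ("v0.5.0", "0.5.0", "v1.2"):
        try:
            print(candidate, "->", parse_release_tag(candidate))
        except ValueError as exc:
            print(candidate, "-> rejected:", exc)
```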
Prefer PyPI Trusted Publishing for releases. Token-based publishing should only be used as a fallback with short-lived, scoped tokens.
Security
Never commit API tokens, PyPI tokens, GitHub tokens, Vault credentials, service-account JSON, or live vendor credentials. If a secret is ever pasted into an issue, chat, commit, or CI log, revoke it before publishing or pushing public history.
See Security policy for the vulnerability reporting process.
License
dpone is licensed under the Apache License 2.0. See LICENSE.