DVT — cross-engine data transformation tool with DuckDB federation.
Project description
DVT — Data Virtualization Tool
Connect every database. Transform across engines. Materialize anywhere.
DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.
No custom connectors. No complex config. Just SQL.
How It Works
DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:
flowchart LR
subgraph Sources
PG[(PostgreSQL)]
MY[(MySQL)]
SF[(Snowflake)]
OR[(Oracle)]
end
subgraph DVT["DVT Engine"]
direction TB
SLING1[/"Sling Extract"/]
DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
SQL["Model SQL<br/>(joins, transforms)"]
SLING2[/"Sling Load"/]
SLING1 --> DUCK --> SQL --> SLING2
end
subgraph Targets
TGT1[(Snowflake)]
TGT2[(Databricks)]
TGT3[(PostgreSQL)]
end
PG --> SLING1
MY --> SLING1
SF --> SLING1
OR --> SLING1
SLING2 --> TGT1
SLING2 --> TGT2
SLING2 --> TGT3
style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
style DUCK fill:#FFF000,stroke:#333,color:#333
style SLING1 fill:#0094b3,stroke:#333,color:#fff
style SLING2 fill:#0094b3,stroke:#333,color:#fff
style SQL fill:#29B5E8,stroke:#333,color:#fff
Two Execution Paths
| Path | When | How |
|---|---|---|
| Pushdown | Source and target on same engine | SQL runs directly on the database via adapter — identical to dbt |
| Extraction | Sources on different engines | Sling extracts → DuckDB joins → Sling loads to target |
The user never thinks about this — DVT decides the path automatically.
Supported Engines
13 engines in one package (dvt-adapters):
| Engine | Type | Engine | Type | ||
|---|---|---|---|---|---|
| 🐘 | PostgreSQL | OLTP | ❄️ | Snowflake | Cloud DW |
| 🐬 | MySQL | OLTP | 🧱 | Databricks | Cloud DW |
| 🦭 | MariaDB | OLTP | 🔷 | BigQuery | Cloud DW |
| 🟥 | SQL Server | OLTP | 🟧 | Redshift | Cloud DW |
| 🔴 | Oracle | OLTP | 🦆 | DuckDB | Embedded |
| ⚡ | Spark | Distributed | 🔵 | Fabric | Cloud DW |
| MySQL 5 | Legacy |
Any source → Any target. DVT handles the data movement.
Installation
pip install dvt-ce
Or with uv (recommended):
uv add dvt-ce
This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.
Then bootstrap your environment:
dvt sync # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs
Quick Start
dvt init my_project && cd my_project # Scaffold project
dvt sync # Install everything
dvt debug # Test all connections
dvt seed # Load CSV seed data
dvt run # Run all models
dvt docs generate && dvt docs serve # Engine-colored lineage docs
Configuration
Connections (~/.dvt/profiles.yml)
my_project:
target: pg_dev
outputs:
pg_dev:
type: postgres
host: localhost
port: 5432
user: analyst
password: secret
dbname: warehouse
schema: public
sf_prod:
type: snowflake
account: my-account
user: loader
password: secret
database: ANALYTICS
schema: PUBLIC
warehouse: COMPUTE_WH
mysql_crm:
type: mysql
host: mysql.example.com
port: 3306
user: reader
password: secret
database: crm
Sources (models/sources.yml)
The connection: field maps sources to their engine:
sources:
- name: app_db # On default target (no connection: needed)
schema: public
tables:
- name: users
- name: orders
- name: crm # On MySQL
connection: mysql_crm
schema: crm
tables:
- name: customers
- name: marketing # On Snowflake
connection: sf_prod
schema: PUBLIC
tables:
- name: campaigns
Cross-Engine Model
-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}
SELECT
u.user_id,
u.email,
c.customer_name,
m.campaign_name
FROM {{ source('app_db', 'users') }} u -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c -- MySQL
ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
ON u.user_id = m.user_id
DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.
Incremental Models
{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}
SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
DVT reads the watermark from the target, extracts only new rows, appends them.
Two Dialects, One Project
| Path | You Write | Runs On |
|---|---|---|
| Pushdown | Target's native SQL (Snowflake SQL, T-SQL, etc.) | Target database |
| Extraction | DuckDB SQL (Postgres-like) | Local DuckDB cache |
Both coexist naturally. The dialect is determined by the execution path, not config.
Commands
Core
| Command | Description |
|---|---|
dvt run |
Execute models against targets |
dvt run --full-refresh |
Rebuild everything from scratch |
dvt run --select +model_name |
Run model and all ancestors |
dvt build |
Seeds + models + snapshots + tests in DAG order |
dvt seed |
Load CSVs via Sling (10-100x faster than dbt) |
dvt test |
Run data tests |
dvt compile |
Compile SQL without executing |
DVT-Specific
| Command | Description |
|---|---|
dvt sync |
Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs) |
dvt debug |
Test all connections with clean status output |
dvt show --select model |
Query locally via DuckDB (no target needed) |
dvt retract |
Drop models from targets in reverse DAG order |
dvt retract --select +model |
Drop a model and its entire upstream chain |
dvt clean |
Remove build artifacts + DuckDB cache |
Documentation
| Command | Description |
|---|---|
dvt docs generate |
Cross-engine catalog with engine-colored lineage |
dvt docs serve |
Serve documentation website |
The docs UI features:
- Engine-colored nodes (each database has its brand color)
- Connection badges on every source and model
- Native column types from each engine
- Target and engine info in detail panels
DuckDB Cache
DVT maintains a persistent cache at .dvt/cache.duckdb:
- Source tables:
{source}__{table}— shared across models, reused between runs - Model results:
__model__{name}— for incremental{{ this }}references dvt run --full-refreshrebuilds the cachedvt cleandeletes.dvt/entirely
--target Philosophy
--target switches environments, not engines:
dvt run --target dev_snowflake # Dev Snowflake
dvt run --target prod_snowflake # Prod Snowflake ← Same engine, different env
Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.
dbt Compatibility
All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.
Community
- Discord: Join the DVT community
- Issues: Report a bug
Links
| PyPI | dvt-ce · dvt-adapters |
| GitHub | dvt-ce · dvt-adapters |
Built On
DVT stands on the shoulders of three exceptional open-source projects:
| Project | Role in DVT | License |
|---|---|---|
| dbt-core | DAG orchestration, SQL models, Jinja, testing, docs, adapters | Apache 2.0 |
| Sling | High-performance data movement across 30+ connectors (free tier) | Apache 2.0 |
| DuckDB | Local analytics engine — extraction compute, caching, dvt show |
MIT |
We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.
License
DVT is licensed under the Apache License 2.0.
Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.
Built by Hesham Badawi — data engineer, for data engineers.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file dvt_ce-0.1.40.tar.gz.
File metadata
- Download URL: dvt_ce-0.1.40.tar.gz
- Upload date:
- Size: 19.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
924025e0de51effa49680a93f024fa2046b9e071e2076fddc518b2de634e78d8
|
|
| MD5 |
24eee4985c7608758acfd8470852322c
|
|
| BLAKE2b-256 |
4c9404c762a0c0c9ac18d5543a06302a19b0092cb44047cda9925f26f8f903c1
|
File details
Details for the file dvt_ce-0.1.40-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 82.7 MB
- Tags: CPython 3.12, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
722eec315dc5f7c932762fd86f3850284d0f19bdbb4f02291027f2a5ad6b617e
|
|
| MD5 |
4de2e6d1f63a668c6b3e8ebdd312c94f
|
|
| BLAKE2b-256 |
f34e34fd372b1ef2ae108662a355ea37f0e5240cb620a4a730938054d11cb3f2
|
File details
Details for the file dvt_ce-0.1.40-cp312-cp312-macosx_11_0_arm64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp312-cp312-macosx_11_0_arm64.whl
- Upload date:
- Size: 28.8 MB
- Tags: CPython 3.12, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
bf532ecadff90c63f9b2dce998fdf782a12badb3e400256a9db0012fe27adce1
|
|
| MD5 |
64ccb56e2db3f157d9f475c52c791a63
|
|
| BLAKE2b-256 |
ac317f592922ca96c8b208f1550c93625f7b1a4ea74f5a67bf4c6fd0df10545c
|
File details
Details for the file dvt_ce-0.1.40-cp312-cp312-macosx_10_13_x86_64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp312-cp312-macosx_10_13_x86_64.whl
- Upload date:
- Size: 29.2 MB
- Tags: CPython 3.12, macOS 10.13+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c6ba5edfb53126366a4e5cfa65bcfbc7f76b3d7584cfdcce93f62bcb91f8945c
|
|
| MD5 |
c3304961d2ef5c918721fe2572222f36
|
|
| BLAKE2b-256 |
2ed4e7e21aedbd84d5bec0fe00d2caadda2337c8ed5eea231168558e80da244e
|
File details
Details for the file dvt_ce-0.1.40-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 82.1 MB
- Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c07471c01ee40a716720da26ae8b9cab736cac86359333b2cc3b0506c7f9edad
|
|
| MD5 |
4692a33593608d9b82d1e6b379fffdd9
|
|
| BLAKE2b-256 |
931fea56e13b203fa54d635ceaa2c8daeccaee9ef9a6ca36261c6efb2acdcf7a
|
File details
Details for the file dvt_ce-0.1.40-cp311-cp311-macosx_11_0_arm64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 28.9 MB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
cfc4029444374c1fec18fd60b054c516cc828ed49c24fb412e5b937551874eb8
|
|
| MD5 |
b8e391b7d79c012bda23c11ab09eaf2c
|
|
| BLAKE2b-256 |
507213eb9e360f0df7367ef05f8d125bcea76ff3b5e3566279908b58157186da
|
File details
Details for the file dvt_ce-0.1.40-cp311-cp311-macosx_10_9_x86_64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp311-cp311-macosx_10_9_x86_64.whl
- Upload date:
- Size: 29.5 MB
- Tags: CPython 3.11, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
654f266e9ccf2ed20263b885dd19586af888c900a9e26bcf42d69d4efaffa447
|
|
| MD5 |
66a61ccd7eb57dacddce8591cfb7663d
|
|
| BLAKE2b-256 |
83e57761d46bf557025be465578a59c366bb9640900f22659c4a28005a609262
|
File details
Details for the file dvt_ce-0.1.40-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 78.9 MB
- Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5e93fcaff42fd0fc3ec214555c5a6753480b7e3a388b9cb8a9db61830cb9c996
|
|
| MD5 |
ba071112b4bfd5bfb75d585d1b094ba7
|
|
| BLAKE2b-256 |
f44d1c649645a65a46d79d4bfea446bf34bff88151c16241a8537af5916a808b
|
File details
Details for the file dvt_ce-0.1.40-cp310-cp310-macosx_11_0_arm64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp310-cp310-macosx_11_0_arm64.whl
- Upload date:
- Size: 28.9 MB
- Tags: CPython 3.10, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
149d9210cd06d4a9c604f1d01ac95bc8e05e0b63b67a30d58cd2cc516baaf83f
|
|
| MD5 |
0b59b6f36263a91e04e6644faa97038f
|
|
| BLAKE2b-256 |
5466f967e1583d65d6e1b000e52890ce89fd68e240744dc6b36c4cceca9e6035
|
File details
Details for the file dvt_ce-0.1.40-cp310-cp310-macosx_10_9_x86_64.whl.
File metadata
- Download URL: dvt_ce-0.1.40-cp310-cp310-macosx_10_9_x86_64.whl
- Upload date:
- Size: 29.6 MB
- Tags: CPython 3.10, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5f4f318b87359c96b87bc8d80214041635be3b71a10c6011e71c625de738b76e
|
|
| MD5 |
cf90489d6d34c3127ebb0f112e330ddf
|
|
| BLAKE2b-256 |
3ac5ce34596f564ec7e80edca95af20da6f9e2bf0e53747848cd1d3c12e1a977
|