DVT — cross-engine data transformation tool with DuckDB federation.

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

  • Pushdown: source and target are on the same engine. SQL runs directly on the database via the adapter — identical to dbt.
  • Extraction: sources live on different engines. Sling extracts → DuckDB joins → Sling loads to the target.

The user never thinks about this — DVT decides the path automatically.
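The decision rule can be sketched in a few lines of Python (the function name and shape are illustrative, not DVT's actual internals):

```python
# Illustrative sketch of DVT's automatic path selection.
# "choose_path" is a hypothetical name; the real logic lives inside dvt-core.

def choose_path(source_connections: set, target_connection: str) -> str:
    """Pick 'pushdown' when every source lives on the target's connection,
    otherwise 'extraction' (Sling extract -> DuckDB join -> Sling load)."""
    if source_connections <= {target_connection}:
        return "pushdown"   # SQL runs directly on the target database
    return "extraction"     # cross-engine: extract, join locally, load

print(choose_path({"pg_dev"}, "pg_dev"))                # pushdown
print(choose_path({"pg_dev", "mysql_crm"}, "sf_prod"))  # extraction
```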


Supported Engines

13 engines in one package (dvt-adapters):

  • OLTP: 🐘 PostgreSQL, 🐬 MySQL, 🦭 MariaDB, 🟥 SQL Server, 🔴 Oracle
  • Cloud DW: ❄️ Snowflake, 🧱 Databricks, 🔷 BigQuery, 🟧 Redshift, 🔵 Fabric
  • Distributed: Spark
  • Embedded: 🦆 DuckDB
  • Legacy: MySQL 5

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the three engines, extracts the sources to DuckDB, executes the join there, and loads the result to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only the new rows, and appends them.
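Conceptually, the watermark pattern reduces to this small sketch (function and parameter names are hypothetical; DVT's actual extraction logic is internal):

```python
# Illustrative sketch of the incremental "watermark" extraction described above.

def incremental_extract_sql(source_table: str, watermark_col: str, watermark):
    """Build the extraction query: a full scan on the first run, filtered to
    rows newer than the target's current MAX(watermark) on later runs."""
    sql = f"SELECT * FROM {source_table}"
    if watermark is not None:  # corresponds to is_incremental() being true
        sql += f" WHERE {watermark_col} > '{watermark}'"
    return sql

print(incremental_extract_sql("orders", "order_date", None))
print(incremental_extract_sql("orders", "order_date", "2025-01-31"))
```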


Two Dialects, One Project

  • Pushdown: you write the target's native SQL (Snowflake SQL, T-SQL, etc.); it runs on the target database.
  • Extraction: you write DuckDB SQL (Postgres-like); it runs on the local DuckDB cache.

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

  • dvt run: Execute models against targets
  • dvt run --full-refresh: Rebuild everything from scratch
  • dvt run --select +model_name: Run a model and all of its ancestors
  • dvt build: Seeds + models + snapshots + tests in DAG order
  • dvt seed: Load CSVs via Sling (10-100x faster than dbt)
  • dvt test: Run data tests
  • dvt compile: Compile SQL without executing

DVT-Specific

  • dvt sync: Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
  • dvt debug: Test all connections with clean status output
  • dvt show --select model: Query locally via DuckDB (no target needed)
  • dvt retract: Drop models from targets in reverse DAG order
  • dvt retract --select +model: Drop a model and its entire upstream chain
  • dvt clean: Remove build artifacts + DuckDB cache

Documentation

  • dvt docs generate: Cross-engine catalog with engine-colored lineage
  • dvt docs serve: Serve the documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely
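The naming convention above can be expressed as two trivial helpers (illustrative only; they mirror the documented {source}__{table} and __model__{name} patterns rather than DVT's actual code):

```python
# Hypothetical helpers mirroring the cache-table naming convention
# documented for .dvt/cache.duckdb.

def source_cache_table(source: str, table: str) -> str:
    """Cache table for an extracted source: {source}__{table}."""
    return f"{source}__{table}"

def model_cache_table(model: str) -> str:
    """Cache table for a model result: __model__{name}."""
    return f"__model__{model}"

print(source_cache_table("crm", "customers"))       # crm__customers
print(model_cache_table("dim_customer_campaigns"))  # __model__dim_customer_campaigns
```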

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

  • dbt-core (Apache 2.0): DAG orchestration, SQL models, Jinja, testing, docs, adapters
  • Sling (Apache 2.0): High-performance data movement across 30+ connectors (free tier)
  • DuckDB (MIT): Local analytics engine — extraction compute, caching, dvt show

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — a data engineer, for data engineers.
