Skip to main content

DVT — cross-engine data transformation tool with DuckDB federation.

Project description

DVT Logo

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

PyPI PyPI Adapters Python Discord License


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

Path When How
Pushdown Source and target on same engine SQL runs directly on the database via adapter — identical to dbt
Extraction Sources on different engines Sling extracts → DuckDB joins → Sling loads to target

The user never thinks about this — DVT decides the path automatically.


Supported Engines

13 engines in one package (dvt-adapters):

Engine Type Engine Type
🐘 PostgreSQL OLTP ❄️ Snowflake Cloud DW
🐬 MySQL OLTP 🧱 Databricks Cloud DW
🦭 MariaDB OLTP 🔷 BigQuery Cloud DW
🟥 SQL Server OLTP 🟧 Redshift Cloud DW
🔴 Oracle OLTP 🦆 DuckDB Embedded
Spark Distributed 🔵 Fabric Cloud DW
MySQL 5 Legacy

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

Path You Write Runs On
Pushdown Target's native SQL (Snowflake SQL, T-SQL, etc.) Target database
Extraction DuckDB SQL (Postgres-like) Local DuckDB cache

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

Command Description
dvt run Execute models against targets
dvt run --full-refresh Rebuild everything from scratch
dvt run --select +model_name Run model and all ancestors
dvt build Seeds + models + snapshots + tests in DAG order
dvt seed Load CSVs via Sling (10-100x faster than dbt)
dvt test Run data tests
dvt compile Compile SQL without executing

DVT-Specific

Command Description
dvt sync Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
dvt debug Test all connections with clean status output
dvt show --select model Query locally via DuckDB (no target needed)
dvt retract Drop models from targets in reverse DAG order
dvt retract --select +model Drop a model and its entire upstream chain
dvt clean Remove build artifacts + DuckDB cache

Documentation

Command Description
dvt docs generate Cross-engine catalog with engine-colored lineage
dvt docs serve Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

Project Role in DVT License
dbt-core DAG orchestration, SQL models, Jinja, testing, docs, adapters Apache 2.0
Sling High-performance data movement across 30+ connectors (free tier) Apache 2.0
DuckDB Local analytics engine — extraction compute, caching, dvt show MIT

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dvt_core-0.1.49.tar.gz (18.9 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dvt_core-0.1.49-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (82.5 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

dvt_core-0.1.49-cp312-cp312-macosx_11_0_arm64.whl (28.6 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

dvt_core-0.1.49-cp312-cp312-macosx_10_13_x86_64.whl (29.0 MB view details)

Uploaded CPython 3.12macOS 10.13+ x86-64

dvt_core-0.1.49-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (82.0 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

dvt_core-0.1.49-cp311-cp311-macosx_11_0_arm64.whl (28.7 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

dvt_core-0.1.49-cp311-cp311-macosx_10_9_x86_64.whl (29.4 MB view details)

Uploaded CPython 3.11macOS 10.9+ x86-64

dvt_core-0.1.49-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (78.7 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

dvt_core-0.1.49-cp310-cp310-macosx_11_0_arm64.whl (28.8 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

dvt_core-0.1.49-cp310-cp310-macosx_10_9_x86_64.whl (29.4 MB view details)

Uploaded CPython 3.10macOS 10.9+ x86-64

File details

Details for the file dvt_core-0.1.49.tar.gz.

File metadata

  • Download URL: dvt_core-0.1.49.tar.gz
  • Upload date:
  • Size: 18.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dvt_core-0.1.49.tar.gz
Algorithm Hash digest
SHA256 05273e60476540f1c1309cb2910f7710ce0b78ab6fe0f80f647234407f52389e
MD5 87e087a9d74faf09c6b5f04d2e519dfc
BLAKE2b-256 8f0a86d21833e157e8cc531b2e188ffedacfb5ea1e2ba8df5632f46fb5eec0dc

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 554553d97d70b86b22e0047b2665798449ed355c6a125937856d6c3af264d6c1
MD5 fe3d75f2f832845c5ee5270b713f17e4
BLAKE2b-256 834a34eb3824ad8ba1f47e35a2a1e629551e3bc174bf9e75482966261e7ac120

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 42d04a305725326301b0f59bb8192b82ceb3297316fb1d6423614f953cf580e6
MD5 94b94bb46f1df436635484de9e543b03
BLAKE2b-256 860305f036acd66b0762bb6d9f687a467b9b2fb7c7e622f0e127452f201d3051

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 f6dc5f09e3753a5d8a675ee1343beebe6ba60b10d6698c539f16c175f280ff1a
MD5 cf079c9d47c1c1e07bba5d90f9c07f91
BLAKE2b-256 1735f36a785dea7ab1c0e4f55e8135b68ba793e1fa836858ae45e398f56027c1

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 79e58d4bbc67bbef24d973f1eee905f8aabcd36e3ac588afe16543fd0ed138d8
MD5 480480e5543f941c27f94d741ddd2a32
BLAKE2b-256 a94c4ad2ec75cb148d48cb898e5427eb157930abbaa30e489687a214eff1703b

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1ec697d523286abd5f12a88f7767594788416a6a69876e4d154b599eaf5fc89b
MD5 2771a29edc961c4a4c6859bd78813a20
BLAKE2b-256 34a127cdaeb0adaff8b9324552e24e41b3a3062eaf48c35d4523304abace6ade

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp311-cp311-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 666411d719fa6eb4af317bdf9ef218202dbd1e8f15f40bd72639a0c423dc7031
MD5 115c855ffe7583dca5ff1ce46bc82a25
BLAKE2b-256 54cd0657a3a4d8dfff6d68b9c050b68b2f1a3a035f38d774180b035baa812474

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 55c205665cea82afd8896284a275f9b70ee291b63dc9e0d61ea504ce70fc0965
MD5 92b52b06291663baef4ea391c16c50b1
BLAKE2b-256 8b300f0815cd4b4b9563ec9ecc033175ada2394b2a2d2efbc0111326efabd27f

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 820452447b23c1765a7876f6bfd401b289856685da4a66e35409103c99b0d866
MD5 eacd0e7863bbdaa9c04ad7708e85130d
BLAKE2b-256 cbbb971e03a5ef150512b8803970cbad8bfa6a71a523d01ba00176876f03a15a

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.49-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.49-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 b66ab931519ab141cc483682bf9d81e6f30c120a2c69c13e71a2bad206b05261
MD5 4a837993ce6a0485f36309e2317b27db
BLAKE2b-256 92945d8e1b2c3cacabccc4075a143dbfe804f603eb767998db8b97ddebb66d72

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page