
DVT — cross-engine data transformation tool with DuckDB federation.


DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

```mermaid
flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff
```

Two Execution Paths

| Path | When | How |
| --- | --- | --- |
| Pushdown | Source and target on the same engine | SQL runs directly on the database via the adapter — identical to dbt |
| Extraction | Sources on different engines | Sling extracts → DuckDB joins → Sling loads to target |

The user never thinks about this — DVT decides the path automatically.
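The decision rule above can be sketched in a few lines. This is an illustrative sketch, not DVT's actual internals; the function and connection names are hypothetical:

```python
# Hypothetical sketch of DVT's path selection: a model can be pushed down
# only when every source lives on the same connection as the target.

def choose_path(source_connections: set[str], target_connection: str) -> str:
    """Return 'pushdown' or 'extraction' for a model."""
    if source_connections <= {target_connection}:
        return "pushdown"   # one engine: compile SQL straight to the target
    return "extraction"     # multiple engines: Sling -> DuckDB -> Sling

print(choose_path({"sf_prod"}, "sf_prod"))              # pushdown
print(choose_path({"pg_dev", "mysql_crm"}, "sf_prod"))  # extraction
```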


Supported Engines

13 engines in one package (dvt-adapters):

| Engine | Type | Engine | Type |
| --- | --- | --- | --- |
| 🐘 PostgreSQL | OLTP | ❄️ Snowflake | Cloud DW |
| 🐬 MySQL | OLTP | 🧱 Databricks | Cloud DW |
| 🦭 MariaDB | OLTP | 🔷 BigQuery | Cloud DW |
| 🟥 SQL Server | OLTP | 🟧 Redshift | Cloud DW |
| 🔴 Oracle | OLTP | 🦆 DuckDB | Embedded |
| Spark | Distributed | 🔵 Fabric | Cloud DW |
| MySQL 5 | Legacy | | |

Any source → Any target. DVT handles the data movement.


Installation

```bash
pip install dvt-ce
```

Or with uv (recommended):

```bash
uv add dvt-ce
```

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

```bash
dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs
```

Quick Start

```bash
dvt init my_project && cd my_project   # Scaffold project
dvt sync                               # Install everything
dvt debug                              # Test all connections
dvt seed                               # Load CSV seed data
dvt run                                # Run all models
dvt docs generate && dvt docs serve    # Engine-colored lineage docs
```

Configuration

Connections (`~/.dvt/profiles.yml`)

```yaml
my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm
```

Sources (`models/sources.yml`)

The `connection:` field maps each source to its engine:

```yaml
sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns
```

Cross-Engine Model

```sql
-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u             -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c       -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id
```

DVT detects the three engines, extracts the sources to DuckDB, executes the join there, and loads the result to Snowflake. You see standard dbt output.
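Conceptually, the extraction path reduces to a local join over cached tables. A minimal sketch of that idea, using stdlib `sqlite3` as a stand-in for the DuckDB cache (table names follow the `{source}__{table}` convention described under "DuckDB Cache"; the data is made up):

```python
import sqlite3

# Stand-in for .dvt/cache.duckdb: two "extracted" source tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE app_db__users (user_id INT, email TEXT);
    CREATE TABLE crm__customers (email TEXT, customer_name TEXT);
    INSERT INTO app_db__users VALUES (1, 'a@x.com'), (2, 'b@x.com');
    INSERT INTO crm__customers VALUES ('a@x.com', 'Acme');
""")

# The model SQL then runs locally against the cached copies.
rows = con.execute("""
    SELECT u.user_id, u.email, c.customer_name
    FROM app_db__users u
    LEFT JOIN crm__customers c ON u.email = c.email
    ORDER BY u.user_id
""").fetchall()
print(rows)  # [(1, 'a@x.com', 'Acme'), (2, 'b@x.com', None)]
```

In DVT the final step would be a Sling load of this result to the target rather than a local fetch.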

Incremental Models

```sql
{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
```

DVT reads the watermark from the target, extracts only the new rows, and appends them.
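The watermark logic amounts to a max-and-filter step. A minimal sketch with in-memory rows (illustrative only, not DVT's implementation):

```python
# Rows already in the target and rows available at the source.
target_rows = [("2024-01-01",), ("2024-01-05",)]
source_rows = [("2024-01-05",), ("2024-01-06",), ("2024-01-07",)]

# Watermark: MAX(order_date) on the target.
watermark = max(d for (d,) in target_rows)

# Extract only rows past the watermark, then append them.
new_rows = [r for r in source_rows if r[0] > watermark]
target_rows.extend(new_rows)
print(new_rows)  # [('2024-01-06',), ('2024-01-07',)]
```

Note that a strict `>` comparison skips rows sharing the watermark value, which is why append strategies typically key on a monotonically increasing column.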


Two Dialects, One Project

| Path | You Write | Runs On |
| --- | --- | --- |
| Pushdown | Target's native SQL (Snowflake SQL, T-SQL, etc.) | Target database |
| Extraction | DuckDB SQL (Postgres-like) | Local DuckDB cache |

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

| Command | Description |
| --- | --- |
| `dvt run` | Execute models against targets |
| `dvt run --full-refresh` | Rebuild everything from scratch |
| `dvt run --select +model_name` | Run a model and all its ancestors |
| `dvt build` | Seeds + models + snapshots + tests in DAG order |
| `dvt seed` | Load CSVs via Sling (10-100x faster than dbt) |
| `dvt test` | Run data tests |
| `dvt compile` | Compile SQL without executing |

DVT-Specific

| Command | Description |
| --- | --- |
| `dvt sync` | Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs) |
| `dvt debug` | Test all connections with clean status output |
| `dvt show --select model` | Query locally via DuckDB (no target needed) |
| `dvt retract` | Drop models from targets in reverse DAG order |
| `dvt retract --select +model` | Drop a model and its entire upstream chain |
| `dvt clean` | Remove build artifacts + DuckDB cache |
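The reverse-DAG ordering behind `dvt retract` can be illustrated with Python's standard `graphlib`. A sketch over a hypothetical three-model chain (not DVT's code): a model must be dropped before the models it depends on, the mirror image of build order.

```python
from graphlib import TopologicalSorter

# Hypothetical model DAG: each model maps to the models it depends on.
deps = {
    "stg_orders": set(),
    "int_orders": {"stg_orders"},
    "dim_orders": {"int_orders"},
}

build_order = list(TopologicalSorter(deps).static_order())  # deps first
drop_order = list(reversed(build_order))                    # downstream first
print(drop_order)  # ['dim_orders', 'int_orders', 'stg_orders']
```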

Documentation

| Command | Description |
| --- | --- |
| `dvt docs generate` | Cross-engine catalog with engine-colored lineage |
| `dvt docs serve` | Serve documentation website |

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at `.dvt/cache.duckdb`:

  • Source tables: `{source}__{table}` — shared across models, reused between runs
  • Model results: `__model__{name}` — for incremental `{{ this }}` references
  • `dvt run --full-refresh` rebuilds the cache
  • `dvt clean` deletes `.dvt/` entirely
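A toy illustration of that naming convention, using stdlib `sqlite3` in place of the cache file (the table names are hypothetical examples):

```python
import sqlite3

# Stand-in for .dvt/cache.duckdb with one source table and one model result.
cache = sqlite3.connect(":memory:")
cache.execute("CREATE TABLE app_db__orders (id INT)")
cache.execute("CREATE TABLE __model__dim_orders (id INT)")

# The prefix distinguishes cached source extracts from model results.
names = [r[0] for r in cache.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
sources = [n for n in names if not n.startswith("__model__")]
models = [n for n in names if n.startswith("__model__")]
print(sources, models)  # ['app_db__orders'] ['__model__dim_orders']
```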

--target Philosophy

--target switches environments, not engines:

```bash
dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  <- same engine, different env
```

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

| Project | Role in DVT | License |
| --- | --- | --- |
| dbt-core | DAG orchestration, SQL models, Jinja, testing, docs, adapters | Apache 2.0 |
| Sling | High-performance data movement across 30+ connectors (free tier) | Apache 2.0 |
| DuckDB | Local analytics engine — extraction compute, caching, `dvt show` | MIT |

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.
