
DVT — cross-engine data transformation tool with DuckDB federation.



DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.



DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

  • Pushdown: source and target on the same engine. SQL runs directly on the database via the adapter, identical to dbt.
  • Extraction: sources on different engines. Sling extracts → DuckDB joins → Sling loads to the target.

The user never thinks about this — DVT decides the path automatically.
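The routing rule can be sketched in a few lines. This is an illustrative helper, not DVT's actual internals: a model qualifies for pushdown only when every source it references lives on the same connection as its target.

```python
def choose_path(source_connections: set[str], target_connection: str) -> str:
    """Pick the execution path for a model (sketch, not DVT's real code).

    Pushdown: every source lives on the target's connection, so the compiled
    SQL can run directly on that database. Extraction: at least one source
    lives elsewhere, so Sling extract + DuckDB join + Sling load take over.
    """
    if source_connections <= {target_connection}:
        return "pushdown"
    return "extraction"

# Same engine on both sides: behaves exactly like dbt.
print(choose_path({"sf_prod"}, "sf_prod"))               # pushdown
# Postgres + MySQL sources feeding Snowflake: cross-engine.
print(choose_path({"pg_dev", "mysql_crm"}, "sf_prod"))   # extraction
```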


Supported Engines

13 engines in one package (dvt-adapters):

  • OLTP: 🐘 PostgreSQL, 🐬 MySQL, 🦭 MariaDB, 🟥 SQL Server, 🔴 Oracle
  • Cloud DW: ❄️ Snowflake, 🧱 Databricks, 🔷 BigQuery, 🟧 Redshift, 🔵 Fabric
  • Distributed: Spark
  • Embedded: 🦆 DuckDB
  • Legacy: MySQL 5

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the three engines, extracts each source to DuckDB, executes the join locally, and loads the result to Snowflake. You see standard dbt output.
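On the extraction path, each source() reference has to resolve to its cache table in .dvt/cache.duckdb, named {source}__{table} (see the DuckDB Cache section). A toy sketch of that rewrite, not DVT's actual compiler:

```python
import re

def rewrite_sources(sql: str) -> str:
    """Toy sketch: replace {{ source('s', 't') }} references with DuckDB
    cache table names ({source}__{table}), as the extraction path requires.
    Illustrates the naming convention only; not DVT's real compiler."""
    pattern = re.compile(
        r"\{\{\s*source\(\s*'([^']+)'\s*,\s*'([^']+)'\s*\)\s*\}\}"
    )
    return pattern.sub(lambda m: f"{m.group(1)}__{m.group(2)}", sql)

sql = "SELECT * FROM {{ source('app_db', 'users') }} u"
print(rewrite_sources(sql))  # SELECT * FROM app_db__users u
```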

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.
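The watermark step amounts to building an extraction predicate before Sling runs. A hedged sketch with a hypothetical helper name, not DVT's real interface:

```python
def extraction_query(table: str, watermark_col: str, watermark) -> str:
    """Sketch of the incremental extraction SQL handed to the extractor:
    pull only rows newer than the target's current watermark. Hypothetical
    helper, not DVT's API; a real implementation would bind the value as a
    parameter rather than interpolate it into the string."""
    base = f"SELECT * FROM {table}"
    if watermark is None:  # first run: target is empty, take everything
        return base
    return f"{base} WHERE {watermark_col} > '{watermark}'"

print(extraction_query("public.orders", "order_date", None))
print(extraction_query("public.orders", "order_date", "2025-01-31"))
```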


Two Dialects, One Project

  • Pushdown: you write the target's native SQL (Snowflake SQL, T-SQL, etc.); it runs on the target database.
  • Extraction: you write DuckDB SQL (Postgres-like); it runs on the local DuckDB cache.

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

  • dvt run — Execute models against targets
  • dvt run --full-refresh — Rebuild everything from scratch
  • dvt run --select +model_name — Run a model and all its ancestors
  • dvt build — Seeds + models + snapshots + tests in DAG order
  • dvt seed — Load CSVs via Sling (10-100x faster than dbt)
  • dvt test — Run data tests
  • dvt compile — Compile SQL without executing

DVT-Specific

  • dvt sync — Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
  • dvt debug — Test all connections with clean status output
  • dvt show --select model — Query locally via DuckDB (no target needed)
  • dvt retract — Drop models from targets in reverse DAG order
  • dvt retract --select +model — Drop a model and its entire upstream chain
  • dvt clean — Remove build artifacts + DuckDB cache
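The reverse-DAG ordering of dvt retract matters because a model can only be dropped after everything built from it. With Python's stdlib graphlib, that order is a topological sort reversed; a hypothetical mini-DAG, not DVT code:

```python
from graphlib import TopologicalSorter

# Hypothetical mini-project: each model maps to the models it is built from.
deps = {
    "stg_orders": [],
    "stg_users": [],
    "dim_users": ["stg_users"],                # built from stg_users
    "fct_orders": ["stg_orders", "dim_users"], # built from both
}

build_order = list(TopologicalSorter(deps).static_order())
drop_order = list(reversed(build_order))  # retract: dependents first, then parents

print("build:", build_order)
print("drop: ", drop_order)
```

In the drop order, fct_orders always precedes dim_users and the staging models, so no drop ever breaks a still-existing dependent.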

Documentation

  • dvt docs generate — Cross-engine catalog with engine-colored lineage
  • dvt docs serve — Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

  • dbt-core — DAG orchestration, SQL models, Jinja, testing, docs, adapters (Apache 2.0)
  • Sling — High-performance data movement across 30+ connectors, free tier (Apache 2.0)
  • DuckDB — Local analytics engine for extraction compute, caching, and dvt show (MIT)

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — a data engineer, for data engineers.

Download files

Source distribution:

  • dvt_ce-0.1.47.tar.gz (18.8 MB)

Built distributions:

  • dvt_ce-0.1.47-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (82.0 MB) — CPython 3.12, manylinux glibc 2.17+, x86-64
  • dvt_ce-0.1.47-cp312-cp312-macosx_11_0_arm64.whl (28.4 MB) — CPython 3.12, macOS 11.0+, ARM64
  • dvt_ce-0.1.47-cp312-cp312-macosx_10_13_x86_64.whl (28.9 MB) — CPython 3.12, macOS 10.13+, x86-64
  • dvt_ce-0.1.47-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (81.5 MB) — CPython 3.11, manylinux glibc 2.17+, x86-64
  • dvt_ce-0.1.47-cp311-cp311-macosx_11_0_arm64.whl (28.5 MB) — CPython 3.11, macOS 11.0+, ARM64
  • dvt_ce-0.1.47-cp311-cp311-macosx_10_9_x86_64.whl (29.2 MB) — CPython 3.11, macOS 10.9+, x86-64
  • dvt_ce-0.1.47-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (78.2 MB) — CPython 3.10, manylinux glibc 2.17+, x86-64
  • dvt_ce-0.1.47-cp310-cp310-macosx_11_0_arm64.whl (28.6 MB) — CPython 3.10, macOS 11.0+, ARM64
  • dvt_ce-0.1.47-cp310-cp310-macosx_10_9_x86_64.whl (29.3 MB) — CPython 3.10, macOS 10.9+, x86-64

File details

Details for the file dvt_ce-0.1.47.tar.gz.

File metadata

  • Filename: dvt_ce-0.1.47.tar.gz
  • Size: 18.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

  • SHA256: c24be3cf65b64f435eee7baef2eaaeeda04f3392b040baa55a22c33a17b840e9
  • MD5: 0b5480d3a24090afc733fa3d3654f4f3
  • BLAKE2b-256: c2e80798c8abdac4c4e690742085b22d0ba6c239b1a43e654a5d88ab17a0374b

