Skip to main content

DVT — cross-engine data transformation tool with DuckDB federation.

Project description

DVT Logo

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

PyPI PyPI Adapters Python Discord License


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

Path When How
Pushdown Source and target on same engine SQL runs directly on the database via adapter — identical to dbt
Extraction Sources on different engines Sling extracts → DuckDB joins → Sling loads to target

The user never thinks about this — DVT decides the path automatically.


Supported Engines

13 engines in one package (dvt-adapters):

Engine Type Engine Type
🐘 PostgreSQL OLTP ❄️ Snowflake Cloud DW
🐬 MySQL OLTP 🧱 Databricks Cloud DW
🦭 MariaDB OLTP 🔷 BigQuery Cloud DW
🟥 SQL Server OLTP 🟧 Redshift Cloud DW
🔴 Oracle OLTP 🦆 DuckDB Embedded
Spark Distributed 🔵 Fabric Cloud DW
MySQL 5 Legacy

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

Path You Write Runs On
Pushdown Target's native SQL (Snowflake SQL, T-SQL, etc.) Target database
Extraction DuckDB SQL (Postgres-like) Local DuckDB cache

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

Command Description
dvt run Execute models against targets
dvt run --full-refresh Rebuild everything from scratch
dvt run --select +model_name Run model and all ancestors
dvt build Seeds + models + snapshots + tests in DAG order
dvt seed Load CSVs via Sling (10-100x faster than dbt)
dvt test Run data tests
dvt compile Compile SQL without executing

DVT-Specific

Command Description
dvt sync Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
dvt debug Test all connections with clean status output
dvt show --select model Query locally via DuckDB (no target needed)
dvt retract Drop models from targets in reverse DAG order
dvt retract --select +model Drop a model and its entire upstream chain
dvt clean Remove build artifacts + DuckDB cache

Documentation

Command Description
dvt docs generate Cross-engine catalog with engine-colored lineage
dvt docs serve Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

Project Role in DVT License
dbt-core DAG orchestration, SQL models, Jinja, testing, docs, adapters Apache 2.0
Sling High-performance data movement across 30+ connectors (free tier) Apache 2.0
DuckDB Local analytics engine — extraction compute, caching, dvt show MIT

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dvt_core-0.1.48.tar.gz (18.8 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dvt_core-0.1.48-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (82.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

dvt_core-0.1.48-cp312-cp312-macosx_11_0_arm64.whl (28.4 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

dvt_core-0.1.48-cp312-cp312-macosx_10_13_x86_64.whl (28.9 MB view details)

Uploaded CPython 3.12macOS 10.13+ x86-64

dvt_core-0.1.48-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (81.5 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

dvt_core-0.1.48-cp311-cp311-macosx_11_0_arm64.whl (28.5 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

dvt_core-0.1.48-cp311-cp311-macosx_10_9_x86_64.whl (29.2 MB view details)

Uploaded CPython 3.11macOS 10.9+ x86-64

dvt_core-0.1.48-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (78.2 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

dvt_core-0.1.48-cp310-cp310-macosx_11_0_arm64.whl (28.6 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

dvt_core-0.1.48-cp310-cp310-macosx_10_9_x86_64.whl (29.3 MB view details)

Uploaded CPython 3.10macOS 10.9+ x86-64

File details

Details for the file dvt_core-0.1.48.tar.gz.

File metadata

  • Download URL: dvt_core-0.1.48.tar.gz
  • Upload date:
  • Size: 18.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dvt_core-0.1.48.tar.gz
Algorithm Hash digest
SHA256 a4bbcd0ef01a7699c504e33458eb893ec6b6b393392dee9ce24d07d409d339e7
MD5 fc68c96914e9385f1576f5c1364d6514
BLAKE2b-256 1deea58cda773f6c8a9367024d4133abca0bd03737710393bfefa7ca2d4e63c8

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a3adfd129c9e272e201e445c0e3677f18a844ab531928da3048a5e18bb3a132f
MD5 1a2971c93cd583d391d825b8eb16f844
BLAKE2b-256 7afa27265611707e69649e66338c0f788d270be37d1a938a0cfe806f7f7b209e

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 49e2574a5c15c84ef097840278812cdf91a913bb18c4a04de1f5aa1c0ae61f64
MD5 8e014d7ef66e727eedd9c9d3c05a0273
BLAKE2b-256 0b2e9edac83fa760e000b071d3aed17d0fc94f57912600c4b7c07dacb0f8b60b

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 6af8543c7c89aef7f35021000f78e576ee65e8363d55733fe2fb96f8828d3eac
MD5 7587a50052d2901fa50a14c5d286425e
BLAKE2b-256 08f8f8048cf92f49eebbde00de79ad8a2ee9e3f200e57e65a79cd4ec00c4518b

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 afc9d9700dea2d1ea479d2c7fcf214a431be01469932050d99c574fd29a8075b
MD5 78802b6c4929568d67181e04e131c479
BLAKE2b-256 5e4b4a0c2752bf8fabac44a2309b8b5a993ca13a1e04a9c4f9a440646297135b

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 c61a2cc65db309f0a30c4ea7f765eb41c4dac797d760a39177e5f47cb90fcc21
MD5 8aacd4bd324b73bde5c0ae186661f66a
BLAKE2b-256 ff53237812c786ce5cb23603f776242ca75daafe131e94abd1030877d84e3e89

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp311-cp311-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 a3be10be87e868b14aecb1ca0bed763886435b8c0e29154ea6dd394f14d6ef9b
MD5 9d976dbb582122f861314c5b387788ef
BLAKE2b-256 a7aabe0fe532df24d8c8121de0075920cf8db7200dfd6b6b9bbe558d620f17cb

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e4c5344028586ac645b4f32bc46676cdc466125ce16b192b260f9e2102f3a459
MD5 736347bf676e1a5afa35d5ec57ad9f27
BLAKE2b-256 254d4856aa6e033f0d11107976ddbc73f42381f8b00e7ccd03601ac3314fff91

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9d4da17f500c92cfc9c91290e4b9abf3f76a188c81b626d24e8a2fbd71017bd2
MD5 52bbba52f9ec7a9c14362c50069e9f66
BLAKE2b-256 9ec4dc99d636a6b429bff56a15cdea764a30c2b1a094fbb1b0b84d8612679aeb

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.48-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.48-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 1b2fc942d6905528daf311794347e1081f1122be686f0890e817bb50d72638f3
MD5 e02f661f400ceb9df37b64a360d2063a
BLAKE2b-256 bbf6843b0daff1225330c7167b83d416f36ab3de6a3975851fab5160beedd7ed

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page