Skip to main content

DVT — cross-engine data transformation tool with DuckDB federation.

Project description

DVT Logo

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

PyPI PyPI Adapters Python Discord License


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

Path When How
Pushdown Source and target on same engine SQL runs directly on the database via adapter — identical to dbt
Extraction Sources on different engines Sling extracts → DuckDB joins → Sling loads to target

The user never thinks about this — DVT decides the path automatically.


Supported Engines

13 engines in one package (dvt-adapters):

Engine Type Engine Type
🐘 PostgreSQL OLTP ❄️ Snowflake Cloud DW
🐬 MySQL OLTP 🧱 Databricks Cloud DW
🦭 MariaDB OLTP 🔷 BigQuery Cloud DW
🟥 SQL Server OLTP 🟧 Redshift Cloud DW
🔴 Oracle OLTP 🦆 DuckDB Embedded
Spark Distributed 🔵 Fabric Cloud DW
MySQL 5 Legacy

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

Path You Write Runs On
Pushdown Target's native SQL (Snowflake SQL, T-SQL, etc.) Target database
Extraction DuckDB SQL (Postgres-like) Local DuckDB cache

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

Command Description
dvt run Execute models against targets
dvt run --full-refresh Rebuild everything from scratch
dvt run --select +model_name Run model and all ancestors
dvt build Seeds + models + snapshots + tests in DAG order
dvt seed Load CSVs via Sling (10-100x faster than dbt)
dvt test Run data tests
dvt compile Compile SQL without executing

DVT-Specific

Command Description
dvt sync Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
dvt debug Test all connections with clean status output
dvt show --select model Query locally via DuckDB (no target needed)
dvt retract Drop models from targets in reverse DAG order
dvt retract --select +model Drop a model and its entire upstream chain
dvt clean Remove build artifacts + DuckDB cache

Documentation

Command Description
dvt docs generate Cross-engine catalog with engine-colored lineage
dvt docs serve Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

Project Role in DVT License
dbt-core DAG orchestration, SQL models, Jinja, testing, docs, adapters Apache 2.0
Sling High-performance data movement across 30+ connectors (free tier) Apache 2.0
DuckDB Local analytics engine — extraction compute, caching, dvt show MIT

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dvt_core-0.1.50.tar.gz (19.0 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dvt_core-0.1.50-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (83.6 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

dvt_core-0.1.50-cp312-cp312-macosx_11_0_arm64.whl (28.9 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

dvt_core-0.1.50-cp312-cp312-macosx_10_13_x86_64.whl (29.3 MB view details)

Uploaded CPython 3.12macOS 10.13+ x86-64

dvt_core-0.1.50-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (83.0 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

dvt_core-0.1.50-cp311-cp311-macosx_11_0_arm64.whl (29.0 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

dvt_core-0.1.50-cp311-cp311-macosx_10_9_x86_64.whl (29.7 MB view details)

Uploaded CPython 3.11macOS 10.9+ x86-64

dvt_core-0.1.50-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (79.6 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

dvt_core-0.1.50-cp310-cp310-macosx_11_0_arm64.whl (29.1 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

dvt_core-0.1.50-cp310-cp310-macosx_10_9_x86_64.whl (29.7 MB view details)

Uploaded CPython 3.10macOS 10.9+ x86-64

File details

Details for the file dvt_core-0.1.50.tar.gz.

File metadata

  • Download URL: dvt_core-0.1.50.tar.gz
  • Upload date:
  • Size: 19.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dvt_core-0.1.50.tar.gz
Algorithm Hash digest
SHA256 2240e90258d39f8286904cbe63ebbeb25130818c0871e40a38d07d7a7bcf2a0e
MD5 b06ffc174a6266c340835ec13987bae9
BLAKE2b-256 a61eb415afa234bc23ee3cb293faf924a06b3706666979a6caf99037c56bb8a9

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 79cd2b4452513c399bb128995e9463c557d7291584b558bf61b1778ff735386a
MD5 789f13eef31d84153d97a93c0815acda
BLAKE2b-256 a5372acf133aa08acff2efc018dfd7b1bd42741ae470d695216d8656f9564f36

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 fb481dba8aeb691235e008a98e50bd976c9f265e1c7cc48dd4582c76a7ecd91f
MD5 265387fde0d19170fe18416de4b97e8f
BLAKE2b-256 dcad19256cbdd5dea0fa2e16b0e623fbc4ef31bdfd958969dbd45ec922de8293

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 49b6cefd1a381360f8ea7454186ef752383f1eb8eb125d4878955ad8b91a087f
MD5 78630fcd8edc29dbb6ccbd54a79727df
BLAKE2b-256 d97962f155f7f88146e384551c14127bfd4e8f56243375d817196d57b80522e2

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 5f6840261000095645b220469a108673249a4f5d7e66ea93c8a3a5c1889553da
MD5 9587db9b0905e86def046dd752a8806a
BLAKE2b-256 7296223ed35990a468fd0730bef624b2a7caaa9cdf7cca7d424fe93a1f23fa09

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 21691763847e998a24ab09277b959eadfca0e78a34f050236b8e236439355c48
MD5 cb4132069e2f5bebf19c3e9d844af9ab
BLAKE2b-256 f2edef72fdb64821661685ea0e56f70477ba6882a06bf7e782d9cdb75afa5e0f

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp311-cp311-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 d290b878291d664184578340d8b84f10d3ae5e51021e4193e596ca50cefb73d8
MD5 9531f6080708a75a65f68ead21a0ef2a
BLAKE2b-256 aaba9a211637e0a0bf64284431fbe3ae081c137c891229c3966c5c2b9cd1d62b

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9f84e07f5678373c74e507759505f17fb86ba90f3d6219396de54b0a07a1cc31
MD5 5a1c3ba69c012b21eb2191a694c3a233
BLAKE2b-256 679ee166850f4e58839c881c700e7e218bdf7473f1ba1d82c4e646d0ed357a91

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 252eb895ff839426a70487aa59aac06ccff5228751df50cb49fbc2a56ab584ae
MD5 77a93bbf2b9aca26243670398d833f2f
BLAKE2b-256 19c9b4557d3ae48eefc5b7f40b4d49546b9694acd77028e5bfb4675c9594079f

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.50-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.50-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 82783ebd3736938d4800e8c02d221b5c28aacaacf5aa5910b11d8300cd466de1
MD5 df56b4a7e8ad2f58ff6ed849e89945af
BLAKE2b-256 71b1f6ec5524e15a7d18faf59c57b63ea1c087c2f8f3c92d64bb0c166d6083bf

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page