DVT — cross-engine data transformation tool with DuckDB federation.

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

| Path | When | How |
| --- | --- | --- |
| Pushdown | Source and target on same engine | SQL runs directly on the database via adapter, identical to dbt |
| Extraction | Sources on different engines | Sling extracts → DuckDB joins → Sling loads to target |

You never have to choose a path: DVT decides it automatically.
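The rule in the table can be sketched in a few lines of Python. This is only an illustration, not DVT's actual code: `choose_path` and its parameters are hypothetical names, and the sketch assumes that "same engine" can be approximated by comparing connection names from profiles.yml, with a source that declares no connection falling back to the profile's default target.

```python
from typing import Optional, Sequence


def choose_path(
    source_connections: Sequence[Optional[str]],
    model_target: str,
    default_target: str,
) -> str:
    """Return "pushdown" or "extraction" for one model (hypothetical helper).

    A source without an explicit connection (None) is assumed to live on
    the profile's default target.
    """
    resolved = [conn or default_target for conn in source_connections]
    # Pushdown only when every source sits on the connection the model writes to.
    if all(conn == model_target for conn in resolved):
        return "pushdown"
    return "extraction"


# Two sources on the default target, model written to the same target.
same_engine = choose_path([None, None], "pg_dev", "pg_dev")
# Sources on three different connections, model written to Snowflake.
cross_engine = choose_path([None, "mysql_crm", "sf_prod"], "sf_prod", "pg_dev")
```

Comparing connection names is the simplest reading of the table; a real implementation may need a finer rule, for example when two connections point at the same server.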


Supported Engines

13 engines in one package (dvt-adapters):

| Engine | Type | Engine | Type |
| --- | --- | --- | --- |
| 🐘 PostgreSQL | OLTP | ❄️ Snowflake | Cloud DW |
| 🐬 MySQL | OLTP | 🧱 Databricks | Cloud DW |
| 🦭 MariaDB | OLTP | 🔷 BigQuery | Cloud DW |
| 🟥 SQL Server | OLTP | 🟧 Redshift | Cloud DW |
| 🔴 Oracle | OLTP | 🦆 DuckDB | Embedded |
| Spark | Distributed | 🔵 Fabric | Cloud DW |
| MySQL 5 | Legacy | | |

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-core

Or with uv (recommended):

uv add dvt-core

This installs the Python side of the stack: dvt-core automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

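The connection: field maps each source to an output defined in profiles.yml, and therefore to its engine. Sources without a connection: field use the default target: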

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns
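To make the mapping concrete, the sketch below resolves each source to an engine type using trimmed copies of the two files above as plain Python dicts. `engine_for_source` is a hypothetical helper written for this illustration, not part of DVT's API, and the fallback to the profile's `target` key is an assumption based on the "default target" comment in sources.yml.

```python
# Trimmed copies of profiles.yml and sources.yml from above, as plain dicts.
PROFILE = {
    "target": "pg_dev",
    "outputs": {
        "pg_dev": {"type": "postgres"},
        "sf_prod": {"type": "snowflake"},
        "mysql_crm": {"type": "mysql"},
    },
}
SOURCES = [
    {"name": "app_db"},                              # no connection: default target
    {"name": "crm", "connection": "mysql_crm"},
    {"name": "marketing", "connection": "sf_prod"},
]


def engine_for_source(source: dict, profile: dict) -> str:
    """Look up the adapter type behind a source (hypothetical helper)."""
    connection = source.get("connection", profile["target"])
    try:
        return profile["outputs"][connection]["type"]
    except KeyError:
        raise ValueError(
            f"source {source['name']!r} references unknown connection {connection!r}"
        ) from None


engines = {s["name"]: engine_for_source(s, PROFILE) for s in SOURCES}
```

With the files shown above, `engines` maps `app_db` to `postgres`, `crm` to `mysql`, and `marketing` to `snowflake`.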

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the three engines, extracts the source tables into DuckDB, runs the join there, and loads the result to Snowflake. The console output looks like standard dbt output.
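The local join step can be imitated with nothing but the Python standard library. The sketch below is a stand-in, not DVT's pipeline: it uses `sqlite3` in place of DuckDB, hard-coded hypothetical rows in place of Sling extracts, and skips the final load. Only the staging-table names follow the `{source}__{table}` convention described under DuckDB Cache.

```python
import sqlite3

# Hypothetical rows standing in for what an extract step would pull per engine.
extracted = {
    "app_db__users": (["user_id", "email"],
                      [(1, "a@example.com"), (2, "b@example.com")]),
    "crm__customers": (["email", "customer_name"],
                       [("a@example.com", "Ada")]),
    "marketing__campaigns": (["user_id", "campaign_name"],
                             [(1, "spring_sale")]),
}

cache = sqlite3.connect(":memory:")  # stand-in for the local DuckDB cache

# Stage every extracted table locally so one engine can see all of them.
for table, (columns, rows) in extracted.items():
    cache.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    cache.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)

# The model's join, with source() references replaced by staging tables.
result = cache.execute(
    """
    SELECT u.user_id, u.email, c.customer_name, m.campaign_name
    FROM app_db__users u
    LEFT JOIN crm__customers c ON u.email = c.email
    LEFT JOIN marketing__campaigns m ON u.user_id = m.user_id
    ORDER BY u.user_id
    """
).fetchall()
cache.close()
```

Users without a CRM or campaign match keep `None` in those columns, as expected from the two LEFT JOINs.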

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.
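The watermark logic can be sketched in plain Python. This is an illustration under assumptions, not DVT's implementation: `append_increment` is a hypothetical name, rows are modeled as dicts, and an empty target is treated as a first run that loads everything.

```python
from datetime import date


def append_increment(source_rows: list, target_rows: list, key: str = "order_date") -> list:
    """Append rows newer than the target's watermark; return what was added."""
    if target_rows:
        # Equivalent of SELECT MAX(order_date) FROM {{ this }}.
        watermark = max(row[key] for row in target_rows)
        new_rows = [row for row in source_rows if row[key] > watermark]
    else:
        # First run: nothing to compare against, so take every row.
        new_rows = list(source_rows)
    target_rows.extend(new_rows)
    return new_rows


source = [
    {"order_id": 1, "order_date": date(2024, 1, 1)},
    {"order_id": 2, "order_date": date(2024, 1, 2)},
    {"order_id": 3, "order_date": date(2024, 1, 3)},
]
target = source[:2].copy()          # target already holds the first two orders
added = append_increment(source, target)
```

Because the comparison is strict (`>`), rows that share the watermark's date but arrive later are not picked up; that is a property of the model SQL above as well.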


Two Dialects, One Project

| Path | You Write | Runs On |
| --- | --- | --- |
| Pushdown | Target's native SQL (Snowflake SQL, T-SQL, etc.) | Target database |
| Extraction | DuckDB SQL (Postgres-like) | Local DuckDB cache |

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

| Command | Description |
| --- | --- |
| dvt run | Execute models against targets |
| dvt run --full-refresh | Rebuild everything from scratch |
| dvt run --select +model_name | Run model and all ancestors |
| dvt build | Seeds + models + snapshots + tests in DAG order |
| dvt seed | Load CSVs via Sling (10-100x faster than dbt) |
| dvt test | Run data tests |
| dvt compile | Compile SQL without executing |
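What `--select +model_name` means, a model plus all of its ancestors, can be shown with a small graph walk. The DAG and the function name below are hypothetical; the sketch only illustrates the selection semantics described in the table, not how DVT or dbt implement node selection.

```python
def select_with_ancestors(model: str, parents: dict) -> set:
    """Return the model plus every upstream node reachable through `parents`."""
    selected, stack = set(), [model]
    while stack:
        node = stack.pop()
        if node not in selected:
            selected.add(node)
            stack.extend(parents.get(node, ()))
    return selected


# Hypothetical project: each model maps to the models it depends on.
PARENTS = {
    "stg_users": [],
    "stg_customers": [],
    "dim_customer_campaigns": ["stg_users", "stg_customers"],
    "fct_orders": ["stg_users"],
}

selection = select_with_ancestors("dim_customer_campaigns", PARENTS)
```

Here `fct_orders` is left out: it shares a parent with the selected model but is not one of its ancestors.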

DVT-Specific

| Command | Description |
| --- | --- |
| dvt sync | Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs) |
| dvt debug | Test all connections with clean status output |
| dvt show --select model | Query locally via DuckDB (no target needed) |
| dvt retract | Drop models from targets in reverse DAG order |
| dvt retract --select +model | Drop a model and its entire upstream chain |
| dvt clean | Remove build artifacts + DuckDB cache |
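"Reverse DAG order" means that dependents are dropped before the models they read from. A minimal sketch with the standard library's `graphlib` shows the idea; the DAG and the `retract_order` name are hypothetical, and this is not taken from DVT's source.

```python
from graphlib import TopologicalSorter


def retract_order(parents: dict) -> list:
    """Return models in drop order: children first, then their parents."""
    # static_order() yields each node after all of its predecessors (build order).
    build_order = list(TopologicalSorter(parents).static_order())
    return build_order[::-1]


# Hypothetical chain: stg_orders -> int_orders -> fct_orders.
PARENTS = {
    "stg_orders": [],
    "int_orders": ["stg_orders"],
    "fct_orders": ["int_orders"],
}

drop_order = retract_order(PARENTS)
```

Dropping in this order avoids removing a table while a downstream view or model still depends on it.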

Documentation

| Command | Description |
| --- | --- |
| dvt docs generate | Cross-engine catalog with engine-colored lineage |
| dvt docs serve | Serve documentation website |

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely
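The two naming patterns above are easy to reproduce when inspecting the cache by hand. The helpers below are hypothetical conveniences written for this README, not functions exported by DVT; they only encode the `{source}__{table}` and `__model__{name}` patterns and the cache path stated above.

```python
from pathlib import Path

CACHE_PATH = Path(".dvt") / "cache.duckdb"   # location stated above
MODEL_PREFIX = "__model__"


def source_cache_table(source: str, table: str) -> str:
    """Cache table name for an extracted source table."""
    return f"{source}__{table}"


def model_cache_table(name: str) -> str:
    """Cache table name for a model result."""
    return f"{MODEL_PREFIX}{name}"


def is_model_result(cache_table: str) -> bool:
    """True when a cache table holds a model result rather than a source."""
    return cache_table.startswith(MODEL_PREFIX)


users_table = source_cache_table("app_db", "users")
model_table = model_cache_table("dim_customer_campaigns")
```

For example, the `users` table of the `app_db` source would be cached as `app_db__users`.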

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-core · dvt-adapters
GitHub dvt-core · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

| Project | Role in DVT | License |
| --- | --- | --- |
| dbt-core | DAG orchestration, SQL models, Jinja, testing, docs, adapters | Apache 2.0 |
| Sling | High-performance data movement across 30+ connectors (free tier) | Apache 2.0 |
| DuckDB | Local analytics engine: extraction compute, caching, dvt show | MIT |

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License, Version 2.0.

Copyright 2025-2026 Hesham Badawi.

Built by Hesham Badawi — data engineer, for data engineers.

Download files

Source Distribution

| File | Size |
| --- | --- |
| dvt_core-0.1.55.tar.gz | 20.4 MB |

Built Distributions

| File | Python | Platform | Size |
| --- | --- | --- | --- |
| dvt_core-0.1.55-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | CPython 3.12 | manylinux: glibc 2.17+ x86-64 | 88.7 MB |
| dvt_core-0.1.55-cp312-cp312-macosx_11_0_arm64.whl | CPython 3.12 | macOS 11.0+ ARM64 | 30.9 MB |
| dvt_core-0.1.55-cp312-cp312-macosx_10_13_x86_64.whl | CPython 3.12 | macOS 10.13+ x86-64 | 31.3 MB |
| dvt_core-0.1.55-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | CPython 3.11 | manylinux: glibc 2.17+ x86-64 | 88.0 MB |
| dvt_core-0.1.55-cp311-cp311-macosx_11_0_arm64.whl | CPython 3.11 | macOS 11.0+ ARM64 | 31.0 MB |
| dvt_core-0.1.55-cp311-cp311-macosx_10_9_x86_64.whl | CPython 3.11 | macOS 10.9+ x86-64 | 31.6 MB |
| dvt_core-0.1.55-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | CPython 3.10 | manylinux: glibc 2.17+ x86-64 | 84.4 MB |
| dvt_core-0.1.55-cp310-cp310-macosx_11_0_arm64.whl | CPython 3.10 | macOS 11.0+ ARM64 | 31.0 MB |
| dvt_core-0.1.55-cp310-cp310-macosx_10_9_x86_64.whl | CPython 3.10 | macOS 10.9+ x86-64 | 31.7 MB |
