DVT — cross-engine data transformation tool with DuckDB federation.

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

```mermaid
flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff
```

Two Execution Paths

| Path | When | How |
|---|---|---|
| Pushdown | Source and target on same engine | SQL runs directly on the database via adapter — identical to dbt |
| Extraction | Sources on different engines | Sling extracts → DuckDB joins → Sling loads to target |

The user never thinks about this — DVT decides the path automatically.
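The decision itself is simple to reason about. Here is a minimal sketch of the kind of check involved (hypothetical helper name, not DVT's actual internals):

```python
def choose_path(source_connections: set[str], target_connection: str) -> str:
    """Pick the execution path for a model.

    Pushdown: every source lives on the target's connection, so the
    compiled SQL can run directly on that database (plain dbt behavior).
    Extraction: at least one source lives elsewhere, so data must be
    staged through the local DuckDB cache first.
    """
    if source_connections <= {target_connection}:
        return "pushdown"
    return "extraction"

# Same engine on both sides: plain dbt-style pushdown.
print(choose_path({"sf_prod"}, "sf_prod"))              # pushdown
# Sources span Postgres and MySQL: federated extraction.
print(choose_path({"pg_dev", "mysql_crm"}, "sf_prod"))  # extraction
```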


Supported Engines

13 engines in one package (dvt-adapters):

| Engine | Type | Engine | Type |
|---|---|---|---|
| 🐘 PostgreSQL | OLTP | ❄️ Snowflake | Cloud DW |
| 🐬 MySQL | OLTP | 🧱 Databricks | Cloud DW |
| 🦭 MariaDB | OLTP | 🔷 BigQuery | Cloud DW |
| 🟥 SQL Server | OLTP | 🟧 Redshift | Cloud DW |
| 🔴 Oracle | OLTP | 🦆 DuckDB | Embedded |
| Spark | Distributed | 🔵 Fabric | Cloud DW |
| MySQL 5 | Legacy | | |

Any source → Any target. DVT handles the data movement.


Installation

```shell
pip install dvt-ce
```

Or with uv (recommended):

```shell
uv add dvt-ce
```

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

```shell
dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs
```

Quick Start

```shell
dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs
```

Configuration

Connections (~/.dvt/profiles.yml)

```yaml
my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm
```
Sources (models/sources.yml)

The `connection:` field maps each source to its engine:

```yaml
sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns
```

Cross-Engine Model

```sql
-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u             -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c       -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id
```

DVT detects the three engines, extracts the sources to DuckDB, executes the join there, and loads the result to Snowflake. You see standard dbt output.
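The extract → transform → load sequence can be demonstrated end to end in miniature. This sketch uses Python's built-in sqlite3 as a stand-in for both the source engines and the DuckDB cache, purely so the example is self-contained; DVT itself uses Sling for movement and DuckDB for compute:

```python
import sqlite3

# Stand-ins for two source engines (sqlite3 only for this demo).
pg = sqlite3.connect(":memory:")
pg.execute("CREATE TABLE users (user_id INTEGER, email TEXT)")
pg.execute("INSERT INTO users VALUES (1, 'a@x.com'), (2, 'b@x.com')")

my = sqlite3.connect(":memory:")
my.execute("CREATE TABLE customers (email TEXT, customer_name TEXT)")
my.execute("INSERT INTO customers VALUES ('a@x.com', 'Acme')")

# 1) Extract: copy each source table into the local cache,
#    using the {source}__{table} naming scheme.
cache = sqlite3.connect(":memory:")
cache.execute("CREATE TABLE app_db__users (user_id INTEGER, email TEXT)")
cache.executemany("INSERT INTO app_db__users VALUES (?, ?)",
                  pg.execute("SELECT user_id, email FROM users"))
cache.execute("CREATE TABLE crm__customers (email TEXT, customer_name TEXT)")
cache.executemany("INSERT INTO crm__customers VALUES (?, ?)",
                  my.execute("SELECT email, customer_name FROM customers"))

# 2) Transform: run the model's join locally.
rows = cache.execute("""
    SELECT u.user_id, u.email, c.customer_name
    FROM app_db__users u
    LEFT JOIN crm__customers c ON u.email = c.email
    ORDER BY u.user_id
""").fetchall()

# 3) Load: DVT would now write this result set to the target engine.
print(rows)  # [(1, 'a@x.com', 'Acme'), (2, 'b@x.com', None)]
```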

Incremental Models

```sql
{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
```

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

| Path | You Write | Runs On |
|---|---|---|
| Pushdown | Target's native SQL (Snowflake SQL, T-SQL, etc.) | Target database |
| Extraction | DuckDB SQL (Postgres-like) | Local DuckDB cache |

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

| Command | Description |
|---|---|
| `dvt run` | Execute models against targets |
| `dvt run --full-refresh` | Rebuild everything from scratch |
| `dvt run --select +model_name` | Run a model and all its ancestors |
| `dvt build` | Seeds + models + snapshots + tests in DAG order |
| `dvt seed` | Load CSVs via Sling (10-100x faster than dbt) |
| `dvt test` | Run data tests |
| `dvt compile` | Compile SQL without executing |

DVT-Specific

| Command | Description |
|---|---|
| `dvt sync` | Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs) |
| `dvt debug` | Test all connections with clean status output |
| `dvt show --select model` | Query locally via DuckDB (no target needed) |
| `dvt retract` | Drop models from targets in reverse DAG order |
| `dvt retract --select +model` | Drop a model and its entire upstream chain |
| `dvt clean` | Remove build artifacts + DuckDB cache |

Documentation

| Command | Description |
|---|---|
| `dvt docs generate` | Cross-engine catalog with engine-colored lineage |
| `dvt docs serve` | Serve documentation website |

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at `.dvt/cache.duckdb`:

  • Source tables: `{source}__{table}` — shared across models, reused between runs
  • Model results: `__model__{name}` — for incremental `{{ this }}` references
  • `dvt run --full-refresh` rebuilds the cache
  • `dvt clean` deletes `.dvt/` entirely
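The naming scheme above is easy to pin down precisely. A small sketch (illustrative helper, not DVT's API):

```python
def cache_table_name(kind: str, *parts: str) -> str:
    """Map a DVT object to its table name inside .dvt/cache.duckdb,
    following the naming scheme described above (illustrative only)."""
    if kind == "source":       # {source}__{table}
        source, table = parts
        return f"{source}__{table}"
    if kind == "model":        # __model__{name}
        (name,) = parts
        return f"__model__{name}"
    raise ValueError(f"unknown kind: {kind}")

print(cache_table_name("source", "crm", "customers"))   # crm__customers
print(cache_table_name("model", "dim_customer_campaigns"))
```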

--target Philosophy

`--target` switches environments, not engines:

```shell
dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env
```

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

| Project | Role in DVT | License |
|---|---|---|
| dbt-core | DAG orchestration, SQL models, Jinja, testing, docs, adapters | Apache 2.0 |
| Sling | High-performance data movement across 30+ connectors (free tier) | Apache 2.0 |
| DuckDB | Local analytics engine — extraction compute, caching, `dvt show` | MIT |

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.
