
DVT — cross-engine data transformation tool with DuckDB federation.


DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.



DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

| Path | When | How |
|------|------|-----|
| Pushdown | Source and target on same engine | SQL runs directly on the database via adapter — identical to dbt |
| Extraction | Sources on different engines | Sling extracts → DuckDB joins → Sling loads to target |

The user never thinks about this — DVT decides the path automatically.
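The decision rule can be sketched in a few lines. This is an illustrative model of the behavior described above, not DVT's actual internals — the function name and arguments are hypothetical:

```python
# Hypothetical sketch of DVT's path decision (names are illustrative):
# if every source shares the target's connection, push the SQL down;
# otherwise fall through to the extraction path.

def choose_path(source_connections: set, target_connection: str) -> str:
    """Return 'pushdown' when all sources live on the target's engine,
    'extraction' when any source lives elsewhere."""
    if source_connections <= {target_connection}:
        return "pushdown"
    return "extraction"
```

For example, a model whose sources all resolve to `sf_prod` with target `sf_prod` takes the pushdown path, while one mixing `pg_dev` and `mysql_crm` takes extraction.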


Supported Engines

13 engines in one package (dvt-adapters):

| Engine | Type | Engine | Type |
|--------|------|--------|------|
| 🐘 PostgreSQL | OLTP | ❄️ Snowflake | Cloud DW |
| 🐬 MySQL | OLTP | 🧱 Databricks | Cloud DW |
| 🦭 MariaDB | OLTP | 🔷 BigQuery | Cloud DW |
| 🟥 SQL Server | OLTP | 🟧 Redshift | Cloud DW |
| 🔴 Oracle | OLTP | 🦆 DuckDB | Embedded |
| Spark | Distributed | 🔵 Fabric | Cloud DW |
| MySQL 5 | Legacy | | |

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the three engines involved, extracts the sources to DuckDB, executes the join there, and loads the result to Snowflake. You see standard dbt output.
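The extraction path boils down to three stages: stage each source into a local analytics database, run the model SQL there, and ship the result out. A minimal sketch of those stages, using the `{source}__{table}` staging convention described later — with stdlib `sqlite3` standing in for DuckDB purely so the sketch is self-contained:

```python
# Stand-in illustration of the extraction path (sqlite3 plays the role
# DuckDB plays in DVT; table names follow the {source}__{table} scheme).
import sqlite3

con = sqlite3.connect(":memory:")  # stand-in for .dvt/cache.duckdb

# Stage 1: "Sling extract" lands each source as {source}__{table}
con.execute("CREATE TABLE app_db__users (user_id INT, email TEXT)")
con.executemany("INSERT INTO app_db__users VALUES (?, ?)",
                [(1, "a@x.com"), (2, "b@x.com")])
con.execute("CREATE TABLE crm__customers (email TEXT, customer_name TEXT)")
con.execute("INSERT INTO crm__customers VALUES ('a@x.com', 'Acme')")

# Stage 2: the model SQL runs locally against the staged tables
rows = con.execute("""
    SELECT u.user_id, u.email, c.customer_name
    FROM app_db__users u
    LEFT JOIN crm__customers c ON u.email = c.email
    ORDER BY u.user_id
""").fetchall()

# Stage 3: "Sling load" would ship `rows` to the target (Snowflake here)
```

Unmatched rows survive the LEFT JOIN with a NULL `customer_name`, exactly as they would on any single engine — the point being that the model's join semantics don't change just because the inputs came from different databases.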

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.
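The watermark logic amounts to conditionally appending a filter, mirroring the `is_incremental()` branch above. A hedged sketch — the function is illustrative, not a DVT API:

```python
# Hypothetical sketch of incremental filtering: on a first run there is
# no watermark, so the full query runs; afterwards only rows past the
# target's MAX(key) are extracted.

def incremental_filter(base_query: str, key: str, watermark) -> str:
    """Return base_query unchanged on a first run (no watermark),
    otherwise append a WHERE clause selecting only new rows."""
    if watermark is None:
        return base_query
    return f"{base_query} WHERE {key} > '{watermark}'"
```

On a full refresh the watermark is discarded, which is why `dvt run --full-refresh` rebuilds the model from scratch.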


Two Dialects, One Project

| Path | You Write | Runs On |
|------|-----------|---------|
| Pushdown | Target's native SQL (Snowflake SQL, T-SQL, etc.) | Target database |
| Extraction | DuckDB SQL (Postgres-like) | Local DuckDB cache |

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

| Command | Description |
|---------|-------------|
| `dvt run` | Execute models against targets |
| `dvt run --full-refresh` | Rebuild everything from scratch |
| `dvt run --select +model_name` | Run model and all ancestors |
| `dvt build` | Seeds + models + snapshots + tests in DAG order |
| `dvt seed` | Load CSVs via Sling (10-100x faster than dbt) |
| `dvt test` | Run data tests |
| `dvt compile` | Compile SQL without executing |

DVT-Specific

| Command | Description |
|---------|-------------|
| `dvt sync` | Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs) |
| `dvt debug` | Test all connections with clean status output |
| `dvt show --select model` | Query locally via DuckDB (no target needed) |
| `dvt retract` | Drop models from targets in reverse DAG order |
| `dvt retract --select +model` | Drop a model and its entire upstream chain |
| `dvt clean` | Remove build artifacts + DuckDB cache |

Documentation

| Command | Description |
|---------|-------------|
| `dvt docs generate` | Cross-engine catalog with engine-colored lineage |
| `dvt docs serve` | Serve documentation website |

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely
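The two naming conventions above can be captured in a small helper. This is an illustrative sketch of the documented scheme, not a DVT function:

```python
# Hypothetical helper reflecting the cache naming conventions:
# sources land as {source}__{table}, model results as __model__{name}.

def cache_table_name(kind: str, *parts: str) -> str:
    """Build a cache table name for a 'source' or a 'model' entry."""
    if kind == "source":
        source, table = parts
        return f"{source}__{table}"
    if kind == "model":
        (name,) = parts
        return f"__model__{name}"
    raise ValueError(f"unknown kind: {kind!r}")
```

So the `crm.customers` source would be staged as `crm__customers`, and the model's own prior result — what `{{ this }}` resolves to during extraction — as `__model__dim_customer_campaigns`.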

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

| Project | Role in DVT | License |
|---------|-------------|---------|
| dbt-core | DAG orchestration, SQL models, Jinja, testing, docs, adapters | Apache 2.0 |
| Sling | High-performance data movement across 30+ connectors (free tier) | Apache 2.0 |
| DuckDB | Local analytics engine — extraction compute, caching, `dvt show` | MIT |

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — a data engineer, for data engineers.
