DVT — cross-engine data transformation tool with DuckDB federation.

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.



DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

| Path | When | How |
|------|------|-----|
| Pushdown | Source and target on the same engine | SQL runs directly on the database via the adapter, exactly as in dbt |
| Extraction | Sources on different engines | Sling extracts → DuckDB joins → Sling loads to target |

You never choose the path yourself; DVT selects it automatically.
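The decision reduces to a simple rule: if every source lives on the target's connection, push down; otherwise extract through DuckDB. A minimal sketch of that rule (the function name and inputs are hypothetical, not DVT's actual API):

```python
def choose_path(source_connections: set, target_connection: str) -> str:
    """Illustrative only: pick the execution path for one model.

    Pushdown when every source lives on the target's connection;
    otherwise fall back to extraction through DuckDB.
    """
    if source_connections <= {target_connection}:
        return "pushdown"    # SQL runs directly on the target database
    return "extraction"      # Sling extract -> DuckDB join -> Sling load

print(choose_path({"pg_dev"}, "pg_dev"))                # pushdown
print(choose_path({"pg_dev", "mysql_crm"}, "sf_prod"))  # extraction
```

A model touching only its own target behaves exactly like dbt; the moment a second engine appears, the extraction path kicks in.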


Supported Engines

13 engines in one package (dvt-adapters):

| Engine | Type |
|--------|------|
| 🐘 PostgreSQL | OLTP |
| 🐬 MySQL | OLTP |
| 🦭 MariaDB | OLTP |
| 🟥 SQL Server | OLTP |
| 🔴 Oracle | OLTP |
| MySQL 5 | Legacy |
| ❄️ Snowflake | Cloud DW |
| 🧱 Databricks | Cloud DW |
| 🔷 BigQuery | Cloud DW |
| 🟧 Redshift | Cloud DW |
| 🔵 Fabric | Cloud DW |
| Spark | Distributed |
| 🦆 DuckDB | Embedded |

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field tells DVT which connection each source lives on:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns
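The lookup rule is simple: a source that omits connection: is assumed to live on the project's default target. A hypothetical sketch of that resolution (not DVT's internal code):

```python
def resolve_connection(source: dict, default_target: str) -> str:
    """Return the connection a source reads from.

    Sources without an explicit `connection:` key fall back to the
    profile's default target (here `pg_dev` from profiles.yml).
    """
    return source.get("connection", default_target)

sources = [
    {"name": "app_db", "schema": "public"},
    {"name": "crm", "connection": "mysql_crm", "schema": "crm"},
    {"name": "marketing", "connection": "sf_prod", "schema": "PUBLIC"},
]
for s in sources:
    print(s["name"], "->", resolve_connection(s, "pg_dev"))
```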

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the three engines, extracts each source to DuckDB, executes the join there, and loads the result to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.
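Conceptually, the incremental run is a two-step plan: read the high-water mark from the target, then filter the source extract by it. A hedged sketch of the second step (table and column names are taken from the example above; the helper itself is hypothetical, and a real implementation would bind the watermark as a query parameter rather than interpolate it):

```python
def incremental_extract_sql(source_table: str, watermark_col: str,
                            watermark) -> str:
    """Build the filtered extract query for an append-only incremental run.

    `watermark` would come from `SELECT MAX(order_date) FROM <target>`;
    on the first run (no watermark yet) the full table is extracted.
    """
    base = f"SELECT * FROM {source_table}"
    if watermark is None:
        return base
    return f"{base} WHERE {watermark_col} > '{watermark}'"

print(incremental_extract_sql("public.orders", "order_date", None))
print(incremental_extract_sql("public.orders", "order_date", "2026-01-31"))
```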


Two Dialects, One Project

| Path | You Write | Runs On |
|------|-----------|---------|
| Pushdown | The target's native SQL (Snowflake SQL, T-SQL, etc.) | Target database |
| Extraction | DuckDB SQL (Postgres-like) | Local DuckDB cache |

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

| Command | Description |
|---------|-------------|
| dvt run | Execute models against targets |
| dvt run --full-refresh | Rebuild everything from scratch |
| dvt run --select +model_name | Run a model and all of its ancestors |
| dvt build | Seeds + models + snapshots + tests in DAG order |
| dvt seed | Load CSVs via Sling (10-100x faster than dbt) |
| dvt test | Run data tests |
| dvt compile | Compile SQL without executing |

DVT-Specific

| Command | Description |
|---------|-------------|
| dvt sync | Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs) |
| dvt debug | Test all connections with clean status output |
| dvt show --select model | Query locally via DuckDB (no target needed) |
| dvt retract | Drop models from targets in reverse DAG order |
| dvt retract --select +model | Drop a model and its entire upstream chain |
| dvt clean | Remove build artifacts + DuckDB cache |

Documentation

| Command | Description |
|---------|-------------|
| dvt docs generate | Cross-engine catalog with engine-colored lineage |
| dvt docs serve | Serve documentation website |

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely
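The naming scheme makes cached tables easy to inspect by hand. A small sketch of the conventions above (the helper functions are illustrative, not DVT's API):

```python
def cache_table_for_source(source: str, table: str) -> str:
    """Cached source tables follow the {source}__{table} pattern."""
    return f"{source}__{table}"

def cache_table_for_model(model: str) -> str:
    """Model results are cached as __model__{name} for {{ this }} refs."""
    return f"__model__{model}"

print(cache_table_for_source("crm", "customers"))       # crm__customers
print(cache_table_for_model("dim_customer_campaigns"))  # __model__dim_customer_campaigns
```

Assuming a prior extraction run has created the file, you could open the cache directly with the DuckDB CLI (`duckdb .dvt/cache.duckdb`) and run `SHOW TABLES;` to see these names.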

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

| Project | Role in DVT | License |
|---------|-------------|---------|
| dbt-core | DAG orchestration, SQL models, Jinja, testing, docs, adapters | Apache 2.0 |
| Sling | High-performance data movement across 30+ connectors (free tier) | Apache 2.0 |
| DuckDB | Local analytics engine: extraction compute, caching, dvt show | MIT |

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Download files

Download the file for your platform.

Source Distribution

  • dvt_ce-0.1.46.tar.gz (18.8 MB, source)

Built Distributions

  • dvt_ce-0.1.46-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (82.0 MB, CPython 3.12, manylinux glibc 2.17+ x86-64)
  • dvt_ce-0.1.46-cp312-cp312-macosx_11_0_arm64.whl (28.4 MB, CPython 3.12, macOS 11.0+ ARM64)
  • dvt_ce-0.1.46-cp312-cp312-macosx_10_13_x86_64.whl (28.9 MB, CPython 3.12, macOS 10.13+ x86-64)
  • dvt_ce-0.1.46-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (81.5 MB, CPython 3.11, manylinux glibc 2.17+ x86-64)
  • dvt_ce-0.1.46-cp311-cp311-macosx_11_0_arm64.whl (28.5 MB, CPython 3.11, macOS 11.0+ ARM64)
  • dvt_ce-0.1.46-cp311-cp311-macosx_10_9_x86_64.whl (29.2 MB, CPython 3.11, macOS 10.9+ x86-64)
  • dvt_ce-0.1.46-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (78.2 MB, CPython 3.10, manylinux glibc 2.17+ x86-64)
  • dvt_ce-0.1.46-cp310-cp310-macosx_11_0_arm64.whl (28.6 MB, CPython 3.10, macOS 11.0+ ARM64)
  • dvt_ce-0.1.46-cp310-cp310-macosx_10_9_x86_64.whl (29.3 MB, CPython 3.10, macOS 10.9+ x86-64)

File details

Details for the file dvt_ce-0.1.46.tar.gz.

File metadata

  • Download URL: dvt_ce-0.1.46.tar.gz
  • Size: 18.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | b82ca19ca01ad1c797bfbfec0c76b23b565676ccace52253d81458a3e56450a1 |
| MD5 | 8d16c6e39fe6809af9a546edd32d9b9f |
| BLAKE2b-256 | c1fb55b1dc00eede1444ec5a01174ff8ed6c9465047606d7feb7c4c1e6d6b805 |

