
DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.



DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

```mermaid
flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff
```

Two Execution Paths

| Path | When | How |
| --- | --- | --- |
| Pushdown | Source and target on the same engine | SQL runs directly on the database via the adapter — identical to dbt |
| Extraction | Sources on different engines | Sling extracts → DuckDB joins → Sling loads to target |

You never have to choose — DVT selects the path automatically.
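For intuition, the path decision can be sketched in a few lines of Python. The function and names below are purely illustrative, not DVT's actual API:

```python
def choose_path(source_connections: set[str], target_connection: str) -> str:
    """Illustrative sketch of path selection (not DVT's real implementation).

    If every source lives on the target's connection, the model's SQL can be
    pushed down; otherwise sources must be extracted and joined locally.
    """
    if source_connections <= {target_connection}:
        return "pushdown"    # SQL runs directly on the target database
    return "extraction"      # Sling extract -> DuckDB join -> Sling load

print(choose_path({"sf_prod"}, "sf_prod"))              # same engine
print(choose_path({"pg_dev", "mysql_crm"}, "sf_prod"))  # cross-engine
```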


Supported Engines

13 engines in one package (dvt-adapters):

| Engine | Type | Engine | Type |
| --- | --- | --- | --- |
| 🐘 PostgreSQL | OLTP | ❄️ Snowflake | Cloud DW |
| 🐬 MySQL | OLTP | 🧱 Databricks | Cloud DW |
| 🦭 MariaDB | OLTP | 🔷 BigQuery | Cloud DW |
| 🟥 SQL Server | OLTP | 🟧 Redshift | Cloud DW |
| 🔴 Oracle | OLTP | 🦆 DuckDB | Embedded |
| Spark | Distributed | 🔵 Fabric | Cloud DW |
| MySQL 5 | Legacy | | |

Any source → Any target. DVT handles the data movement.


Installation

```shell
pip install dvt-ce
```

Or with uv (recommended):

```shell
uv add dvt-ce
```

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

```shell
dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs
```

Quick Start

```shell
dvt init my_project && cd my_project   # Scaffold project
dvt sync                               # Install everything
dvt debug                              # Test all connections
dvt seed                               # Load CSV seed data
dvt run                                # Run all models
dvt docs generate && dvt docs serve    # Engine-colored lineage docs
```

Configuration

Connections (~/.dvt/profiles.yml)

```yaml
my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm
```

Sources (models/sources.yml)

The `connection:` field tells DVT which connection each source lives on:

```yaml
sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns
```

Cross-Engine Model

```sql
-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u             -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c       -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id
```

DVT detects the three engines, extracts the source tables to DuckDB, executes the join there, and loads the result to Snowflake. You see standard dbt output.
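The extraction path boils down to three steps: extract, join locally, load. The toy Python below sketches that shape with plain dicts standing in for databases — all names and data are illustrative; DVT does the real work with Sling and DuckDB:

```python
# Toy sketch of the extraction path: extract -> local join -> load.
postgres = {"users": [{"user_id": 1, "email": "a@x.com"}]}
mysql = {"customers": [{"email": "a@x.com", "customer_name": "Acme"}]}
snowflake = {}  # target

def extract(db, table):
    return list(db[table])  # stand-in for a Sling extract into the cache

def join_on_email(users, customers):
    by_email = {c["email"]: c for c in customers}  # stand-in for DuckDB SQL
    return [{**u, "customer_name": by_email.get(u["email"], {}).get("customer_name")}
            for u in users]

def load(db, table, rows):
    db[table] = rows  # stand-in for a Sling load into the target

load(snowflake, "dim_customers",
     join_on_email(extract(postgres, "users"), extract(mysql, "customers")))
print(snowflake["dim_customers"])
```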

Incremental Models

```sql
{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
```

DVT reads the watermark from the target, extracts only the new rows, and appends them.
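The watermark logic for the append strategy can be sketched in pure Python — the function and column names here are hypothetical, for illustration only:

```python
# Illustrative watermark filter for an append-style incremental run.
def new_rows(source_rows, target_rows, key="order_date"):
    watermark = max((r[key] for r in target_rows), default=None)
    if watermark is None:
        return list(source_rows)  # first run: load everything
    return [r for r in source_rows if r[key] > watermark]

existing = [{"order_id": 1, "order_date": "2025-01-01"}]  # already in target
incoming = [{"order_id": 1, "order_date": "2025-01-01"},
            {"order_id": 2, "order_date": "2025-02-01"}]

print(new_rows(incoming, existing))  # only rows past the watermark
```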


Two Dialects, One Project

| Path | You Write | Runs On |
| --- | --- | --- |
| Pushdown | Target's native SQL (Snowflake SQL, T-SQL, etc.) | Target database |
| Extraction | DuckDB SQL (Postgres-like) | Local DuckDB cache |

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

| Command | Description |
| --- | --- |
| `dvt run` | Execute models against targets |
| `dvt run --full-refresh` | Rebuild everything from scratch |
| `dvt run --select +model_name` | Run a model and all of its ancestors |
| `dvt build` | Seeds + models + snapshots + tests in DAG order |
| `dvt seed` | Load CSVs via Sling (10-100x faster than dbt) |
| `dvt test` | Run data tests |
| `dvt compile` | Compile SQL without executing |

DVT-Specific

| Command | Description |
| --- | --- |
| `dvt sync` | Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs) |
| `dvt debug` | Test all connections with clean status output |
| `dvt show --select model` | Query locally via DuckDB (no target needed) |
| `dvt retract` | Drop models from targets in reverse DAG order |
| `dvt retract --select +model` | Drop a model and its entire upstream chain |
| `dvt clean` | Remove build artifacts + DuckDB cache |

Documentation

| Command | Description |
| --- | --- |
| `dvt docs generate` | Cross-engine catalog with engine-colored lineage |
| `dvt docs serve` | Serve documentation website |

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at `.dvt/cache.duckdb`:

  • Source tables: `{source}__{table}` — shared across models, reused between runs
  • Model results: `__model__{name}` — for incremental `{{ this }}` references
  • `dvt run --full-refresh` rebuilds the cache
  • `dvt clean` deletes `.dvt/` entirely
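The naming scheme above is easy to mirror when inspecting the cache yourself. These helpers are hypothetical (DVT exposes no such functions); they just encode the documented convention:

```python
def source_cache_name(source: str, table: str) -> str:
    # Cached source tables follow the {source}__{table} convention
    return f"{source}__{table}"

def model_cache_name(model: str) -> str:
    # Cached model results follow the __model__{name} convention
    return f"__model__{model}"

print(source_cache_name("crm", "customers"))  # crm__customers
print(model_cache_name("dim_customers"))      # __model__dim_customers
```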

--target Philosophy

`--target` switches environments, not engines:

```shell
dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake (same engine, different environment)
```

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

| Project | Role in DVT | License |
| --- | --- | --- |
| dbt-core | DAG orchestration, SQL models, Jinja, testing, docs, adapters | Apache 2.0 |
| Sling | High-performance data movement across 30+ connectors (free tier) | Apache 2.0 |
| DuckDB | Local analytics engine — extraction compute, caching, `dvt show` | MIT |

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — by a data engineer, for data engineers.
