Skip to main content

DVT — cross-engine data transformation tool with DuckDB federation.

Project description

DVT Logo

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

PyPI PyPI Adapters Python Discord License


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

Path When How
Pushdown Source and target on same engine SQL runs directly on the database via adapter — identical to dbt
Extraction Sources on different engines Sling extracts → DuckDB joins → Sling loads to target

The user never thinks about this — DVT decides the path automatically.


Supported Engines

13 engines in one package (dvt-adapters):

Engine Type Engine Type
🐘 PostgreSQL OLTP ❄️ Snowflake Cloud DW
🐬 MySQL OLTP 🧱 Databricks Cloud DW
🦭 MariaDB OLTP 🔷 BigQuery Cloud DW
🟥 SQL Server OLTP 🟧 Redshift Cloud DW
🔴 Oracle OLTP 🦆 DuckDB Embedded
Spark Distributed 🔵 Fabric Cloud DW
MySQL 5 Legacy

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

Path You Write Runs On
Pushdown Target's native SQL (Snowflake SQL, T-SQL, etc.) Target database
Extraction DuckDB SQL (Postgres-like) Local DuckDB cache

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

Command Description
dvt run Execute models against targets
dvt run --full-refresh Rebuild everything from scratch
dvt run --select +model_name Run model and all ancestors
dvt build Seeds + models + snapshots + tests in DAG order
dvt seed Load CSVs via Sling (10-100x faster than dbt)
dvt test Run data tests
dvt compile Compile SQL without executing

DVT-Specific

Command Description
dvt sync Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
dvt debug Test all connections with clean status output
dvt show --select model Query locally via DuckDB (no target needed)
dvt retract Drop models from targets in reverse DAG order
dvt retract --select +model Drop a model and its entire upstream chain
dvt clean Remove build artifacts + DuckDB cache

Documentation

Command Description
dvt docs generate Cross-engine catalog with engine-colored lineage
dvt docs serve Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

Project Role in DVT License
dbt-core DAG orchestration, SQL models, Jinja, testing, docs, adapters Apache 2.0
Sling High-performance data movement across 30+ connectors (free tier) Apache 2.0
DuckDB Local analytics engine — extraction compute, caching, dvt show MIT

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dvt_core-0.1.53.tar.gz (19.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dvt_core-0.1.53-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (84.7 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

dvt_core-0.1.53-cp312-cp312-macosx_11_0_arm64.whl (29.2 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

dvt_core-0.1.53-cp312-cp312-macosx_10_13_x86_64.whl (29.6 MB view details)

Uploaded CPython 3.12macOS 10.13+ x86-64

dvt_core-0.1.53-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (84.1 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

dvt_core-0.1.53-cp311-cp311-macosx_11_0_arm64.whl (29.3 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

dvt_core-0.1.53-cp311-cp311-macosx_10_9_x86_64.whl (30.0 MB view details)

Uploaded CPython 3.11macOS 10.9+ x86-64

dvt_core-0.1.53-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (80.7 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

dvt_core-0.1.53-cp310-cp310-macosx_11_0_arm64.whl (29.4 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

dvt_core-0.1.53-cp310-cp310-macosx_10_9_x86_64.whl (30.0 MB view details)

Uploaded CPython 3.10macOS 10.9+ x86-64

File details

Details for the file dvt_core-0.1.53.tar.gz.

File metadata

  • Download URL: dvt_core-0.1.53.tar.gz
  • Upload date:
  • Size: 19.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dvt_core-0.1.53.tar.gz
Algorithm Hash digest
SHA256 ce8f40e8391e3b0f608ed1299661a13e555939a4c1cacb410e66aafc0e4d3fbe
MD5 d4fc6f08ade22641466a8dd5054131c3
BLAKE2b-256 fcfacd9dd073d838132346314a327a3661b6c70b6fd29156613c5a1c24726a27

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 000a2293f73dd0f758e03a075676e8a7f0dce93ea81ee3bd5b214d184f820a27
MD5 0e2ab4fbdccffd5ae7302d1fbd0fa251
BLAKE2b-256 d6539922404c8d98d94d348a17dff9384cb7a6642d22eee0d5aca166907103c2

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 dc6366df9eeb85e117e3630612d10a3782a2ab9f5296dc5acfb3b561448cbaeb
MD5 e9fdcdcd3db5f4534a5dfc93422cbfe6
BLAKE2b-256 105e95ad2d73d7cbbe6e8b4b02975d1e442cabc90dc4bbad7d9b594261632c95

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 fc204fc9d7ed82e923b868c920e05a517973c07fa25f6c9b1afda94512f1454a
MD5 bc705738c8d6476bbc1fab60f7b54272
BLAKE2b-256 45050387cd2b12cc3c7b413a33790381ff5d51bf7f769d1ebcea634fcb4e7175

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c94d84420068fb541e468eb31d67368a18dd60ca55d4a758db7fbca06e043c0d
MD5 9e7bac3f966136d43cbd5d09df542f08
BLAKE2b-256 61885ae97b67f2c11d0468d895a974e95d92b06129ec5db6ce8fd8db798d29f6

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 72d4d48c44b59ef48461f377762756f4b08db196b70be5e1209c5875a857d80e
MD5 1f834510b301d9bb4fad8564bcb1c45a
BLAKE2b-256 f90c7b7ce1e29e6aab8be9365432f6ecd024f10590f74d5cf63f6d492f8b4694

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp311-cp311-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 c22bd5b6ab91378fd5e4017c7a3b20d811e03f80264eaa1d068acbaa101035cd
MD5 330c0f6fe3167b8f65cd3afb7fad9ff2
BLAKE2b-256 2cb948c454d202df17c03eb825f84e702a250a215c049eb8094e57c44b725ab6

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 701635a6ee023d909960d60f22c3f08aac7609100b51d6f467583015572af4ee
MD5 9ccca2e8741198638154ccbe4b66aa06
BLAKE2b-256 b549d2443a489ee362ac704bb240b3d36efb0f4490491856a9438d73ea172faf

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 517bdc630aaba1b1bd2226eb80fcf4941a8a15ad6b1529df456468f98db16d2f
MD5 b50fe102fd422db1e185545c67d921cb
BLAKE2b-256 91c7cfea06a0ee4907d2984db16972ad6acdc9bbcdc7135be550d01272e3afff

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.53-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.53-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 f3ba348856f8eef9be17910fd0a4440a71c9b4548bb5edec89da701cbba7e92e
MD5 dbdd0ac9663b634380ec6c2992077645
BLAKE2b-256 ee2e988e741cc0607249c62bec2d68e88b65186f55a3f87d4b953ad554a30ef4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page