Skip to main content

DVT — cross-engine data transformation tool with DuckDB federation.

Project description

DVT Logo

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

PyPI PyPI Adapters Python Discord License


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

Path When How
Pushdown Source and target on same engine SQL runs directly on the database via adapter — identical to dbt
Extraction Sources on different engines Sling extracts → DuckDB joins → Sling loads to target

The user never thinks about this — DVT decides the path automatically.


Supported Engines

13 engines in one package (dvt-adapters):

Engine Type Engine Type
🐘 PostgreSQL OLTP ❄️ Snowflake Cloud DW
🐬 MySQL OLTP 🧱 Databricks Cloud DW
🦭 MariaDB OLTP 🔷 BigQuery Cloud DW
🟥 SQL Server OLTP 🟧 Redshift Cloud DW
🔴 Oracle OLTP 🦆 DuckDB Embedded
Spark Distributed 🔵 Fabric Cloud DW
MySQL 5 Legacy

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

Path You Write Runs On
Pushdown Target's native SQL (Snowflake SQL, T-SQL, etc.) Target database
Extraction DuckDB SQL (Postgres-like) Local DuckDB cache

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

Command Description
dvt run Execute models against targets
dvt run --full-refresh Rebuild everything from scratch
dvt run --select +model_name Run model and all ancestors
dvt build Seeds + models + snapshots + tests in DAG order
dvt seed Load CSVs via Sling (10-100x faster than dbt)
dvt test Run data tests
dvt compile Compile SQL without executing

DVT-Specific

Command Description
dvt sync Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
dvt debug Test all connections with clean status output
dvt show --select model Query locally via DuckDB (no target needed)
dvt retract Drop models from targets in reverse DAG order
dvt retract --select +model Drop a model and its entire upstream chain
dvt clean Remove build artifacts + DuckDB cache

Documentation

Command Description
dvt docs generate Cross-engine catalog with engine-colored lineage
dvt docs serve Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

Project Role in DVT License
dbt-core DAG orchestration, SQL models, Jinja, testing, docs, adapters Apache 2.0
Sling High-performance data movement across 30+ connectors (free tier) Apache 2.0
DuckDB Local analytics engine — extraction compute, caching, dvt show MIT

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dvt_core-0.1.54.tar.gz (19.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dvt_core-0.1.54-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (84.7 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

dvt_core-0.1.54-cp312-cp312-macosx_11_0_arm64.whl (29.2 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

dvt_core-0.1.54-cp312-cp312-macosx_10_13_x86_64.whl (29.6 MB view details)

Uploaded CPython 3.12macOS 10.13+ x86-64

dvt_core-0.1.54-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (84.2 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

dvt_core-0.1.54-cp311-cp311-macosx_11_0_arm64.whl (29.3 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

dvt_core-0.1.54-cp311-cp311-macosx_10_9_x86_64.whl (30.0 MB view details)

Uploaded CPython 3.11macOS 10.9+ x86-64

dvt_core-0.1.54-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (80.8 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

dvt_core-0.1.54-cp310-cp310-macosx_11_0_arm64.whl (29.4 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

dvt_core-0.1.54-cp310-cp310-macosx_10_9_x86_64.whl (30.1 MB view details)

Uploaded CPython 3.10macOS 10.9+ x86-64

File details

Details for the file dvt_core-0.1.54.tar.gz.

File metadata

  • Download URL: dvt_core-0.1.54.tar.gz
  • Upload date:
  • Size: 19.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dvt_core-0.1.54.tar.gz
Algorithm Hash digest
SHA256 05e8b0e1b60beca6c10e6300f0fed52c31d294506649b7a7954efe304f58c7b2
MD5 9e37666578e01fb2081d8839821a5a6c
BLAKE2b-256 426f5d090c0bd599529d2ee1651408e91d9325feeb5d46f83c289c9516101186

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e257c3d87e90f3e337118fa2a0d227796fbe03eb934fb61f2d90c224795b47a7
MD5 af5ddd1e646128ff8952b6d6494c248f
BLAKE2b-256 d696cd91bda6d9a692304930657e19e5c1b7294ca6e4e6ee346bf5b1fc82a4ee

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6ba654f017eeacd392c82ed9cb92ebbebebd98bbd42b0f8a86932c3c55e0fd97
MD5 d161c64abd61b3cf2a8d93af3a0f2fdd
BLAKE2b-256 7ab690157976d5e4642b88e11914d5aa65523a7b1e360e62270a8152a9171bff

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 1b0aaba56b1196ef923c09a669033e1e289283c5c5b9fd4083354d78195b2857
MD5 1ccc7a6da8ed4b88e6bb11cf0f5da865
BLAKE2b-256 f9e11b616e39e9111c57cd802af594428129f3b452a66fd8b0eae1ce22c5f620

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 54ab6236b476c000de8cca3d73005abf52136b62c65a48f424074b96d871dfb0
MD5 e6e17edb813f6470fe30effd35786f39
BLAKE2b-256 e8583c4bf5c63298e33d7fb93d28a6bb9c5810086b21cebba6ab0e70817a8a67

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b5bef30ed7dfb36db7bbd97b6ea552eababb09fe88b9578a9bd63ea18defb311
MD5 c94f12be9a43d7492fbda20aa59b33dd
BLAKE2b-256 9e813855cb553b21afa9d8cb4249d0a5ac382084acab0f6681346b9b7abc0c6a

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp311-cp311-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 58c331c5573a3ae6441c380225536fac863ff11b443b2662eebfeac5c6e5ce58
MD5 3def902df7bf516d6d739cacf86023c9
BLAKE2b-256 1fa0d21c2e3999b53ade57f33b9bb7d42ac90730f2c6e069f6c4f13cd25fb506

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c395b1945df447768d44969d6b077548b7a4b8603727ab81e97b7347c6f69634
MD5 ea76a176116c80b74905c01f45de1a41
BLAKE2b-256 1425fdb9b7442e5f704cfdaf7a68a28f15bfb056fbeb5001cbed1d5359286339

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 26d2b2ba5b970863f895b800c309b52ea1574272e11df9f17ff6d65816f9be2d
MD5 16938c2b12e465802f1c5ea1f93c5905
BLAKE2b-256 2e30b18a348f088ebfa9e71ce53110c239067a9dc1d97283bd4d3a38366121a0

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.54-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.54-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 84eacaa19f664d1a1166997bae4d863f53e0960b006e94c0bfa733f19874962a
MD5 ea716d1ea0604bf547af414ab9f2ef5e
BLAKE2b-256 16a8d2b1846f1bb6f5468ea4f2042da4d2e8ebf2e1f6c5d1241ae7ca92c677b2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page