Skip to main content

DVT — cross-engine data transformation tool with DuckDB federation.

Project description

DVT Logo

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

PyPI PyPI Adapters Python Discord License


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

Path When How
Pushdown Source and target on same engine SQL runs directly on the database via adapter — identical to dbt
Extraction Sources on different engines Sling extracts → DuckDB joins → Sling loads to target

The user never thinks about this — DVT decides the path automatically.


Supported Engines

13 engines in one package (dvt-adapters):

Engine Type Engine Type
🐘 PostgreSQL OLTP ❄️ Snowflake Cloud DW
🐬 MySQL OLTP 🧱 Databricks Cloud DW
🦭 MariaDB OLTP 🔷 BigQuery Cloud DW
🟥 SQL Server OLTP 🟧 Redshift Cloud DW
🔴 Oracle OLTP 🦆 DuckDB Embedded
Spark Distributed 🔵 Fabric Cloud DW
MySQL 5 Legacy

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

Path You Write Runs On
Pushdown Target's native SQL (Snowflake SQL, T-SQL, etc.) Target database
Extraction DuckDB SQL (Postgres-like) Local DuckDB cache

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

Command Description
dvt run Execute models against targets
dvt run --full-refresh Rebuild everything from scratch
dvt run --select +model_name Run model and all ancestors
dvt build Seeds + models + snapshots + tests in DAG order
dvt seed Load CSVs via Sling (10-100x faster than dbt)
dvt test Run data tests
dvt compile Compile SQL without executing

DVT-Specific

Command Description
dvt sync Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
dvt debug Test all connections with clean status output
dvt show --select model Query locally via DuckDB (no target needed)
dvt retract Drop models from targets in reverse DAG order
dvt retract --select +model Drop a model and its entire upstream chain
dvt clean Remove build artifacts + DuckDB cache

Documentation

Command Description
dvt docs generate Cross-engine catalog with engine-colored lineage
dvt docs serve Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

Project Role in DVT License
dbt-core DAG orchestration, SQL models, Jinja, testing, docs, adapters Apache 2.0
Sling High-performance data movement across 30+ connectors (free tier) Apache 2.0
DuckDB Local analytics engine — extraction compute, caching, dvt show MIT

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dvt_core-0.1.52.tar.gz (19.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dvt_core-0.1.52-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (84.4 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

dvt_core-0.1.52-cp312-cp312-macosx_11_0_arm64.whl (29.1 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

dvt_core-0.1.52-cp312-cp312-macosx_10_13_x86_64.whl (29.6 MB view details)

Uploaded CPython 3.12macOS 10.13+ x86-64

dvt_core-0.1.52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (83.9 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

dvt_core-0.1.52-cp311-cp311-macosx_11_0_arm64.whl (29.2 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

dvt_core-0.1.52-cp311-cp311-macosx_10_9_x86_64.whl (29.9 MB view details)

Uploaded CPython 3.11macOS 10.9+ x86-64

dvt_core-0.1.52-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (80.5 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

dvt_core-0.1.52-cp310-cp310-macosx_11_0_arm64.whl (29.3 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

dvt_core-0.1.52-cp310-cp310-macosx_10_9_x86_64.whl (30.0 MB view details)

Uploaded CPython 3.10macOS 10.9+ x86-64

File details

Details for the file dvt_core-0.1.52.tar.gz.

File metadata

  • Download URL: dvt_core-0.1.52.tar.gz
  • Upload date:
  • Size: 19.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dvt_core-0.1.52.tar.gz
Algorithm Hash digest
SHA256 bd157fedf5e9db111ed136f144588580840ec2ca5a595716f0b72e6e34861fef
MD5 9286893029e64c80f2a03d115b41fb0c
BLAKE2b-256 dd8ea1160e021dbaa112d5fd93390b1d8d5e3fc789dc496c57914b9ad49cd173

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 42aaa2908523fd9120fa08715dee6afe00164b48b5f3642e52af6081ca79079d
MD5 1416089e1f7b9c08a85078797666fd28
BLAKE2b-256 667ab96c9035a487d79402fd68a102ef98b759a8ae5aabe663d473c974655573

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ccd63b4f188d73e318f6a1aaccda6c7801ac35c941790843f75b466a1e5b7216
MD5 c5216777bcbbf7084b9fe1b7ed8b901f
BLAKE2b-256 fa77a5813bd571c4625388b226f4d29968c07f4f589b3bdd75f957ee4735f04a

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 7d087e3008a61e4719c980a75f5bee0e6ba96712c64cf67b94c95773374be5e8
MD5 3ce318543b1ea6601b7cc8c5e2066f54
BLAKE2b-256 86ef2fc3c08ef705255f0d77829535fbb1cf276c3bb0d6d25c329e2c0326d0eb

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 74c99dbbfece24515c87b0db628b89d3d7d1cac5ef9192a6560f04587fcea4cd
MD5 1e3013b28c5671cce83252593c4f8dfb
BLAKE2b-256 9be9955a41747d625ad914fceadb1760a7f9e5c98a2791d18736f6bd6a1b6755

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 7766e3e2f77eb6161d92839ca41b80487b2db87098ebdeefa850527ed3b9004b
MD5 b770d31f39e3984903fa47d70c4371aa
BLAKE2b-256 67e415fa964f069646ec9714eb798932467ea1d623cd820138c07474ac9b9af2

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp311-cp311-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 038f1d23601778b3467908bf7b63862eca14218ad5af651fc1a8e5cf995fc19c
MD5 bcf747aec5cb035bf640cf3a8dd9a199
BLAKE2b-256 af7568dc65441e22d4d827146f57a9948c261f92f36116bfb4b526b56b053901

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c321f349827ecdb302a8f37c705d9e70ce9dde7ed7ca1601f64632ec6e947ebc
MD5 d81b1c87f3d688f5403cdb3de47a7e91
BLAKE2b-256 a80b64cddc40c1072997b8f7c0ae8ae2ee2e376bd2acd7e5010bc2324d0286fe

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d54279cf6eae322da7435ba1efa2931539c0a3ec02c45f1aa79dd7379dfc5ee8
MD5 b4e3ac54b58fc4bbf0433383f37625be
BLAKE2b-256 d14ab7f9f7d0e21d3de7f0233e9cfe6643708abdb08affb19716bf6bd7abc324

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.52-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.52-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 2487e0ccc4a4e505ad9679c410ade6926e0b449545a19195429c759061e2969c
MD5 74e729f1ee3a0669e665813a2bb89739
BLAKE2b-256 1155820ddf6d1b74a584bb39720a16267866abf3281d523d052717d9a803a723

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page