Skip to main content

DVT — cross-engine data transformation tool with DuckDB federation.

Project description

DVT Logo

DVT — Data Virtualization Tool

Connect every database. Transform across engines. Materialize anywhere.

PyPI PyPI Adapters Python Discord License


DVT is a cross-engine data transformation tool built on dbt-core. Write SQL models that reference sources on any database, and DVT automatically handles cross-engine data movement and materializes results to any target.

No custom connectors. No complex config. Just SQL.


How It Works

DVT extends dbt with federated query execution. When your sources and target live on the same engine, DVT pushes SQL directly to the database (identical to dbt). When they're on different engines, DVT transparently extracts, joins, and loads across engines:

flowchart LR
    subgraph Sources
        PG[(PostgreSQL)]
        MY[(MySQL)]
        SF[(Snowflake)]
        OR[(Oracle)]
    end

    subgraph DVT["DVT Engine"]
        direction TB
        SLING1[/"Sling Extract"/]
        DUCK[("DuckDB Cache<br/>.dvt/cache.duckdb")]
        SQL["Model SQL<br/>(joins, transforms)"]
        SLING2[/"Sling Load"/]
        SLING1 --> DUCK --> SQL --> SLING2
    end

    subgraph Targets
        TGT1[(Snowflake)]
        TGT2[(Databricks)]
        TGT3[(PostgreSQL)]
    end

    PG --> SLING1
    MY --> SLING1
    SF --> SLING1
    OR --> SLING1

    SLING2 --> TGT1
    SLING2 --> TGT2
    SLING2 --> TGT3

    style DVT fill:#f0f4ff,stroke:#336791,stroke-width:2px
    style DUCK fill:#FFF000,stroke:#333,color:#333
    style SLING1 fill:#0094b3,stroke:#333,color:#fff
    style SLING2 fill:#0094b3,stroke:#333,color:#fff
    style SQL fill:#29B5E8,stroke:#333,color:#fff

Two Execution Paths

Path When How
Pushdown Source and target on same engine SQL runs directly on the database via adapter — identical to dbt
Extraction Sources on different engines Sling extracts → DuckDB joins → Sling loads to target

The user never thinks about this — DVT decides the path automatically.


Supported Engines

13 engines in one package (dvt-adapters):

Engine Type Engine Type
🐘 PostgreSQL OLTP ❄️ Snowflake Cloud DW
🐬 MySQL OLTP 🧱 Databricks Cloud DW
🦭 MariaDB OLTP 🔷 BigQuery Cloud DW
🟥 SQL Server OLTP 🟧 Redshift Cloud DW
🔴 Oracle OLTP 🦆 DuckDB Embedded
Spark Distributed 🔵 Fabric Cloud DW
MySQL 5 Legacy

Any source → Any target. DVT handles the data movement.


Installation

pip install dvt-ce

Or with uv (recommended):

uv add dvt-ce

This installs everything — dvt-ce automatically pulls in dvt-adapters (all 13 engines), Sling, DuckDB, and all core dependencies.

Then bootstrap your environment:

dvt sync    # Installs database drivers, DuckDB extensions, Sling binary, cloud SDKs

Quick Start

dvt init my_project && cd my_project   # Scaffold project
dvt sync                                # Install everything
dvt debug                               # Test all connections
dvt seed                                # Load CSV seed data
dvt run                                 # Run all models
dvt docs generate && dvt docs serve     # Engine-colored lineage docs

Configuration

Connections (~/.dvt/profiles.yml)

my_project:
  target: pg_dev
  outputs:
    pg_dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: warehouse
      schema: public

    sf_prod:
      type: snowflake
      account: my-account
      user: loader
      password: secret
      database: ANALYTICS
      schema: PUBLIC
      warehouse: COMPUTE_WH

    mysql_crm:
      type: mysql
      host: mysql.example.com
      port: 3306
      user: reader
      password: secret
      database: crm

Sources (models/sources.yml)

The connection: field maps sources to their engine:

sources:
  - name: app_db           # On default target (no connection: needed)
    schema: public
    tables:
      - name: users
      - name: orders

  - name: crm              # On MySQL
    connection: mysql_crm
    schema: crm
    tables:
      - name: customers

  - name: marketing        # On Snowflake
    connection: sf_prod
    schema: PUBLIC
    tables:
      - name: campaigns

Cross-Engine Model

-- models/dim_customer_campaigns.sql
{{ config(materialized='table', target='sf_prod') }}

SELECT
    u.user_id,
    u.email,
    c.customer_name,
    m.campaign_name
FROM {{ source('app_db', 'users') }} u           -- Postgres
LEFT JOIN {{ source('crm', 'customers') }} c      -- MySQL
    ON u.email = c.email
LEFT JOIN {{ source('marketing', 'campaigns') }} m -- Snowflake
    ON u.user_id = m.user_id

DVT detects the 3 engines, extracts to DuckDB, executes the join, loads to Snowflake. You see standard dbt output.

Incremental Models

{{ config(materialized='incremental', incremental_strategy='append', target='sf_prod') }}

SELECT * FROM {{ source('app_db', 'orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}

DVT reads the watermark from the target, extracts only new rows, appends them.


Two Dialects, One Project

Path You Write Runs On
Pushdown Target's native SQL (Snowflake SQL, T-SQL, etc.) Target database
Extraction DuckDB SQL (Postgres-like) Local DuckDB cache

Both coexist naturally. The dialect is determined by the execution path, not config.


Commands

Core

Command Description
dvt run Execute models against targets
dvt run --full-refresh Rebuild everything from scratch
dvt run --select +model_name Run model and all ancestors
dvt build Seeds + models + snapshots + tests in DAG order
dvt seed Load CSVs via Sling (10-100x faster than dbt)
dvt test Run data tests
dvt compile Compile SQL without executing

DVT-Specific

Command Description
dvt sync Self-healing env bootstrap (drivers, DuckDB, Sling, cloud SDKs)
dvt debug Test all connections with clean status output
dvt show --select model Query locally via DuckDB (no target needed)
dvt retract Drop models from targets in reverse DAG order
dvt retract --select +model Drop a model and its entire upstream chain
dvt clean Remove build artifacts + DuckDB cache

Documentation

Command Description
dvt docs generate Cross-engine catalog with engine-colored lineage
dvt docs serve Serve documentation website

The docs UI features:

  • Engine-colored nodes (each database has its brand color)
  • Connection badges on every source and model
  • Native column types from each engine
  • Target and engine info in detail panels

DuckDB Cache

DVT maintains a persistent cache at .dvt/cache.duckdb:

  • Source tables: {source}__{table} — shared across models, reused between runs
  • Model results: __model__{name} — for incremental {{ this }} references
  • dvt run --full-refresh rebuilds the cache
  • dvt clean deletes .dvt/ entirely

--target Philosophy

--target switches environments, not engines:

dvt run --target dev_snowflake     # Dev Snowflake
dvt run --target prod_snowflake    # Prod Snowflake  ← Same engine, different env

Pushdown models use the target's SQL dialect. Extraction models use DuckDB SQL and are unaffected by target changes.


dbt Compatibility

All dbt projects are valid DVT projects. When using a single adapter with no cross-engine references, DVT behaves identically to dbt.


Community

DVT Discord


Links

PyPI dvt-ce · dvt-adapters
GitHub dvt-ce · dvt-adapters

Built On

DVT stands on the shoulders of three exceptional open-source projects:

Project Role in DVT License
dbt-core DAG orchestration, SQL models, Jinja, testing, docs, adapters Apache 2.0
Sling High-performance data movement across 30+ connectors (free tier) Apache 2.0
DuckDB Local analytics engine — extraction compute, caching, dvt show MIT

We are grateful to dbt Labs, Sling Data, and the DuckDB Foundation for building and open-sourcing these tools.

License

DVT is licensed under the Apache License 2.0.

Copyright 2025-2026 Hesham Badawi.
Licensed under the Apache License, Version 2.0.

Built by Hesham Badawi — data engineer, for data engineers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dvt_core-0.1.51.tar.gz (19.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dvt_core-0.1.51-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (84.1 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

dvt_core-0.1.51-cp312-cp312-macosx_11_0_arm64.whl (29.1 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

dvt_core-0.1.51-cp312-cp312-macosx_10_13_x86_64.whl (29.5 MB view details)

Uploaded CPython 3.12macOS 10.13+ x86-64

dvt_core-0.1.51-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (83.5 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

dvt_core-0.1.51-cp311-cp311-macosx_11_0_arm64.whl (29.2 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

dvt_core-0.1.51-cp311-cp311-macosx_10_9_x86_64.whl (29.8 MB view details)

Uploaded CPython 3.11macOS 10.9+ x86-64

dvt_core-0.1.51-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (80.1 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

dvt_core-0.1.51-cp310-cp310-macosx_11_0_arm64.whl (29.2 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

dvt_core-0.1.51-cp310-cp310-macosx_10_9_x86_64.whl (29.9 MB view details)

Uploaded CPython 3.10macOS 10.9+ x86-64

File details

Details for the file dvt_core-0.1.51.tar.gz.

File metadata

  • Download URL: dvt_core-0.1.51.tar.gz
  • Upload date:
  • Size: 19.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dvt_core-0.1.51.tar.gz
Algorithm Hash digest
SHA256 6150ac9112726ddfec8f2a0345a793d5a978a27463e4d05507c784ba006ad535
MD5 b8cdd00b18d0587aec4326c3103d3a4e
BLAKE2b-256 30c30e49c249688c58737d46ebad6c5176030177724e964aeb96660388ab8ebe

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a223c573c5f3898dca611bc369ec66f74ad3fac33a420667a6aa4bb1a04b2c0b
MD5 a80751041282b72300e44e774b0f187b
BLAKE2b-256 39473a3172063cabe3b1ad2c1af8c3887cb9905fc9c53afe53ac0eb7ba2f9e08

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 84b9498376bdcd870f249cd60820a3ccb8d41cb937849aec8661d8731c33e62b
MD5 e37934fb63260c75740b4600ce8d7690
BLAKE2b-256 d5c1ef8d5e72e0e3ee6e3bcae3c5bdbbfe6b46c1dfb8efdf06b54a0e00bec374

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 d0890433c5089974836d4921036c1b087d6d9502cd9b290cd59ca8d54c281657
MD5 34495584f2eafbb8820d34093a39025f
BLAKE2b-256 1d23c73e020c9e7c7a056fc7680f5e1d9ae5a7f2dfff34b0860ca7acd77e6b9c

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e2810220a1015a314689ed5ecef85c37976bfe3de4311a87e7567f73b04849a0
MD5 2636e4749eb089a73f2fd12464629904
BLAKE2b-256 5278b1d4f0b0abcfa9ddb0f7d20e0d79f34ad3e24e51126637971b8c4e000674

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a978a7471dff3196876bb51ada7f0520696d2593a79270b0a5df697b28e2c17f
MD5 daaa541e003e15bba100c50f56755ff4
BLAKE2b-256 403fa17097a8e9916dbc166420e4d49341316e8eb3a8f79aa739fd594bea21d1

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp311-cp311-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 285344e2ce67c83954eee2172b0a77c4066f335a3588683e9cb40ba0f7f4f797
MD5 022bdfa781465afa6300c821052ec693
BLAKE2b-256 9eeade7ccca46a5622febc5365ddb4f60641051de1fac826b4ecbb4a0a9d12b5

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 6370f8f712008e42d6e3ae14144722944480164b10a9feebdbd77a62919ef675
MD5 a960c3ae98d0d74bb369e799899e81d3
BLAKE2b-256 726e3f2db7ad9093327203b78e844a9e845c485e3c2d9513c7147ea8465a4bfd

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 de7b6f542a0986858ce3fcb19c6eea07683c74b2c457251f51c63be61ebd80e7
MD5 b78336a903d66bf3dd998e4c768e3405
BLAKE2b-256 05656c5cc2079655bd9e0bda870942246dc1d2dab56f4c0ea5678d39d1f6bf15

See more details on using hashes here.

File details

Details for the file dvt_core-0.1.51-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for dvt_core-0.1.51-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 1b762b52491f797093487fb6f753520d1d8660605a04fba01ca92568190dcb71
MD5 0b50aee81da3c86be5341e162e011df9
BLAKE2b-256 31aec0a0fc54da9b735fadc0c52f148bc6e12f2ba536a89d7a17204ebfc76a20

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page