Reverse ETL for the code-first data stack

drt — data reverse tool

drt syncs data from your data warehouse to external services declaratively, via YAML and CLI. Think dbt run → drt run: same developer experience, opposite data direction.

pip install drt-core          # core (DuckDB included)
drt init && drt run

Why drt?

Problem | drt's answer
Census/Hightouch are expensive SaaS | Free, self-hosted OSS
GUI-first tools don't fit CI/CD | CLI + YAML, Git-native
dbt/dlt ecosystem has no reverse leg | Same philosophy, same DX
LLM/MCP era makes GUI SaaS overkill | LLM-native by design

What's always free? All connectors, CLI, MCP server, and sync engine. See OPEN_CORE.md for the open core boundary.


Quickstart

No cloud accounts needed — DuckDB + httpbin.org, three commands.

pip install drt-core
mkdir my-drt-project && cd my-drt-project
drt init --template duckdb_to_rest

That scaffolds a runnable syncs/duckdb_to_rest.yml. Seed a tiny DuckDB table and run:

python -c "
import duckdb
c = duckdb.connect('warehouse.duckdb')
c.execute('''CREATE TABLE IF NOT EXISTS users AS SELECT * FROM (VALUES
  (1, 'Alice', 'alice@example.com'),
  (2, 'Bob',   'bob@example.com'),
  (3, 'Carol', 'carol@example.com')
) t(id, name, email)''')
c.close()
"
drt run --dry-run   # preview, no data sent
drt run             # POST each row to httpbin.org
drt status          # check results
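
To make the difference between drt run --dry-run and drt run concrete, here is a minimal conceptual sketch of a row-per-request sync. It is not drt's implementation: the function name sync_rows, the hard-coded ROWS list (standing in for the users table seeded above), and the injected send callable are all assumptions made for illustration, and nothing here touches the network.

```python
import json
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for the three rows seeded into warehouse.duckdb.
ROWS = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
    {"id": 3, "name": "Carol", "email": "carol@example.com"},
]


def sync_rows(
    rows: Iterable[Dict[str, object]],
    send: Callable[[str], None],
    dry_run: bool = False,
) -> List[str]:
    """Render one JSON body per row and hand it to send() unless dry_run is set."""
    bodies = []
    for row in rows:
        body = json.dumps(row, sort_keys=True)
        bodies.append(body)
        if not dry_run:
            send(body)  # a real sync would POST this body to the destination
    return bodies


sent: List[str] = []
preview = sync_rows(ROWS, sent.append, dry_run=True)  # renders bodies, sends nothing
```

In a dry run the rendered bodies are still returned for inspection, which is why previewing before the first real run is cheap.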

Other starter templates

drt init --template list             # see all available templates
drt init --template postgres_to_slack
drt init --template duckdb_to_hubspot

Each template prints next-steps for the env vars / source data it needs. See examples/ for the full collection (Discord, Google Sheets, GitHub Actions, MySQL, ClickHouse, BigQuery, …) and docs/connectors/ for per-connector reference.

Customizing your sync

For a guided wizard that walks you through profile + project setup:

drt init   # interactive — picks a source, configures profile, scaffolds project

Both flows produce the same project shape (drt_project.yml, syncs/, .drt/). drt sources --detailed and drt destinations --detailed print every connector's required env vars and a sample YAML stanza — useful when hand-authoring beyond the templates.


CLI Reference

drt init                    # initialize project
drt list                    # list sync definitions
drt sources                 # list available source connectors
drt destinations            # list available destination connectors
drt run                     # run all syncs
drt run --select <name>     # run a specific sync
drt run --all               # discover and run all syncs
drt run --select tag:<tag>  # run syncs matching a tag
drt run --threads 4         # parallel sync execution
drt run --dry-run           # dry run
drt run --verbose           # show row-level error details
drt run --output json       # structured JSON output for CI/scripting
drt run --log-format json   # structured JSON logging to stderr
drt run --profile prd       # override profile (or DRT_PROFILE env var)
drt run --cursor-value '…'  # override watermark cursor for backfill
drt test                    # run post-sync validation tests
drt test --select <name>    # test a specific sync
drt validate                # validate sync YAML configs
drt status                  # show recent sync status
drt status --output json    # JSON output for status
drt profile list            # list credential profiles in ~/.drt/profiles.yml
drt profile show <name>     # show a profile (secrets masked)
drt profile test <name>     # verify a profile's source connectivity
drt profile add <name>      # interactively add a profile
drt profile remove <name>   # remove a profile
drt serve                   # start HTTP webhook endpoint
drt docs generate --format mermaid  # print project DAG as Mermaid
drt mcp run                 # start MCP server (requires drt-core[mcp])
drt --install-completion    # install shell completion (bash/zsh/fish)
drt --show-completion       # show completion script
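
For CI or scripting, the flags above can be assembled programmatically. The sketch below is one way to do it, using only flags listed in this reference; the helper names build_drt_run_args and run_drt are hypothetical, and the shape of the JSON that drt run --output json prints is deliberately not assumed (the payload is returned as parsed, or None if stdout is not valid JSON).

```python
import json
import subprocess
from typing import List, Optional, Tuple


def build_drt_run_args(
    select: Optional[str] = None,
    dry_run: bool = False,
    threads: Optional[int] = None,
    profile: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `drt run` from flags in the CLI reference."""
    args = ["drt", "run", "--output", "json"]
    if select is not None:
        args += ["--select", select]
    if dry_run:
        args.append("--dry-run")
    if threads is not None:
        args += ["--threads", str(threads)]
    if profile is not None:
        args += ["--profile", profile]
    return args


def run_drt(args: List[str]) -> Tuple[int, object]:
    """Run drt and return (exit code, parsed stdout or None)."""
    proc = subprocess.run(args, capture_output=True, text=True, check=False)
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None
    return proc.returncode, payload
```

Passing the arguments as a list rather than a shell string avoids quoting problems with selectors such as tag:<tag>.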

Visualize your syncs

Generate a Mermaid DAG from your local drt_project.yml and syncs/*.yml files:

drt docs generate --format mermaid > dag.md
graph LR
    subgraph Sources
        src_bigquery_prod["bigquery_prod<br/><i>bigquery</i>"]
    end
    subgraph Syncs
        sync_users_to_hubspot{{"users_to_hubspot<br/><i>upsert</i>"}}
        sync_accounts_to_hubspot{{"accounts_to_hubspot<br/><i>upsert</i>"}}
    end
    subgraph Destinations
        dst_hubspot_contacts["hubspot (contacts)<br/><i>hubspot</i>"]
    end
    src_bigquery_prod -->|extract| sync_users_to_hubspot
    src_bigquery_prod -->|extract| sync_accounts_to_hubspot
    sync_users_to_hubspot -->|load| dst_hubspot_contacts
    sync_accounts_to_hubspot -->|load| dst_hubspot_contacts
    sync_users_to_hubspot -.lookup.-> sync_accounts_to_hubspot
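
The structure of that output is easy to reason about: one node per source, sync, and destination, plus an extract edge and a load edge per sync. The following sketch reproduces a simplified version of the layout from a list of dictionaries. The input shape (name, mode, source, destination keys) and the function to_mermaid are assumptions for illustration, not drt's internal model, and the sketch omits connector-type labels and lookup edges.

```python
from typing import Dict, List


def to_mermaid(syncs: List[Dict[str, str]]) -> str:
    """Render a simplified source -> sync -> destination graph as Mermaid text."""
    sources = sorted({s["source"] for s in syncs})
    destinations = sorted({s["destination"] for s in syncs})

    lines = ["graph LR", "    subgraph Sources"]
    for name in sources:
        lines.append('        src_' + name + '["' + name + '"]')
    lines.append("    end")

    lines.append("    subgraph Syncs")
    for s in syncs:
        # Hexagon node labelled with the sync name and its mode.
        label = s["name"] + "<br/><i>" + s["mode"] + "</i>"
        lines.append('        sync_' + s["name"] + '{{"' + label + '"}}')
    lines.append("    end")

    lines.append("    subgraph Destinations")
    for name in destinations:
        lines.append('        dst_' + name + '["' + name + '"]')
    lines.append("    end")

    for s in syncs:
        lines.append("    src_" + s["source"] + " -->|extract| sync_" + s["name"])
    for s in syncs:
        lines.append("    sync_" + s["name"] + " -->|load| dst_" + s["destination"])
    return "\n".join(lines)


example = to_mermaid([
    {"name": "users_to_hubspot", "mode": "upsert",
     "source": "bigquery_prod", "destination": "hubspot_contacts"},
    {"name": "accounts_to_hubspot", "mode": "upsert",
     "source": "bigquery_prod", "destination": "hubspot_contacts"},
])
```

The sketch assumes every name is already a valid Mermaid node identifier (letters, digits, underscores).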

Shell completion

Shell completion is supported for bash, zsh, and fish:

# Recommended: auto-install for your current shell (idempotent)
drt --install-completion

# Or manually add to your shell config (run once from the target shell)
drt --show-completion >> ~/.bashrc   # bash
drt --show-completion >> ~/.zshrc    # zsh
drt --show-completion > ~/.config/fish/completions/drt.fish  # fish

Note: --show-completion outputs the script for your current shell. Run it from the shell you want to configure. The manual >> append is not idempotent — run it once only.

After installation, restart your shell and tab-complete commands and options.
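
If you prefer the manual route but want it to be safe to re-run, one option is to wrap the script in marker comments and skip the append when the marker is already present. This is a sketch of that idea, not something drt ships: the marker strings and the append_once helper are hypothetical, and the script text would come from the output of drt --show-completion.

```python
from pathlib import Path

# Hypothetical marker lines; any unique comment strings would work.
BEGIN_MARKER = "# >>> drt completion >>>"
END_MARKER = "# <<< drt completion <<<"


def append_once(rc_path: Path, script: str) -> bool:
    """Append the completion script to rc_path unless it was added before.

    Returns True if the file was modified, False if the marker already existed.
    """
    existing = rc_path.read_text() if rc_path.exists() else ""
    if BEGIN_MARKER in existing:
        return False
    block = "\n" + BEGIN_MARKER + "\n" + script.rstrip() + "\n" + END_MARKER + "\n"
    rc_path.write_text(existing + block)
    return True
```

The markers also make the block easy to find and remove later.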


MCP Server

Connect drt to Claude, Cursor, or any MCP-compatible client so you can run syncs, check status, and validate configs without leaving your AI environment.

pip install drt-core[mcp]
drt mcp run

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "drt": {
      "command": "drt",
      "args": ["mcp", "run"]
    }
  }
}
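
If the config file already lists other MCP servers, the drt entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the helper name add_drt_server is hypothetical, and the only drt-specific content is the command and args shown in the JSON above.

```python
import json
from pathlib import Path


def add_drt_server(config_path: Path) -> dict:
    """Add or update the "drt" entry under mcpServers, keeping other entries."""
    text = config_path.read_text().strip() if config_path.exists() else ""
    config = json.loads(text) if text else {}
    servers = config.setdefault("mcpServers", {})
    servers["drt"] = {"command": "drt", "args": ["mcp", "run"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Running it twice leaves the file unchanged after the first call, so it is safe to include in a setup script.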

Available MCP tools:

Tool | What it does
drt_list_syncs | List all sync definitions
drt_run_sync | Run a sync (supports dry_run + compute_diff for --diff parity)
drt_run_test | Run post-sync validation tests (mirrors drt test)
drt_get_status | Get last run result(s)
drt_get_history | Get recent sync run history
drt_validate | Validate sync YAML configs
drt_get_schema | Return JSON Schema for config files
drt_list_connectors | List available sources and destinations
drt_doctor | Environment diagnostics (mirrors drt doctor)

AI Skills for Claude Code

Install the official Claude Code skills to generate YAML, debug failures, and migrate from other tools — all from the chat interface.

Install via Plugin Marketplace (recommended)

/plugin marketplace add drt-hub/drt
/plugin install drt@drt-hub

Tip: Enable auto-update so you always get the latest skills when drt is updated: /plugin → Marketplaces → drt-hub → Enable auto-update

Manual install (slash commands)

Copy the files from .claude/commands/ into your drt project's .claude/commands/ directory.

Skill | Trigger | What it does
/drt-create-sync | "create a sync" | Generates valid sync YAML from your intent
/drt-debug | "sync failed" | Diagnoses a specific error and suggests fixes
/drt-troubleshoot | "drt isn't working" | Walks a full top-to-bottom diagnostic checklist
/drt-init | "set up drt" | Guides through project initialization
/drt-migrate | "migrate from Census" | Converts existing configs to drt YAML

Connectors

Per-connector reference: docs/connectors/ · Discoverable from the CLI: drt sources --detailed / drt destinations --detailed

Sources

Connector | Status | Install | Auth
BigQuery | ✅ v0.1 | pip install drt-core[bigquery] | Application Default / Service Account Keyfile
DuckDB | ✅ v0.1 | (core) | File path
PostgreSQL | ✅ v0.1 | pip install drt-core[postgres] | Password (env var)
Snowflake | ✅ v0.5 | pip install drt-core[snowflake] | Password (env var)
SQLite | ✅ v0.4.2 | (core) | File path
Redshift | ✅ v0.3.4 | pip install drt-core[redshift] | Password (env var)
ClickHouse | ✅ v0.4.3 | pip install drt-core[clickhouse] | Password (env var)
MySQL | ✅ v0.5 | pip install drt-core[mysql] | Password (env var)
Databricks | ✅ v0.6 | pip install drt-core[databricks] | Access Token (env var)
SQL Server | ✅ v0.6 | pip install drt-core[sqlserver] | Password (env var)
REST API | ✅ v0.7 | (core) | Bearer / API Key / Basic / OAuth2

Destinations

Connector | Status | Install | Auth
REST API | ✅ v0.1 | (core) | Bearer / API Key / Basic / OAuth2
Slack Incoming Webhook | ✅ v0.1 | (core) | Webhook URL
Discord Webhook | ✅ v0.4.2 | (core) | Webhook URL
GitHub Actions | ✅ v0.1 | (core) | Token (env var)
HubSpot | ✅ v0.1 | (core) | Token (env var)
Zendesk | ✅ v0.7 | (core) | Basic (email + API token)
Amplitude | ✅ v0.7 | (core) | Project API key (env var)
Mixpanel | ✅ v0.8 | (core) | Project token / service account
Google Ads | ✅ v0.6 | (core) | OAuth2 Client Credentials
Google Sheets | ✅ v0.4 | pip install drt-core[sheets] | Service Account Keyfile
PostgreSQL (upsert) | ✅ v0.4 | pip install drt-core[postgres] | Password (env var)
MySQL (upsert) | ✅ v0.4 | pip install drt-core[mysql] | Password (env var)
ClickHouse | ✅ v0.5 | pip install drt-core[clickhouse] | Password (env var)
Parquet file | ✅ v0.5 | pip install drt-core[parquet] | File path
Amazon S3 | ✅ v0.7.9 | pip install drt-core[s3] | AWS credential chain / env vars
Google Cloud Storage | ✅ v0.7.9 | pip install drt-core[gcs] | Application Default / Service Account Keyfile
Azure Blob Storage | ✅ v0.7.9 | pip install drt-core[azure] | Connection string env / DefaultAzureCredential
Microsoft Teams Webhook | ✅ v0.5 | (core) | Webhook URL
CSV / JSON / JSONL file | ✅ v0.5 | (core) | File path
Jira | ✅ v0.5 | (core) | Basic (email + API token)
Linear | ✅ v0.5 | (core) | API Key (env var)
SendGrid | ✅ v0.5 | (core) | API Key (env var)
Notion | ✅ v0.6 | (core) | Bearer Token (env var)
Twilio SMS | ✅ v0.6 | (core) | Basic (Account SID + Auth Token)
Intercom | ✅ v0.6 | (core) | Bearer Token (env var)
Email SMTP | ✅ v0.6 | (core) | Username / Password (env var)
Salesforce Bulk API 2.0 | ✅ v0.6 | (core) | OAuth2 (username-password)
Staged Upload | ✅ v0.6 | (core) | Configurable per provider
Elasticsearch / OpenSearch | ✅ v0.7.9 | (core) | API key / HTTP Basic (env var)
Snowflake | ✅ v0.7 | pip install drt-core[snowflake] | Password (env var)
Databricks Delta Lake | ✅ v0.7.9 | pip install drt-core[databricks] | Personal Access Token (env var)
BigQuery | ✅ v0.8 | pip install drt-core[bigquery] | ADC / Service Account keyfile

Integrations

Connector | Status | Install
Dagster | ✅ v0.4 | pip install dagster-drt
Prefect | ✅ v0.6 | (core)
Airflow | ✅ v0.6 | (core)
dbt manifest reader | ✅ v0.4 | (core)

Roadmap

Upcoming releases → ROADMAP.md (scope, themes, targets)
Issue-level tracking → GitHub Milestones
Looking to contribute? → Good First Issues

Shipped:

Version Focus
v0.1 BigQuery / DuckDB / Postgres sources · REST API / Slack / GitHub Actions / HubSpot destinations · CLI · dry-run
v0.2 Incremental sync (cursor_field watermark) · retry config per-sync
v0.3 MCP Server (drt mcp run) · AI Skills for Claude Code · LLM-readable docs · row-level errors · security hardening · Redshift source
v0.4 Google Sheets / PostgreSQL / MySQL destinations · dagster-drt · dbt manifest reader · type safety overhaul
v0.5 Snowflake / MySQL sources · ClickHouse / Parquet / Teams / CSV+JSON / Jira / Linear / SendGrid destinations · drt test · --output json · --profile · ${VAR} substitution · dbt manifest · secrets.toml · Docker
v0.5.4 destination_lookup — resolve FK values by querying destination DB during sync (MySQL / Postgres / ClickHouse)
v0.6 Databricks / SQL Server sources · Notion / Twilio / Intercom / Email SMTP / Salesforce Bulk / Staged Upload destinations · Airflow / Prefect integrations · drt serve · drt sources / drt destinations · --threads parallel execution · --log-format json · --cursor-value · watermark.default_value · test validators (freshness, unique, accepted_values) · JSON Schema validation · GOVERNANCE.md
v0.7 Production Ready — graceful shutdown on SIGTERM/SIGINT · per-destination retry override · sync execution history · zero-downtime atomic table swap · json_columns config · FK existence check (lookups.check_only) · Slack/webhook failure alerts · drt doctor · --quiet flag · drt test --output json / --dry-run · Snowflake destination · GitHub Codespaces playground · OPEN_CORE.md
v0.7.1 drt run --dry-run --diff for record-level preview · tz-aware cursor stringification fix · on_error=fail alignment for Notion / REST API / Email SMTP · VERSIONING.md semver & deprecation policy
v0.7.2 Opt-in anonymous telemetry (PostHog Cloud EU, off by default, allow-list payload, DO_NOT_TRACK honored) · deprecation warnings in drt validate · Postgres destination psycopg2.sql SQL composition
v0.7.3 Patch — Postgres schema-qualified Identifier() composition fix (#442, PR #498): marketing.events was being double-quoted as a single identifier; now correctly composed as separate schema + relation parts
v0.7.4 Patch — MySQL schema-qualified _quote_ident fix (#511, PR #514): mydb.scores now produces `mydb`.`scores` across replace / insert / upsert / row-count paths. PR #514 landed on main two days after the v0.7.3 tag so the drt-core==0.7.3 wheel does not contain it — upgrade to drt-core>=0.7.4 to get the fix
v0.7.5 Production Ready follow-up #3 + Tech Foundation Hardening — closes the Tech Foundation Hardening epic (#538) (11 child issues): CI nightly + publish gate + CodeQL + pip-audit + SBOM · DuckDB E2E harness + boundary tests · ErrorFormatter + drt sources/destinations --detailed + drt init --template UX · SyncObserver engine I/O boundary · destinations serializer + config base class consolidation · cli/main.py split Phase 1. Also ships REST API source polish · sync catalog (#499 P1+P2) · drt_run_test MCP tool · OpenTelemetry Phase 1 config · hardcoded secret detection · orphan shadow cleanup. No new connectors, no breaking changes.
v0.7.6 Small follow-up: Amplitude destination (#574, Identify API + HTTP V2 events API) · tojson_safe Jinja2 filter (#580, PR #581) for REST API body_template rendering of datetime / Decimal / UUID columns without CAST(... AS STRING) workarounds · drt run --log-format typer 0.26.1 compatibility fix (#577, PR #578) · ErrorFormatter stage detection retrofitted to engine-emitted attr (PR #571, supersedes #544's traceback-walk heuristic) · cli/main.py split Phase 2a (PR #572). No breaking changes; drop-in upgrade from v0.7.5.
v0.7.7 sync.mode: mirror across the SQL destination set. This new differential-delete sync mode upserts source rows and DELETEs destination rows whose upsert_key was not observed in the source, without the TRUNCATE / re-insert overhead of replace. Lands across Postgres (#596) · MySQL (#597) · ClickHouse (#598) · Snowflake (#599); #340 closed for the SQL set. Also lands the cli/main.py split completion (Phase 2b PR (a) + PR (b) + tighten: 1706 → 164 LOC, -90% from v0.7.5 baseline), FakeSource + destination contract test framework (#592–#595), CI check-changelog-required warn-only guard (#590), GCS storage import mypy fix (#588, reported by @cian-ps), and CI install line extension that unlocked ~102 silently-skipped SQL destination tests (total coverage 82.68 → 85.29). No breaking changes; drop-in upgrade from v0.7.6.
v0.7.8 Community follow-up: Mixpanel destination + ClickHouse identifier fix + empty-batch contract completion — new Mixpanel destination (#608 by @Pawansingh3889, people_set + import_events, EU residency, deterministic insert_id, closes #417). ClickHouse _quote_ident fix (#610 by @yodakanohoshi) — resolves a server-side Code: 62 syntax error on database.table syntax via get_row_count, closing the ClickHouse leg of the qualified-identifier fix family (Postgres #498 / MySQL #514). Empty-batch contract suite complete (#604–#606, 25 of 25 registered destinations) — surfaced + fixed a real bug in staged_upload.finalize(). sync.mode: mirror user-facing docs (#607, postgres.md section + runnable example + skill option), post-#608 Mixpanel wiring (#609), i18n marker bump (#603). No breaking changes — drop-in upgrade from v0.7.7.
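
The mirror mode described under v0.7.7 can be pictured with plain dictionaries: upsert every source row, then delete whatever the source did not mention. The sketch below illustrates those semantics only. The mirror function and the in-memory destination dict are assumptions; the real connectors do this with SQL against the destination database.

```python
from typing import Dict, Hashable, Iterable, Tuple


def mirror(
    source_rows: Iterable[Dict[str, object]],
    destination: Dict[Hashable, Dict[str, object]],
    upsert_key: str,
) -> Tuple[int, int]:
    """Make destination match the source; return (keys upserted, rows deleted).

    destination maps each upsert_key value to its row and is modified in place.
    """
    seen = set()
    for row in source_rows:
        key = row[upsert_key]
        destination[key] = dict(row)  # insert new rows, overwrite existing ones
        seen.add(key)

    # Differential delete: drop only the keys the source did not produce.
    stale = [key for key in destination if key not in seen]
    for key in stale:
        del destination[key]
    return len(seen), len(stale)
```

Unlike a replace that truncates and re-inserts, rows that are unchanged in the source are never removed from the destination, even briefly.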

Next: v0.8 Cloud Destinations & Growth → v0.9 Enterprise Foundation → v1.0 Stable Release → v1.x Rust Engine


Orchestration: dagster-drt

Community-maintained Dagster integration. Expose drt syncs as Dagster assets with full observability.

pip install dagster-drt
from dagster import AssetExecutionContext, Definitions
from dagster_drt import drt_assets, DagsterDrtResource

@drt_assets(project_dir="path/to/drt-project")
def my_syncs(context: AssetExecutionContext, drt: DagsterDrtResource):
    yield from drt.run(context=context)

defs = Definitions(
    assets=[my_syncs],
    resources={"drt": DagsterDrtResource(project_dir="path/to/drt-project")},
)

See dagster-drt README for full API docs (Translator, Pipes support, DrtConfig dry-run, MaterializeResult).


CI/CD: GitHub Action

Run drt syncs straight from CI/CD with the official drt-hub/drt-action — no infrastructure, just a few lines of YAML. Trigger on a schedule, on every push, or right after dbt finishes.

- uses: drt-hub/drt-action@v1
  with:
    select: '*'
    extras: postgres
  env:
    PG_PASSWORD: ${{ secrets.PG_PASSWORD }}

Inputs cover sync selection (select), connector extras, profile, dry-run and threads; outputs expose status, succeeded, failed and duration-seconds (plus a step-summary table). Secrets are passed via env: and resolved by drt's *_env keys. See the action README for the secrets pattern and more examples (run-after-dbt, PR preview).
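
The *_env pattern mentioned above keeps secrets out of YAML: the config names an environment variable, and the value is looked up at run time. Here is a rough sketch of that lookup, assuming a flat config dictionary. The helper resolve_env_keys, the example key password_env, and the choice to strip the _env suffix are illustrative assumptions, not drt's documented behaviour.

```python
import os
from typing import Any, Dict, Mapping


def resolve_env_keys(
    config: Mapping[str, Any],
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Replace each `<name>_env` entry with `<name>` read from the environment."""
    resolved: Dict[str, Any] = {}
    for key, value in config.items():
        if key.endswith("_env") and isinstance(value, str):
            if value not in environ:
                raise KeyError("environment variable " + repr(value) + " is not set")
            resolved[key[: -len("_env")]] = environ[value]
        else:
            resolved[key] = value
    return resolved
```

Failing loudly on a missing variable surfaces a misconfigured CI secret before any rows are sent.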


Ecosystem

drt is designed to work alongside, not against, the modern data stack:

drt ecosystem — dlt load, dbt transform, drt activate


Telemetry

drt collects no telemetry by default. Opting in helps us understand which sources / destinations / sync modes are actually used, so we can prioritise.

drt config set telemetry.enabled true     # opt in
drt config show-telemetry                 # preview the exact payload that would be sent
drt config set telemetry.enabled false    # opt out
DO_NOT_TRACK=1 drt run                    # universal kill switch — overrides everything

When opted in, drt sends one sync_completed event per sync. The only properties we collect are these 9 fields: drt_version, python_version, os, source_type, destination_type, sync_mode, rows_synced, duration_seconds, status. The wire envelope additionally carries event, distinct_id (a per-machine random UUID at ~/.drt/.anonymous_id), timestamp, and api_key. Sync names, model SQL, destination URLs, credentials, and project paths are never transmitted — the allow-list is enforced at the function-signature level in drt/telemetry.py. By default events go to PostHog Cloud (EU region); override with DRT_TELEMETRY_ENDPOINT and DRT_TELEMETRY_API_KEY for self-hosted PostHog or a custom collector.

Note: drt itself never transmits your IP, but the receiving PostHog backend records the TCP source IP as $ip. See docs/telemetry.md for details and how to disable / substitute the backend.

For full details see docs/telemetry.md.
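
An allow-list enforced at the function-signature level means the payload builder simply has no parameter through which a sync name or URL could travel. The sketch below shows the idea with keyword-only arguments for the nine fields listed above. It is not the code in drt/telemetry.py: the function names, the os_name parameter (mapped to the os field), and the rule that any DO_NOT_TRACK value other than empty or 0 disables sending are assumptions.

```python
import os
from typing import Dict, Mapping, Union


def build_sync_completed(
    *,
    drt_version: str,
    python_version: str,
    os_name: str,
    source_type: str,
    destination_type: str,
    sync_mode: str,
    rows_synced: int,
    duration_seconds: float,
    status: str,
) -> Dict[str, Union[str, int, float]]:
    """Build the event properties; any extra keyword raises TypeError."""
    return {
        "drt_version": drt_version,
        "python_version": python_version,
        "os": os_name,
        "source_type": source_type,
        "destination_type": destination_type,
        "sync_mode": sync_mode,
        "rows_synced": rows_synced,
        "duration_seconds": duration_seconds,
        "status": status,
    }


def telemetry_allowed(enabled: bool, environ: Mapping[str, str] = os.environ) -> bool:
    """Opt-in flag, overridden by DO_NOT_TRACK when it is set to a truthy value."""
    do_not_track = environ.get("DO_NOT_TRACK", "").strip()
    if do_not_track not in ("", "0"):
        return False
    return enabled
```

Because the arguments are keyword-only and fixed, adding a new field requires changing the signature, which makes any widening of the payload visible in code review.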

Contributing

We welcome contributions of all sizes — from typo fixes to new connectors. drt has a transparent contributor ladder so your work builds toward greater trust and responsibility over time.

Contributors ✨

Thanks go to these wonderful people (emoji key):

K.Masuda 💻 📖 🚧 🤔 🔧 📆 👀 🧑‍🏫
yodakanohoshi 💻 📖 🚧 🤔 👀 📆
Moavia Amir 💻 📖 🚇 🚧 🤔
Khush Domadiya 💻
Pawan Singh Kapkoti 💻
PFCAaron12 💻
Semy Ingle 💻 🚧
きわみざむらい 💻 🐛
armorbreak001 💻
pureqin 💻
Wahaj Ahmed 💻
cian-ps 💻 🐛 🚇
Erik Estrella ⚠️
Ai (藍) 📖
GokulKashyap 💻 ⚠️

Disclaimer

drt is an independent open-source project and is not affiliated with, endorsed by, or sponsored by dbt Labs, dlt-hub, or any other company.

"dbt" is a registered trademark of dbt Labs, Inc. "dlt" is a project maintained by dlt-hub.

drt is designed to complement these tools as part of the modern data stack, but is a separate project with its own codebase and maintainers.

License

Apache 2.0 — see LICENSE.
