
English | 日本語

drt logo

drt — data reverse tool

Reverse ETL for the code-first data stack.


drt syncs data from your data warehouse to external services — declaratively, via YAML and CLI. Think dbt run → drt run. Same developer experience, opposite data direction.

drt quickstart demo

pip install drt-core          # core (DuckDB included)
drt init && drt run

Why drt?

| Problem | drt's answer |
| --- | --- |
| Census/Hightouch are expensive SaaS | Free, self-hosted OSS |
| GUI-first tools don't fit CI/CD | CLI + YAML, Git-native |
| dbt/dlt ecosystem has no reverse leg | Same philosophy, same DX |
| LLM/MCP era makes GUI SaaS overkill | LLM-native by design |

What's always free? All connectors, CLI, MCP server, and sync engine. See OPEN_CORE.md for the open core boundary.


Quickstart

No cloud accounts needed — runs locally with DuckDB in about 5 minutes.

1. Install

pip install drt-core

For cloud sources: pip install drt-core[bigquery], drt-core[postgres], etc. (in zsh, quote the extra: pip install 'drt-core[bigquery]').

2. Set up a project

mkdir my-drt-project && cd my-drt-project
drt init   # select "duckdb" as source

3. Create sample data

python -c "
import duckdb
c = duckdb.connect('warehouse.duckdb')
c.execute('''CREATE TABLE IF NOT EXISTS users AS SELECT * FROM (VALUES
  (1, 'Alice', 'alice@example.com'),
  (2, 'Bob',   'bob@example.com'),
  (3, 'Carol', 'carol@example.com')
) t(id, name, email)''')
c.close()
"

4. Create a sync

# syncs/post_users.yml
name: post_users
description: "POST user records to an API"
model: ref('users')
destination:
  type: rest_api
  url: "https://httpbin.org/post"
  method: POST
  headers:
    Content-Type: "application/json"
  body_template: |
    { "id": {{ row.id }}, "name": "{{ row.name }}", "email": "{{ row.email }}" }
sync:
  mode: full
  batch_size: 1
  on_error: fail
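The body_template above is rendered once per row, with {{ row.<field> }} placeholders replaced by column values. As a rough illustration of the idea — a toy regex-based stand-in, not drt's actual template engine — the substitution works like this:

```python
import json
import re

def render_body(template: str, row: dict) -> str:
    """Replace {{ row.<field> }} placeholders with values from a row dict.
    A minimal sketch of per-row templating, not drt's implementation."""
    def substitute(match: re.Match) -> str:
        return str(row[match.group(1)])
    return re.sub(r"\{\{\s*row\.(\w+)\s*\}\}", substitute, template)

template = '{ "id": {{ row.id }}, "name": "{{ row.name }}", "email": "{{ row.email }}" }'
body = render_body(template, {"id": 1, "name": "Alice", "email": "alice@example.com"})
payload = json.loads(body)  # the rendered body is valid JSON
```

Note that numeric fields like id are substituted unquoted, which is why the template quotes name and email but not id.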

5. Run

drt run --dry-run   # preview, no data sent
drt run             # run for real
drt status          # check results

See examples/ for more: Slack, Google Sheets, HubSpot, GitHub Actions, etc.


CLI Reference

drt init                    # initialize project
drt list                    # list sync definitions
drt sources                 # list available source connectors
drt destinations            # list available destination connectors
drt run                     # run all syncs
drt run --select <name>     # run a specific sync
drt run --all               # discover and run all syncs
drt run --select tag:<tag>  # run syncs matching a tag
drt run --threads 4         # parallel sync execution
drt run --dry-run           # dry run
drt run --verbose           # show row-level error details
drt run --output json       # structured JSON output for CI/scripting
drt run --log-format json   # structured JSON logging to stderr
drt run --profile prd       # override profile (or DRT_PROFILE env var)
drt run --cursor-value '…'  # override watermark cursor for backfill
drt test                    # run post-sync validation tests
drt test --select <name>    # test a specific sync
drt validate                # validate sync YAML configs
drt status                  # show recent sync status
drt status --output json    # JSON output for status
drt serve                   # start HTTP webhook endpoint
drt mcp run                 # start MCP server (requires drt-core[mcp])
drt --install-completion    # install shell completion (bash/zsh/fish)
drt --show-completion       # show completion script
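In CI, drt run --output json makes results scriptable. The JSON schema is not shown in this README, so the field names below (syncs, status, rows_synced) are assumptions for illustration only; a pipeline step might gate on the result like this:

```python
import json

# Hypothetical output of `drt run --output json` -- the field names here
# are assumptions for illustration, not the documented schema.
raw = '{"syncs": [{"name": "post_users", "status": "success", "rows_synced": 3}]}'

result = json.loads(raw)
failed = [s["name"] for s in result["syncs"] if s["status"] != "success"]
if failed:
    # Non-zero exit fails the CI job when any sync did not succeed.
    raise SystemExit(f"syncs failed: {', '.join(failed)}")
```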

Shell completion

Shell completion is supported for bash, zsh, and fish:

# Recommended: auto-install for your current shell (idempotent)
drt --install-completion

# Or manually add to your shell config (run once from the target shell)
drt --show-completion >> ~/.bashrc   # bash
drt --show-completion >> ~/.zshrc    # zsh
drt --show-completion > ~/.config/fish/completions/drt.fish  # fish

Note: --show-completion outputs the script for your current shell. Run it from the shell you want to configure. The manual >> append is not idempotent — run it once only.

After installation, restart your shell and tab-complete commands and options.


MCP Server

Connect drt to Claude, Cursor, or any MCP-compatible client so you can run syncs, check status, and validate configs without leaving your AI environment.

pip install drt-core[mcp]
drt mcp run

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "drt": {
      "command": "drt",
      "args": ["mcp", "run"]
    }
  }
}

Available MCP tools:

| Tool | What it does |
| --- | --- |
| drt_list_syncs | List all sync definitions |
| drt_run_sync | Run a sync (supports dry_run) |
| drt_get_status | Get last run result(s) |
| drt_validate | Validate sync YAML configs |
| drt_get_schema | Return JSON Schema for config files |
| drt_list_connectors | List available sources and destinations |
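An MCP client invokes these tools via a JSON-RPC tools/call request. The request shape below follows the MCP specification; the tool name comes from the table above, while the argument keys (name, dry_run) are assumptions inferred from the table, not a documented schema:

```python
import json

# A JSON-RPC "tools/call" request as an MCP client would send it to
# the `drt mcp run` server. Argument keys are illustrative assumptions.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "drt_run_sync",
        "arguments": {"name": "post_users", "dry_run": True},
    },
}
wire = json.dumps(request)  # serialized for the MCP transport
```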

AI Skills for Claude Code

Install the official Claude Code skills to generate YAML, debug failures, and migrate from other tools — all from the chat interface.

Install via Plugin Marketplace (recommended)

/plugin marketplace add drt-hub/drt
/plugin install drt@drt-hub

Tip: Enable auto-update so you always get the latest skills when drt is updated: /plugin → Marketplaces → drt-hub → Enable auto-update

Manual install (slash commands)

Copy the files from .claude/commands/ into your drt project's .claude/commands/ directory.

| Skill | Trigger | What it does |
| --- | --- | --- |
| /drt-create-sync | "create a sync" | Generates valid sync YAML from your intent |
| /drt-debug | "sync failed" | Diagnoses errors and suggests fixes |
| /drt-init | "set up drt" | Guides through project initialization |
| /drt-migrate | "migrate from Census" | Converts existing configs to drt YAML |

Connectors

Sources

| Connector | Status | Install | Auth |
| --- | --- | --- | --- |
| BigQuery | ✅ v0.1 | pip install drt-core[bigquery] | Application Default / Service Account Keyfile |
| DuckDB | ✅ v0.1 | (core) | File path |
| PostgreSQL | ✅ v0.1 | pip install drt-core[postgres] | Password (env var) |
| Snowflake | ✅ v0.5 | pip install drt-core[snowflake] | Password (env var) |
| SQLite | ✅ v0.4.2 | (core) | File path |
| Redshift | ✅ v0.3.4 | pip install drt-core[redshift] | Password (env var) |
| ClickHouse | ✅ v0.4.3 | pip install drt-core[clickhouse] | Password (env var) |
| MySQL | ✅ v0.5 | pip install drt-core[mysql] | Password (env var) |
| Databricks | ✅ v0.6 | pip install drt-core[databricks] | Access Token (env var) |
| SQL Server | ✅ v0.6 | pip install drt-core[sqlserver] | Password (env var) |

Destinations

| Connector | Status | Install | Auth |
| --- | --- | --- | --- |
| REST API | ✅ v0.1 | (core) | Bearer / API Key / Basic / OAuth2 |
| Slack Incoming Webhook | ✅ v0.1 | (core) | Webhook URL |
| Discord Webhook | ✅ v0.4.2 | (core) | Webhook URL |
| GitHub Actions | ✅ v0.1 | (core) | Token (env var) |
| HubSpot | ✅ v0.1 | (core) | Token (env var) |
| Google Ads | ✅ v0.6 | (core) | OAuth2 Client Credentials |
| Google Sheets | ✅ v0.4 | pip install drt-core[sheets] | Service Account Keyfile |
| PostgreSQL (upsert) | ✅ v0.4 | pip install drt-core[postgres] | Password (env var) |
| MySQL (upsert) | ✅ v0.4 | pip install drt-core[mysql] | Password (env var) |
| ClickHouse | ✅ v0.5 | pip install drt-core[clickhouse] | Password (env var) |
| Parquet file | ✅ v0.5 | pip install drt-core[parquet] | File path |
| Microsoft Teams Webhook | ✅ v0.5 | (core) | Webhook URL |
| CSV / JSON / JSONL file | ✅ v0.5 | (core) | File path |
| Jira | ✅ v0.5 | (core) | Basic (email + API token) |
| Linear | ✅ v0.5 | (core) | API Key (env var) |
| SendGrid | ✅ v0.5 | (core) | API Key (env var) |
| Notion | ✅ v0.6 | (core) | Bearer Token (env var) |
| Twilio SMS | ✅ v0.6 | (core) | Basic (Account SID + Auth Token) |
| Intercom | ✅ v0.6 | (core) | Bearer Token (env var) |
| Email SMTP | ✅ v0.6 | (core) | Username / Password (env var) |
| Salesforce Bulk API 2.0 | ✅ v0.6 | (core) | OAuth2 (username-password) |
| Staged Upload | ✅ v0.6 | (core) | Configurable per provider |
| Snowflake | ✅ v0.7 | pip install drt-core[snowflake] | Password (env var) |

Integrations

| Connector | Status | Install |
| --- | --- | --- |
| Dagster | ✅ v0.4 | pip install dagster-drt |
| Prefect | ✅ v0.6 | (core) |
| Airflow | ✅ v0.6 | (core) |
| dbt manifest reader | ✅ v0.4 | (core) |

Roadmap

Upcoming releases → ROADMAP.md (scope, themes, targets)
Issue-level tracking → GitHub Milestones
Looking to contribute? → Good First Issues

Shipped:

| Version | Focus |
| --- | --- |
| v0.1 | BigQuery / DuckDB / Postgres sources · REST API / Slack / GitHub Actions / HubSpot destinations · CLI · dry-run |
| v0.2 | Incremental sync (cursor_field watermark) · retry config per-sync |
| v0.3 | MCP Server (drt mcp run) · AI Skills for Claude Code · LLM-readable docs · row-level errors · security hardening · Redshift source |
| v0.4 | Google Sheets / PostgreSQL / MySQL destinations · dagster-drt · dbt manifest reader · type safety overhaul |
| v0.5 | Snowflake / MySQL sources · ClickHouse / Parquet / Teams / CSV+JSON / Jira / Linear / SendGrid destinations · drt test · --output json · --profile · ${VAR} substitution · dbt manifest · secrets.toml · Docker |
| v0.5.4 | destination_lookup — resolve FK values by querying destination DB during sync (MySQL / Postgres / ClickHouse) |
| v0.6 | Databricks / SQL Server sources · Notion / Twilio / Intercom / Email SMTP / Salesforce Bulk / Staged Upload destinations · Airflow / Prefect integrations · drt serve · drt sources / drt destinations · --threads parallel execution · --log-format json · --cursor-value · watermark.default_value · test validators (freshness, unique, accepted_values) · JSON Schema validation · GOVERNANCE.md |
| v0.7 | Production Ready — graceful shutdown on SIGTERM/SIGINT · per-destination retry override · sync execution history · zero-downtime atomic table swap · json_columns config · FK existence check (lookups.check_only) · Slack/webhook failure alerts · drt doctor · --quiet flag · drt test --output json / --dry-run · Snowflake destination · GitHub Codespaces playground · OPEN_CORE.md |
| v0.7.1 | drt run --dry-run --diff for record-level preview · tz-aware cursor stringification fix · on_error=fail alignment for Notion / REST API / Email SMTP · VERSIONING.md semver & deprecation policy |
| v0.7.2 | Opt-in anonymous telemetry (PostHog Cloud EU, off by default, allow-list payload, DO_NOT_TRACK honored) · deprecation warnings in drt validate · Postgres destination psycopg2.sql SQL composition |

Next: v0.8 Cloud Destinations & Growth → v0.9 Enterprise Foundation → v1.0 Stable Release → v1.x Rust Engine


Orchestration: dagster-drt

Community-maintained Dagster integration. Expose drt syncs as Dagster assets with full observability.

pip install dagster-drt

from dagster import AssetExecutionContext, Definitions
from dagster_drt import drt_assets, DagsterDrtResource

@drt_assets(project_dir="path/to/drt-project")
def my_syncs(context: AssetExecutionContext, drt: DagsterDrtResource):
    yield from drt.run(context=context)

defs = Definitions(
    assets=[my_syncs],
    resources={"drt": DagsterDrtResource(project_dir="path/to/drt-project")},
)

See dagster-drt README for full API docs (Translator, Pipes support, DrtConfig dry-run, MaterializeResult).


Ecosystem

drt is designed to work alongside, not against, the modern data stack:

drt ecosystem — dlt load, dbt transform, drt activate


Telemetry

drt collects no telemetry by default. Opting in helps us understand which sources / destinations / sync modes are actually used, so we can prioritise.

drt config set telemetry.enabled true     # opt in
drt config show-telemetry                 # preview the exact payload that would be sent
drt config set telemetry.enabled false    # opt out
DO_NOT_TRACK=1 drt run                    # universal kill switch — overrides everything

When opted in, drt sends one sync_completed event per sync. The only properties we collect are these 9 fields: drt_version, python_version, os, source_type, destination_type, sync_mode, rows_synced, duration_seconds, status. The wire envelope additionally carries event, distinct_id (a per-machine random UUID at ~/.drt/.anonymous_id), timestamp, and api_key.

Sync names, model SQL, destination URLs, credentials, and project paths are never transmitted — the allow-list is enforced at the function-signature level in drt/telemetry.py. By default events go to PostHog Cloud (EU region); override with DRT_TELEMETRY_ENDPOINT and DRT_TELEMETRY_API_KEY for self-hosted PostHog or a custom collector.
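Function-signature allow-listing means the event builder accepts only the nine permitted fields as keyword arguments, so a caller cannot smuggle anything else into the payload. A minimal sketch of the pattern (illustrative, not drt's actual code in drt/telemetry.py):

```python
def capture_sync_completed(
    *,
    drt_version: str,
    python_version: str,
    os: str,
    source_type: str,
    destination_type: str,
    sync_mode: str,
    rows_synced: int,
    duration_seconds: float,
    status: str,
) -> dict:
    """Build the event payload. Because only these nine keyword-only
    parameters exist, passing a sync name, SQL, URL, or credential
    raises a TypeError instead of being transmitted."""
    return dict(
        drt_version=drt_version, python_version=python_version, os=os,
        source_type=source_type, destination_type=destination_type,
        sync_mode=sync_mode, rows_synced=rows_synced,
        duration_seconds=duration_seconds, status=status,
    )
```

The allow-list is thus checked by the Python interpreter itself rather than by filtering code that could silently drift.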

Note: drt itself never transmits your IP, but the receiving PostHog backend records the TCP source IP as $ip. See docs/telemetry.md for details and how to disable / substitute the backend.

For full details see docs/telemetry.md.

Contributing

We welcome contributions of all sizes — from typo fixes to new connectors. drt has a transparent contributor ladder so your work builds toward greater trust and responsibility over time.

Contributors ✨

Thanks goes to these wonderful people (emoji key):

K.Masuda 💻
Moavia Amir 💻
Khush Domadiya 💻
Pawan Singh Kapkoti 💻
PFCAaron12 💻
armorbreak001 💻
pureqin 💻
Wahaj Ahmed 💻
cian-ps 💻
Erik Estrella 💻
Ai (藍) 💻
GokulKashyap 💻 ⚠️

Add your contributions

Disclaimer

drt is an independent open-source project and is not affiliated with, endorsed by, or sponsored by dbt Labs, dlt-hub, or any other company.

"dbt" is a registered trademark of dbt Labs, Inc. "dlt" is a project maintained by dlt-hub.

drt is designed to complement these tools as part of the modern data stack, but is a separate project with its own codebase and maintainers.

License

Apache 2.0 — see LICENSE.

Download files

Download the file for your platform.

Source Distribution

drt_core-0.7.2.tar.gz (1.6 MB)

Uploaded Source

Built Distribution


drt_core-0.7.2-py3-none-any.whl (152.9 kB)

Uploaded Python 3

File details

Details for the file drt_core-0.7.2.tar.gz.

File metadata

  • Download URL: drt_core-0.7.2.tar.gz
  • Size: 1.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for drt_core-0.7.2.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | da0e6e331af2068a8d4199d7be9c87d4659a2ca02630f9141de30af2bbb92db1 |
| MD5 | 30451e6067ed36a1e23a17f88cba106e |
| BLAKE2b-256 | 393f88447f5dbe78cbb4afd2f998fa4c22b1ad67a470e835e6d19ef0e8576f9c |


Provenance

The following attestation bundles were made for drt_core-0.7.2.tar.gz:

Publisher: publish-drt-core.yml on drt-hub/drt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file drt_core-0.7.2-py3-none-any.whl.

File metadata

  • Download URL: drt_core-0.7.2-py3-none-any.whl
  • Size: 152.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for drt_core-0.7.2-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | cf7be22a9209a1815088383d2f262ab0ada2dce2e5d62c7d1e14d5e2c50375d7 |
| MD5 | 0fbee74e648120d6e35cd13054661b4f |
| BLAKE2b-256 | aaa5bebc53974c9f9f0efd1a146c695247df5c36e995a7f282412c7970ddb148 |


Provenance

The following attestation bundles were made for drt_core-0.7.2-py3-none-any.whl:

Publisher: publish-drt-core.yml on drt-hub/drt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
