Push dbt lineage to Databricks Unity Catalog
The Problem
Unity Catalog automatically captures lineage for transformations that run inside Databricks. But it can't see:
- Where data comes from — SAP, Salesforce, PostgreSQL, APIs, etc.
- Where data goes — Power BI dashboards, Tableau reports, applications, etc.
You're left with a gap in your lineage view:
[???] → Bronze → Silver → Gold → [???]
The Solution
dbt already knows this information:
- Sources define upstream systems
- Exposures define downstream consumers
dbt-unity-lineage reads your dbt metadata and pushes it to Unity Catalog:
[SAP] → Bronze → Silver → Gold → [Power BI]
  ↑                                  ↑
  └──── dbt-unity-lineage pushes ────┘
Installation
pip install dbt-unity-lineage
Quick Start
1. Create a config file
# dbt_unity_lineage.yml
version: 1
source_systems:
  sap_ecc:
    system_type: SAP
    description: SAP ECC Production
  salesforce_prod:
    system_type: Salesforce
    description: Salesforce Sales Cloud
source_paths:
  - bronze_erp
  - bronze_crm
2. Tag your sources
# models/bronze_erp/_sources.yml
sources:
  - name: erp
    meta:
      uc_source: sap_ecc  # ← Just this tag
    tables:
      - name: gl_accounts
      - name: cost_centers
3. Push to Unity Catalog
dbt build
dbt-unity-lineage push
That's it. Check your lineage in Databricks Catalog Explorer.
Exposures: Zero Config
Exposures are read directly from manifest.json. No additional configuration needed.
# models/marts/exposures.yml
exposures:
  - name: executive_dashboard
    type: dashboard
    url: https://app.powerbi.com/groups/abc/reports/xyz
    depends_on:
      - ref('fct_orders')
The tool automatically:
- Infers system_type: POWER_BI from the URL
- Creates external metadata in Unity Catalog
- Links it to your gold tables
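To give a feel for what "read directly from manifest.json" involves, the Python sketch below walks the `exposures` mapping of an already-parsed manifest and collects each exposure's name, type, URL, and upstream nodes. The function name `list_exposures` and the trimmed sample manifest are hypothetical and only for illustration; this is not the tool's actual code.

```python
import json


def list_exposures(manifest: dict) -> list:
    """Collect name, type, URL, and upstream node IDs for each exposure."""
    results = []
    for exposure in manifest.get("exposures", {}).values():
        results.append({
            "name": exposure["name"],
            "type": exposure.get("type"),
            "url": exposure.get("url"),
            # dbt resolves ref() calls into node IDs under depends_on.nodes
            "upstream": list(exposure.get("depends_on", {}).get("nodes", [])),
        })
    return results


# Hypothetical, heavily trimmed manifest.json content
SAMPLE_MANIFEST = json.loads("""
{
  "exposures": {
    "exposure.my_project.executive_dashboard": {
      "name": "executive_dashboard",
      "type": "dashboard",
      "url": "https://app.powerbi.com/groups/abc/reports/xyz",
      "depends_on": {"nodes": ["model.my_project.fct_orders"]}
    }
  }
}
""")

for item in list_exposures(SAMPLE_MANIFEST):
    print(item["name"], "->", item["upstream"])
```

Because everything needed is already in the manifest, a reader like this needs no extra configuration, which matches the zero-config behavior described here.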
CLI Commands
# Push sources and exposures to Unity Catalog
dbt-unity-lineage push
# Preview changes without executing
dbt-unity-lineage push --dry-run
# Show current status (local vs remote)
dbt-unity-lineage status
# Show status in markdown (great for CI/CD)
dbt-unity-lineage status --format md
# Remove orphaned objects
dbt-unity-lineage clean
dbt Cloud Integration
Fetch the manifest directly from dbt Cloud instead of reading a local file:
# Using job ID (fetches latest successful run)
dbt-unity-lineage push \
  --dbt-cloud \
  --dbt-cloud-account-id 12345 \
  --dbt-cloud-job-id 67890
# Using run ID (fetches from specific run)
dbt-unity-lineage push \
  --dbt-cloud \
  --dbt-cloud-run-id 98765
# With environment variables
export DBT_CLOUD_TOKEN=dbtu_xxx
export DBT_CLOUD_ACCOUNT_ID=12345
dbt-unity-lineage push --dbt-cloud --dbt-cloud-job-id 67890
Global Options
--config PATH # Path to dbt_unity_lineage.yml
--manifest PATH # Path to manifest.json
--project-dir PATH # Path to dbt project directory
--profile NAME # dbt profile name
--target NAME # dbt target name
--verbose # Enable verbose output
--quiet # Suppress non-essential output
--claude # Output Claude AI context (CLAUDE.md)
Claude AI Context
Output version-matched context for Claude AI to understand your dbt-unity-lineage setup:
# Append to your project's CLAUDE.md
dbt-unity-lineage --claude >> CLAUDE.md
# Or to a .claude directory
dbt-unity-lineage --claude >> .claude/CLAUDE.md
This fetches the CLAUDE.md file from GitHub matching your installed version, providing Claude with context about available commands, configuration options, and common patterns.
Configuration Reference
dbt_unity_lineage.yml
version: 1
# Define your source systems
source_systems:
  sap_ecc:
    system_type: SAP                 # Required: UC system type
    entity_type: table               # Optional: defaults to "table"
    description: SAP ECC Production  # Optional
    url: https://sap.example.com     # Optional
    owner: erp-team@example.com      # Optional
    properties:                      # Optional: custom properties
      environment: production
# Folders to scan for sources (relative to models/)
source_paths:
  - bronze_erp
  - bronze_crm
# Optional settings
settings:
  batch_size: 50  # API batch size
  strict: false   # If true, error on unmapped sources
Source Tagging
In your sources.yml or schema.yml:
sources:
  - name: erp
    meta:
      uc_source: sap_ecc  # References source_systems key
    tables:
      - name: gl_accounts
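The link between a `uc_source` tag and a `source_systems` key can be pictured with the Python sketch below. It works on plain dicts shaped like the YAML shown here. The function name `resolve_sources` is hypothetical, and silently skipping unmapped sources when `strict` is off is an assumption about the non-strict behavior, not something this README confirms.

```python
def resolve_sources(sources, source_systems, strict=False):
    """Map each tagged source table to its configured source system.

    sources: list of dicts shaped like entries in a dbt sources file.
    source_systems: dict shaped like the source_systems config block.
    strict: if True, raise on sources whose uc_source is missing or unknown.
    """
    resolved = []
    for source in sources:
        key = source.get("meta", {}).get("uc_source")
        system = source_systems.get(key) if key else None
        if system is None:
            if strict:
                raise ValueError(
                    f"source '{source['name']}' is not mapped to a source system"
                )
            continue  # assumed non-strict behavior: skip unmapped sources
        for table in source.get("tables", []):
            resolved.append({
                "source": source["name"],
                "table": table["name"],
                "system_key": key,
                "system_type": system["system_type"],
            })
    return resolved


# Illustrative inputs mirroring the examples in this README
SOURCE_SYSTEMS = {"sap_ecc": {"system_type": "SAP"}}
SOURCES = [
    {"name": "erp", "meta": {"uc_source": "sap_ecc"},
     "tables": [{"name": "gl_accounts"}]},
    {"name": "untagged", "tables": [{"name": "misc"}]},
]

for row in resolve_sources(SOURCES, SOURCE_SYSTEMS):
    print(row["source"], row["table"], row["system_type"])
```

Calling `resolve_sources(SOURCES, SOURCE_SYSTEMS, strict=True)` on the same inputs raises a `ValueError` for the untagged source, which is the kind of failure the `strict` setting is meant to surface.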
Exposure Overrides
Exposures work automatically, but you can override the system type:
exposures:
  - name: my_dashboard
    type: dashboard
    url: https://custom-bi-tool.example.com/dashboard/123
    meta:
      uc_system_type: CUSTOM  # Override auto-detection
Supported System Types
The tool normalizes common variations and supports all Unity Catalog system types:
| Input | Normalized |
|---|---|
| sap, sap_ecc, sap_hana | SAP |
| salesforce, sfdc | SALESFORCE |
| postgresql, postgres | POSTGRESQL |
| sql_server, mssql | MICROSOFT_SQL_SERVER |
| bigquery, bq | GOOGLE_BIGQUERY |
| powerbi, power_bi | POWER_BI |
| (and more...) | |
Unknown values default to CUSTOM.
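The normalization rule can be sketched as a lookup with a fallback. The Python below uses only the aliases listed in this section; the real tool supports more. The function name `normalize_system_type` is hypothetical, and treating input as case-insensitive with hyphens and spaces folded to underscores is an assumption made for this sketch.

```python
# Alias table copied from the rows listed in this section (not exhaustive).
_ALIASES = {
    "sap": "SAP", "sap_ecc": "SAP", "sap_hana": "SAP",
    "salesforce": "SALESFORCE", "sfdc": "SALESFORCE",
    "postgresql": "POSTGRESQL", "postgres": "POSTGRESQL",
    "sql_server": "MICROSOFT_SQL_SERVER", "mssql": "MICROSOFT_SQL_SERVER",
    "bigquery": "GOOGLE_BIGQUERY", "bq": "GOOGLE_BIGQUERY",
    "powerbi": "POWER_BI", "power_bi": "POWER_BI",
}


def normalize_system_type(value: str) -> str:
    """Return the normalized system type, or CUSTOM for unknown input."""
    # Assumption: case, hyphens, and spaces are not significant.
    key = value.strip().lower().replace("-", "_").replace(" ", "_")
    return _ALIASES.get(key, "CUSTOM")


for raw in ["Salesforce", "mssql", "my_internal_tool"]:
    print(raw, "->", normalize_system_type(raw))
```

Falling back to `CUSTOM` instead of raising keeps a push from failing just because a system is not in the alias table.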
URL Auto-Detection
For exposures, system type is automatically detected from URLs:
| URL Contains | System Type |
|---|---|
| powerbi.com | POWER_BI |
| tableau.com | TABLEAU |
| looker.com | LOOKER |
| salesforce.com | SALESFORCE |
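A minimal Python sketch of URL-based detection follows. The function name `detect_system_type` and its `override` parameter (standing in for `meta.uc_system_type`) are hypothetical. Matching on the hostname rather than anywhere in the URL is a deliberate choice in this sketch, and returning `CUSTOM` when nothing matches is an assumption carried over from the system type defaults.

```python
from typing import Optional
from urllib.parse import urlparse

# Domain patterns taken from the table in this section.
_URL_PATTERNS = [
    ("powerbi.com", "POWER_BI"),
    ("tableau.com", "TABLEAU"),
    ("looker.com", "LOOKER"),
    ("salesforce.com", "SALESFORCE"),
]


def detect_system_type(url: str, override: Optional[str] = None) -> str:
    """Pick a system type for an exposure URL; an explicit override wins."""
    if override:
        return override
    host = (urlparse(url).hostname or "").lower()
    for domain, system_type in _URL_PATTERNS:
        # Match the domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return system_type
    return "CUSTOM"  # assumed fallback when no pattern matches


print(detect_system_type("https://app.powerbi.com/groups/abc/reports/xyz"))
print(detect_system_type("https://custom-bi-tool.example.com/dashboard/123"))
```

Checking the hostname avoids false positives from URLs that merely mention a vendor domain in their path or query string.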
CI/CD Integration
GitHub Actions
- name: Push lineage
  run: dbt-unity-lineage push --target prod
  env:
    DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
- name: Post status to job summary
  run: dbt-unity-lineage status --format md >> "$GITHUB_STEP_SUMMARY"
Status Output (Markdown)
## dbt-unity-lineage Status
| Source | System | Status |
|--------|--------|--------|
| sap_ecc.gl_accounts | SAP | ✅ In sync |
| workday.employees | Workday | 🆕 Create |
| Exposure | System | Status |
|----------|--------|--------|
| executive_dashboard | Power BI | ✅ In sync |
How It Works
Ownership Tracking
Every object created by this tool includes ownership metadata:
{
  "properties": {
    "managed_by": "dbt-unity-lineage",
    "dbt_project": "my_project"
  }
}
This ensures:
- Safe updates — Only modifies objects it created
- Multi-project support — Projects don't interfere with each other
- Clean removal — Orphaned objects are tracked and removable
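The ownership check behind these guarantees can be illustrated in Python. The property names `managed_by` and `dbt_project` come from the JSON in this section; the helper names `with_ownership` and `is_owned_by`, and the sample objects, are hypothetical and not the tool's actual code.

```python
MANAGED_BY = "dbt-unity-lineage"


def with_ownership(obj: dict, project: str) -> dict:
    """Return a copy of obj with the ownership properties merged in."""
    properties = dict(obj.get("properties") or {})
    properties.update({"managed_by": MANAGED_BY, "dbt_project": project})
    return {**obj, "properties": properties}


def is_owned_by(obj: dict, project: str) -> bool:
    """True only if obj was stamped by this tool for the given project."""
    properties = obj.get("properties") or {}
    return (
        properties.get("managed_by") == MANAGED_BY
        and properties.get("dbt_project") == project
    )


# Hypothetical objects as they might come back from a listing call
remote_objects = [
    with_ownership({"name": "sap_ecc.gl_accounts"}, "my_project"),
    with_ownership({"name": "crm.accounts"}, "other_project"),
    {"name": "hand_made_entry", "properties": {}},
]
mine = [o["name"] for o in remote_objects if is_owned_by(o, "my_project")]
print(mine)
```

Requiring both properties to match is what keeps two dbt projects that share a metastore from touching each other's objects.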
Idempotent Pushes
Run push as many times as you want:
- New objects are created
- Changed objects are updated
- Removed objects are deleted
- Objects from other tools/projects are ignored
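One way to picture an idempotent push is as a diff between the objects dbt describes and the objects already in Unity Catalog. The Python sketch below computes such a plan; `plan_push` and the sample data are hypothetical, and the comparison by plain equality is a simplification for illustration rather than the tool's actual logic.

```python
def plan_push(desired: dict, remote: dict, project: str) -> dict:
    """Compare desired objects (from dbt) with remote objects (in UC).

    Both arguments map object name -> object dict. Returns lists of names
    grouped by the action a push would take.
    """
    def owned(obj: dict) -> bool:
        props = obj.get("properties") or {}
        return (
            props.get("managed_by") == "dbt-unity-lineage"
            and props.get("dbt_project") == project
        )

    plan = {"create": [], "update": [], "delete": [], "unchanged": [], "ignored": []}
    for name, obj in desired.items():
        existing = remote.get(name)
        if existing is None:
            plan["create"].append(name)
        elif not owned(existing):
            plan["ignored"].append(name)  # never modify objects we do not own
        elif existing != obj:
            plan["update"].append(name)
        else:
            plan["unchanged"].append(name)
    for name, existing in remote.items():
        if name not in desired:
            # Only orphans created by this tool and project are removed.
            plan["delete" if owned(existing) else "ignored"].append(name)
    return plan


OWNED = {"managed_by": "dbt-unity-lineage", "dbt_project": "my_project"}
desired = {
    "sap_ecc.gl_accounts": {"system_type": "SAP", "properties": OWNED},
    "salesforce_prod.accounts": {"system_type": "SALESFORCE", "properties": OWNED},
}
remote = {
    "sap_ecc.gl_accounts": {"system_type": "SAP", "properties": OWNED},
    "sap_ecc.old_table": {"system_type": "SAP", "properties": OWNED},
    "manual.entry": {"system_type": "CUSTOM", "properties": {}},
}
print(plan_push(desired, remote, "my_project"))
```

Once the remote state matches the desired state, the same call yields empty `create`, `update`, and `delete` lists, which is what makes repeated pushes safe.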
Required Permissions
Your Databricks service principal needs:
| Permission | Scope | Purpose |
|---|---|---|
| CREATE EXTERNAL METADATA | Metastore | Create objects |
| MODIFY | External metadata | Update/delete |
Important Notes
Unity Catalog External Lineage is in Public Preview
As of January 2026, this feature is in Public Preview. The API may change. We'll track updates and maintain compatibility.
Profile Configuration
The tool reads connection details from your dbt profiles.yml:
my_project:
  target: prod
  outputs:
    prod:
      type: databricks
      host: dbc-abc123.cloud.databricks.com
      token: "{{ env_var('DATABRICKS_TOKEN') }}"
      catalog: main
Related Projects
Contributing
Contributions welcome! Please read our contributing guidelines.
License
Built with the belief that lineage shouldn't stop at your warehouse boundary.