Generate beautiful, interactive column-level lineage for dbt projects
miswag-dbt-lineage
Generate beautiful, interactive column-level lineage for your dbt projects
miswag-dbt-lineage is a lightweight, dbt-native tool that generates a static website with interactive column-level lineage visualization. No backend, no servers: just beautiful, deployable lineage documentation.
Features
- Column-level lineage: trace data flow through transformations
- Table-level lineage: visualize model dependencies
- Interactive visualization: pan, zoom, and explore your data pipelines
- Static output: deploy to S3, GCS, GitHub Pages, or any static host
- dbt-native: works with your existing dbt artifacts (no code changes needed)
- Fast: handles 1,000+ models and 10,000+ columns
- Beautiful UI: dark theme, color-coded layers, transformation indicators
What It Does
- Reads your dbt artifacts (manifest.json, catalog.json)
- Extracts column-level lineage using SQL parsing (powered by sqlglot)
- Generates a static website with an interactive lineage explorer
- Deploys anywhere: S3, GCS, Azure Blob, GitHub Pages, etc.
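To make the first step concrete, here is a minimal sketch of reading the two artifacts with the standard library. It is not the tool's own code: the function names `load_artifact` and `summarize_models` are hypothetical, and the sketch only relies on the `nodes`, `resource_type`, and `columns` keys found in dbt's manifest.json and catalog.json.

```python
import json
from pathlib import Path


def load_artifact(path):
    """Load one dbt artifact (manifest.json or catalog.json) as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_models(manifest, catalog):
    """Return {model unique_id: sorted lowercase column names}.

    Columns come from the catalog when the model appears there,
    otherwise from the columns documented in the manifest.
    (Hypothetical helper for illustration only.)
    """
    summary = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, etc.
        catalog_node = catalog.get("nodes", {}).get(unique_id, {})
        columns = catalog_node.get("columns") or node.get("columns", {})
        summary[unique_id] = sorted(name.lower() for name in columns)
    return summary


if __name__ == "__main__":
    manifest = load_artifact("target/manifest.json")
    catalog = load_artifact("target/catalog.json")
    for model, columns in summarize_models(manifest, catalog).items():
        print(model, columns)
```

Preferring the catalog here is one reason the catalog is worth supplying: it lists the columns that exist in the warehouse, while the manifest only knows the columns documented in YAML.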
Installation
pip install miswag-dbt-lineage
Or install from source:
git clone https://github.com/hameeddataeng/miswag-dbt-lineage.git
cd miswag-dbt-lineage
pip install -e .
Quick Start
Basic Usage
# Navigate to your dbt project
cd my-dbt-project
# Generate lineage site (output defaults to target/lineage_website)
miswag-dbt-lineage generate \
--manifest target/manifest.json \
--catalog target/catalog.json
All-in-One Build
# Runs 'dbt docs generate' + generates lineage site (output defaults to target/lineage_website)
miswag-dbt-lineage build
View Locally
cd target/lineage_website
python -m http.server 8080
# Open http://localhost:8080
Usage
Commands
generate: Generate lineage site from artifacts
miswag-dbt-lineage generate [OPTIONS]
Options:
- --manifest, -m PATH: Path to manifest.json (default: target/manifest.json)
- --catalog, -c PATH: Path to catalog.json (optional but recommended)
- --output, -o PATH: Output directory (default: target/lineage_website)
- --dialect, -d TEXT: SQL dialect, such as clickhouse, postgres, snowflake, or bigquery (default: clickhouse)
- --verbose: Enable verbose logging
- --help: Show help
Example:
miswag-dbt-lineage generate \
--manifest target/manifest.json \
--catalog target/catalog.json \
--output docs/lineage \
--dialect snowflake
build: Build lineage (runs dbt docs + generate)
miswag-dbt-lineage build [OPTIONS]
Options:
- --project-dir, -p PATH: dbt project directory (default: .)
- --output, -o PATH: Output directory (default: target/lineage_website)
- --skip-dbt-docs: Skip running dbt docs generate
- --dialect, -d TEXT: SQL dialect (default: clickhouse)
- --help: Show help
Example:
miswag-dbt-lineage build --dialect postgres
Supported SQL Dialects
- clickhouse (default)
- postgres
- snowflake
- bigquery
- redshift
- databricks
- mysql
- tsql (SQL Server)
- And more: see sqlglot docs
Deployment
The generated site is a fully static collection of HTML/CSS/JS files. Deploy it anywhere:
AWS S3
aws s3 sync target/lineage_website s3://my-bucket/lineage-docs/
aws s3 website s3://my-bucket --index-document index.html
Google Cloud Storage
gsutil -m rsync -r target/lineage_website gs://my-bucket/lineage-docs/
gsutil web set -m index.html gs://my-bucket
Azure Blob Storage
az storage blob upload-batch \
--account-name mystorageaccount \
--destination '$web' \
--source target/lineage_website
GitHub Pages
# Push to gh-pages branch
cd target/lineage_website
git init
git checkout -b gh-pages
git add .
git commit -m "Deploy lineage site"
git remote add origin https://github.com/your-org/your-repo.git
git push -f origin gh-pages
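Before uploading, it can help to confirm that the output directory looks complete. The sketch below is a hypothetical pre-deploy check, not part of the tool: it assumes the site consists of an index.html file and a data/ directory, as the architecture overview in this README suggests, and the helper name `missing_site_entries` is made up for illustration.

```python
import tempfile
from pathlib import Path

# Assumed layout of the generated site (index.html plus a data/ directory).
EXPECTED_ENTRIES = ("index.html", "data")


def missing_site_entries(site_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are absent from site_dir."""
    root = Path(site_dir)
    return [name for name in expected if not (root / name).exists()]


# Demo against a throwaway directory that only contains index.html.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "index.html").write_text("<html></html>", encoding="utf-8")
    print(missing_site_entries(site))  # prints ['data']
```

In a CI job, a non-empty result for target/lineage_website would be a reasonable signal to stop before running any of the upload commands above.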
Features Walkthrough
Table Lineage
- Visualize upstream & downstream model dependencies
- Color-coded layers (source, staging, intermediate, mart, seed)
- Click any model to see its lineage
- Inline model metadata (layer, materialization, columns, tests, deps)
- Adjustable depth (1-5 levels)
Column Lineage
- Trace column-to-column data flow
- Transformation type indicators (DIRECT, RENAMED, FUNCTION, CASE, AGG, CALC)
- Color-coded edges for transformation types
- Inline column metadata (name, type, model, transformation SQL)
- Click any column to pivot to its lineage
- Adjustable depth (1-5 levels)
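The adjustable depth can be thought of as a depth-limited graph traversal. The following is a minimal sketch of that idea using a breadth-first search over column edges; it is an illustration under assumptions, not the explorer's actual implementation, and the function name and the example column names are hypothetical.

```python
from collections import deque


def upstream_within_depth(edges, start, max_depth):
    """Return {node: hops} for nodes at most max_depth hops upstream of start.

    edges is an iterable of (source, target) pairs, where data flows
    from source into target.
    """
    parents = {}
    for source, target in edges:
        parents.setdefault(target, []).append(source)

    distances = {}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached, do not expand further
        for parent in parents.get(node, []):
            if parent not in distances and parent != start:
                distances[parent] = depth + 1
                queue.append((parent, depth + 1))
    return distances


edges = [
    ("raw_orders.amount", "stg_orders.amount"),
    ("stg_orders.amount", "int_orders.amount_usd"),
    ("int_orders.amount_usd", "fct_revenue.total_revenue"),
]
print(upstream_within_depth(edges, "fct_revenue.total_revenue", 2))
# prints {'int_orders.amount_usd': 1, 'stg_orders.amount': 2}
```

Breadth-first order guarantees that each column is recorded at its shortest distance from the selected column, which is what a depth slider needs. Downstream traversal is the same search with the edge direction reversed.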
Catalog Views
- Models: browse all models with metadata
- Sources: view all data sources
- Tests: see all data quality tests
- Search and filter by layer, directory, etc.
How It Works
Architecture
dbt artifacts → SQL parsing → Lineage graph → Static website
      ↓              ↓              ↓               ↓
manifest.json     sqlglot      lineage.json     index.html
catalog.json                                    + data/
Lineage Resolution
- Read dbt artifacts: Parse manifest.json and catalog.json
- Extract dependencies: Identify model-to-model relationships
- Parse compiled SQL: Use sqlglot to analyze SELECT statements
- Resolve columns: Match columns across CTEs, aliases, and transformations
- Classify transformations: Detect aggregations, functions, CASE expressions, etc.
- Generate graph: Build node/edge graph with metadata
- Create static site: Bundle HTML + JSON for deployment
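As an illustration of the dependency-extraction and graph-building steps, the sketch below derives table-level edges from the `depends_on.nodes` lists in a manifest. The function name `build_table_graph` and the returned dictionary shape are assumptions made for this example; they are not the tool's internal API or the schema of its lineage.json.

```python
def build_table_graph(manifest):
    """Build a small node/edge graph from a parsed manifest.json dict.

    Returns {"nodes": {unique_id: metadata}, "edges": [(upstream, downstream)]}.
    (Hypothetical structure for illustration only.)
    """
    nodes = {}
    edges = set()
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") not in {"model", "seed", "snapshot"}:
            continue  # tests and other resources are not lineage nodes here
        nodes[unique_id] = {
            "materialized": node.get("config", {}).get("materialized"),
        }
        for parent in node.get("depends_on", {}).get("nodes", []):
            # Parents such as sources live outside manifest["nodes"], so
            # register a placeholder entry if we have not seen them yet.
            nodes.setdefault(parent, {"materialized": None})
            edges.add((parent, unique_id))
    return {"nodes": nodes, "edges": sorted(edges)}


example_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "config": {"materialized": "view"},
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.fct_revenue": {
            "resource_type": "model",
            "config": {"materialized": "table"},
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
    }
}
for upstream, downstream in build_table_graph(example_manifest)["edges"]:
    print(upstream, "->", downstream)
```

Column-level edges need the SQL parsing steps as well, because the manifest records which models depend on each other but not which columns feed which.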
Configuration
Layer Classification
By default, models are classified into layers based on naming conventions:
- source: source.*
- staging: .stg_, staging
- intermediate: .int_, intermediate
- mart: .mart, .fct_, .dim_, marts
- seed: seed.*
You can customize this in the extractor code (miswag_dbt_lineage/extractor.py).
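The naming rules above can be sketched as a small lookup. This is an assumption-laden illustration rather than the contents of extractor.py: it guesses that the patterns are matched as substrings of the node's unique_id and file path, and the `classify_layer` name and the "other" fallback are hypothetical.

```python
# Substring patterns per layer, taken from the list above.
LAYER_RULES = [
    ("staging", (".stg_", "staging")),
    ("intermediate", (".int_", "intermediate")),
    ("mart", (".mart", ".fct_", ".dim_", "marts")),
]


def classify_layer(unique_id, path=""):
    """Classify a dbt node into a layer from its unique_id and file path."""
    if unique_id.startswith("source."):
        return "source"
    if unique_id.startswith("seed."):
        return "seed"
    haystack = f"{unique_id} {path}".lower()
    for layer, patterns in LAYER_RULES:
        if any(pattern in haystack for pattern in patterns):
            return layer
    return "other"  # hypothetical fallback for unmatched models


print(classify_layer("model.shop.stg_orders"))  # prints staging
print(classify_layer("model.shop.orders", "models/marts/orders.sql"))  # prints mart
```

Keeping the rules in an ordered list makes a project-specific convention, such as a `.rpt_` prefix for reporting models, a one-line addition.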
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Development Setup
# Clone repo
git clone https://github.com/hameeddataeng/miswag-dbt-lineage.git
cd miswag-dbt-lineage
# Install in editable mode with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Format code
black .
ruff check .
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Acknowledgments
- Built for the dbt community
- Powered by sqlglot for SQL parsing
- Inspired by dbt docs and various lineage visualization tools
Contact
- Author: Hameed Mahmood
- GitHub: hameeddataeng/miswag-dbt-lineage
- PyPI: miswag-dbt-lineage
- Issues: Report a bug
If you find this useful, please star the repo!