A simple tool to extract dbt column lineage

DBT Column Lineage

📖 Documentation | 🐛 Report Bug | 💡 Request Feature

Overview

DBT Column Lineage is a simple tool that helps you visualize and understand column-level data lineage in your dbt projects. It relies on dbt artifacts (the manifest and the catalog) and on parsing the compiled SQL, so before using it you must compile your project and run dbt docs generate to produce the catalog.

The tool offers several ways to view lineage:

  • Interactive Explorer: A local web server providing an interactive UI to explore model and column lineage visually. (Recommended)
  • DOT: Generates GraphViz dot files that can be rendered as static images.
  • Text: Simple console output showing upstream and downstream dependencies for a specific model or column.

[Figure: DBT Column Lineage Demo - Concept] (Note: the demo shows the lineage concept; the interactive explorer provides an enhanced UI.)

Installation

pip install dbt-col-lineage==0.3.0

Usage

First, ensure your dbt project is compiled and you have generated the catalog:

dbt compile
dbt docs generate
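
Since the tool needs both artifacts, it can help to confirm they exist before launching it. The following is a minimal sketch, assuming the artifacts live in dbt's default target directory; check_artifacts is a hypothetical helper name, not something shipped with dbt-col-lineage.

```shell
# Hypothetical helper (not part of dbt-col-lineage): verify that both
# dbt artifacts exist in a target directory (default: ./target).
check_artifacts() {
    target_dir="${1:-target}"
    missing=0
    for f in manifest.json catalog.json; do
        if [ ! -f "$target_dir/$f" ]; then
            # Report each missing artifact on stderr.
            echo "missing: $target_dir/$f" >&2
            missing=1
        fi
    done
    # Returns 0 when both files are present, 1 otherwise.
    return "$missing"
}
```

A call such as check_artifacts target can then guard the lineage command, so a missing catalog is reported up front rather than surfacing as a tool error.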

1. Interactive Exploration (Recommended)

To start the interactive lineage explorer:

dbt-col-lineage --explore \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --port 8080  # Optional port selection

This starts a local server on the chosen port (8000 if --port is omitted). Open the server's address in your web browser (e.g., http://127.0.0.1:8080 for the command above). You can then select models and columns from the sidebar to visualize their lineage directly in the UI.

2. Static Output (Text or DOT)

To generate lineage for a specific model or column directly in the terminal or as a DOT file:

dbt-col-lineage --select '[+]model_name[.column_name][+]' \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --format [text|dot] \
    --output filename

Examples:

# Downstream lineage for stg_transactions.amount in text format
dbt-col-lineage --select stg_transactions.amount+ --format text

# Upstream lineage for stg_accounts.id as a DOT file
dbt-col-lineage --select +stg_accounts.id --format dot --output upstream_account_id

# Both directions for stg_orders in text format
dbt-col-lineage --select stg_orders --format text
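
A generated DOT file can be turned into a static image with the Graphviz dot command. The sketch below assumes Graphviz is installed and that you pass the path of the file the tool actually wrote; render_dot is a hypothetical helper name, and the exact output file name produced by dbt-col-lineage is not assumed here.

```shell
# Hypothetical helper (not part of dbt-col-lineage): render a DOT file
# to PNG using the Graphviz `dot` command.
render_dot() {
    dot_file="$1"
    # Default output path: same name with .dot replaced by .png.
    png_file="${2:-${dot_file%.dot}.png}"
    if [ ! -f "$dot_file" ]; then
        echo "no such DOT file: $dot_file" >&2
        return 1
    fi
    if ! command -v dot >/dev/null 2>&1; then
        echo "Graphviz 'dot' not found on PATH" >&2
        return 1
    fi
    dot -Tpng "$dot_file" -o "$png_file"
}
```

For example, render_dot path/to/file.dot would write path/to/file.png next to the input.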

Options

  • --explore: Starts the interactive web server for exploring lineage. Cannot be used with --select or --format.
  • --select: Specify model/column for static analysis using the format [+]model_name[.column_name][+]. Cannot be used with --explore.
    • Add + suffix for downstream lineage (e.g., stg_accounts.id+)
    • Add + prefix for upstream lineage (e.g., +stg_accounts.id)
    • No + for both directions (e.g., stg_accounts.id or stg_accounts for model-level)
  • --catalog: Path to the dbt catalog file (default: target/catalog.json)
  • --manifest: Path to the dbt manifest file (default: target/manifest.json)
  • --format, -f: Output format for static analysis (text or dot). Not used with --explore. (default: text)
  • --output, -o: Output filename for dot format (without extension, default: lineage). Not used with --explore.
  • --port, -p: Port for the interactive web server when using --explore (default: 8000).
  • --adapter: Override the SQL dialect used by the parser (sqlglot dialect name, e.g., tsql, snowflake, bigquery). When provided, this overrides the adapter detected from the dbt manifest.
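
The --select syntax described above can be illustrated with a small shell function that splits a selector into its parts. This is only a sketch of the documented grammar ([+]model_name[.column_name][+]); describe_selector is a hypothetical name and does not reflect how the tool parses selectors internally.

```shell
# Hypothetical illustration of the selector grammar; not the tool's parser.
describe_selector() {
    sel="$1"
    upstream=no
    downstream=no
    # A leading + requests upstream lineage.
    case "$sel" in +*) upstream=yes; sel="${sel#+}" ;; esac
    # A trailing + requests downstream lineage.
    case "$sel" in *+) downstream=yes; sel="${sel%+}" ;; esac
    # No + on either side means both directions.
    if [ "$upstream" = no ] && [ "$downstream" = no ]; then
        upstream=yes
        downstream=yes
    fi
    # Everything before the first dot is the model; the rest is the column.
    model="${sel%%.*}"
    column=""
    case "$sel" in *.*) column="${sel#*.}" ;; esac
    echo "model=$model column=${column:--} upstream=$upstream downstream=$downstream"
}
```

For instance, describe_selector +stg_accounts.id prints model=stg_accounts column=id upstream=yes downstream=no, and a bare model name such as stg_orders reports both directions with a "-" placeholder for the column.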

Limitations

  • Python models are not supported
  • Some functions and syntax cannot be parsed, in which case the affected models are skipped

Compatibility

The tool has been tested with the following dbt adapters:

  • Snowflake
  • SQLite
  • DuckDB
  • MS SQLServer / TSQL

License

This project is licensed under the MIT License - see the LICENSE file for details.
