A simple tool to extract dbt column lineage

DBT Column Lineage

Overview

DBT Column Lineage is a simple tool that helps you visualize and understand column-level data lineage in your dbt projects. It relies on two dbt artifacts (the manifest and the catalog) and on parsing the compiled SQL. For it to work as expected, you must compile your project and run dbt docs generate, which produces the catalog, before using the tool.

The tool offers several ways to view lineage:

  • Interactive Explorer: A local web server providing an interactive UI to explore model and column lineage visually. (Recommended)
  • DOT: Generates GraphViz dot files that can be rendered as static images.
  • Text: Simple console output showing upstream and downstream dependencies for a specific model or column.

[Image: DBT Column Lineage demo, concept view]

Note: the demo shows the lineage concept; the interactive explorer provides an enhanced UI.

Installation

pip install dbt-col-lineage==0.3.1

Usage

First, ensure your dbt project is compiled and you have generated the catalog:

dbt compile
dbt docs generate
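Before running the tool, it can help to confirm that both artifacts actually exist and parse as JSON. The following is a minimal sketch, not part of dbt-col-lineage: the function name check_artifacts is hypothetical, and the target directory is assumed to be dbt's default target/ folder, matching the default paths listed under Options.

```python
import json
from pathlib import Path


def check_artifacts(target_dir="target"):
    """Return a list of problems found with the dbt artifacts in target_dir.

    Hypothetical helper: it only checks that manifest.json and catalog.json
    exist and contain valid JSON. It does not validate their schema.
    """
    problems = []
    for name in ("manifest.json", "catalog.json"):
        path = Path(target_dir) / name
        if not path.is_file():
            problems.append(f"{path} is missing")
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{path} is not valid JSON: {exc}")
    return problems
```

An empty list means both files were found and parsed. A missing catalog.json usually means dbt docs generate has not been run yet.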

1. Interactive Exploration (Recommended)

To start the interactive lineage explorer:

dbt-col-lineage --explore \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --port 8080  # Optional port selection

This starts a local web server on the given port (8000 if --port is omitted). Open the server address in your web browser (http://127.0.0.1:8080 for the command above, http://127.0.0.1:8000 by default). You can then select models and columns from the sidebar to visualize their lineage directly in the UI.
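If you launch the explorer from a script, the command line and the matching URL can be built together so the port never gets out of sync. This is a sketch under assumptions: the helper names explore_command and explorer_url are hypothetical, the flags are the ones documented in this README, and the 127.0.0.1 host is taken from the example address above.

```python
def explore_command(manifest="target/manifest.json",
                    catalog="target/catalog.json",
                    port=8000):
    """Build the argument list for the interactive explorer.

    Uses only flags documented in this README; the defaults mirror
    the documented default paths and port.
    """
    return [
        "dbt-col-lineage", "--explore",
        "--manifest", manifest,
        "--catalog", catalog,
        "--port", str(port),
    ]


def explorer_url(port=8000, host="127.0.0.1"):
    """Return the address to open in the browser (host is an assumption)."""
    return f"http://{host}:{port}"
```

The list returned by explore_command(port=8080) can be passed to subprocess.run, and explorer_url(8080) gives the address to open once the server is up.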

2. Static Output (Text or DOT)

To generate lineage for a specific model or column directly in the terminal or as a DOT file:

dbt-col-lineage --select '[+]model_name[.column_name][+]' \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --format [text|dot] \
    --output filename

Examples:

# Downstream lineage for stg_transactions.amount in text format
dbt-col-lineage --select stg_transactions.amount+ --format text

# Upstream lineage for stg_accounts.id as a DOT file
dbt-col-lineage --select +stg_accounts.id --format dot --output upstream_account_id

# Both directions for stg_orders in text format
dbt-col-lineage --select stg_orders --format text
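A DOT file is only a graph description; rendering it as an image requires Graphviz. The sketch below wraps the standard Graphviz dot command. It assumes Graphviz is installed and that the tool has written a file such as lineage.dot (the default output name plus a .dot extension, which is an assumption about the tool's behavior). The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def render_command(dot_file, image_format="png"):
    """Build the Graphviz command that renders dot_file to an image.

    The image is written next to the DOT file, with the extension
    replaced by the chosen format.
    """
    out = Path(dot_file).with_suffix("." + image_format)
    return ["dot", f"-T{image_format}", str(dot_file), "-o", str(out)]


def render(dot_file, image_format="png"):
    """Run Graphviz; raise a clear error if the dot executable is absent."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz 'dot' executable not found on PATH")
    subprocess.run(render_command(dot_file, image_format), check=True)
```

For example, render("lineage.dot", "svg") would run dot -Tsvg lineage.dot -o lineage.svg.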

Options

  • --explore: Starts the interactive web server for exploring lineage. Cannot be used with --select or --format.
  • --select: Specify model/column for static analysis using the format [+]model_name[.column_name][+]. Cannot be used with --explore.
    • Add + suffix for downstream lineage (e.g., stg_accounts.id+)
    • Add + prefix for upstream lineage (e.g., +stg_accounts.id)
    • No + for both directions (e.g., stg_accounts.id or stg_accounts for model-level)
  • --catalog: Path to the dbt catalog file (default: target/catalog.json)
  • --manifest: Path to the dbt manifest file (default: target/manifest.json)
  • --format, -f: Output format for static analysis (text or dot). Not used with --explore. (default: text)
  • --output, -o: Output filename for dot format (without extension, default: lineage). Not used with --explore.
  • --port, -p: Port for the interactive web server when using --explore (default: 8000).
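The --select grammar described above can be illustrated with a small parser. This is a sketch of the documented format [+]model_name[.column_name][+], not the tool's actual implementation: the Selector type and parse_selector function are hypothetical, and treating a selector with both a prefix and a suffix + as "both directions" is an assumption.

```python
from typing import NamedTuple, Optional


class Selector(NamedTuple):
    model: str
    column: Optional[str]
    upstream: bool
    downstream: bool


def parse_selector(text):
    """Parse '[+]model_name[.column_name][+]' into a Selector.

    '+' prefix means upstream, '+' suffix means downstream,
    and no '+' at all means both directions.
    """
    has_prefix = text.startswith("+")
    has_suffix = text.endswith("+") and len(text) > 1
    core = text[1:] if has_prefix else text
    core = core[:-1] if has_suffix else core
    model, _, column = core.partition(".")
    if not model:
        raise ValueError(f"selector {text!r} has no model name")
    neither = not (has_prefix or has_suffix)
    return Selector(
        model=model,
        column=column or None,
        upstream=has_prefix or neither,
        downstream=has_suffix or neither,
    )
```

With this sketch, parse_selector("stg_orders") yields a model-level selector with both directions enabled, while parse_selector("+stg_accounts.id") yields an upstream-only selector for the id column.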

Limitations

  • Python models are not supported.
  • Some SQL functions or syntax cannot be parsed properly, in which case the affected models are skipped.
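To see in advance which models fall under the first limitation, you can scan the manifest for Python models. This is a sketch under assumptions: it relies on the resource_type and language fields that recent dbt manifest schemas attach to each node, and the function name python_models is hypothetical. Older manifests without a language field will simply return an empty list.

```python
def python_models(manifest):
    """Return the sorted unique IDs of Python models in a loaded manifest.

    `manifest` is the dict obtained by parsing manifest.json.
    Assumes each node carries 'resource_type' and 'language' keys.
    """
    return sorted(
        node_id
        for node_id, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model"
        and node.get("language") == "python"
    )
```

The manifest dict can be obtained with json.load on target/manifest.json.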

Compatibility

The tool has been tested with the following dbt adapters:

  • Snowflake
  • SQLite
  • DuckDB

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

dbt_col_lineage-0.3.1.tar.gz (125.1 kB)

Uploaded Source

Built Distribution

dbt_col_lineage-0.3.1-py3-none-any.whl (133.8 kB)

Uploaded Python 3

File details

Details for the file dbt_col_lineage-0.3.1.tar.gz.

File metadata

  • Size: 125.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.9.21 Linux/6.8.0-1021-azure

File hashes

Hashes for dbt_col_lineage-0.3.1.tar.gz:

  • SHA256: 3f035baf07c64ee41b5d43ebe4c80b9f2852b6c1a1f35877c84bbfdf331a013e
  • MD5: 7007bae52ea4653b48046ddb9e9d5c63
  • BLAKE2b-256: 60d2e85aab7e231c8cc4daae045dbe1e99166f9e1357fa436d74c92d3d00904a

File details

Details for the file dbt_col_lineage-0.3.1-py3-none-any.whl.

File metadata

  • Size: 133.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.9.21 Linux/6.8.0-1021-azure

File hashes

Hashes for dbt_col_lineage-0.3.1-py3-none-any.whl:

  • SHA256: 91618a2556e7c89de9cbd9915321bea787c65eeb447994c09ce8eba9fde3ed81
  • MD5: 8372cb661f7e478995c154cd15f8587e
  • BLAKE2b-256: 0288218947240f7c2b5bd16616ec9409254d63a7cc71c07a27368c12eccb2b1f
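The published digests can be used to check a downloaded file before installing it. The sketch below uses Python's standard hashlib module with the SHA256 values listed above; the function names sha256_of and verify are hypothetical, and the file is assumed to keep its original name after download.

```python
import hashlib
import os

# SHA256 digests copied from the file listings above.
EXPECTED_SHA256 = {
    "dbt_col_lineage-0.3.1.tar.gz":
        "3f035baf07c64ee41b5d43ebe4c80b9f2852b6c1a1f35877c84bbfdf331a013e",
    "dbt_col_lineage-0.3.1-py3-none-any.whl":
        "91618a2556e7c89de9cbd9915321bea787c65eeb447994c09ce8eba9fde3ed81",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the published one.

    Raises KeyError if the file name is not one of the listed files.
    """
    expected = EXPECTED_SHA256[os.path.basename(path)]
    return sha256_of(path) == expected
```

A False result from verify means the download is corrupt or is not the file described here, and it should not be installed.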
