A simple tool to extract dbt column lineage

DBT Column Lineage

Overview

DBT Column Lineage is a simple tool that helps you visualize and understand column-level data lineage in your dbt projects. It relies on dbt artifacts (the manifest and the catalog) and on parsing the compiled SQL. For it to work as expected, you must first compile your project and run dbt docs generate to produce the catalog.

The tool offers several ways to view lineage:

  • Interactive Explorer: A local web server providing an interactive UI to explore model and column lineage visually. (Recommended)
  • DOT: Generates GraphViz dot files that can be rendered as static images.
  • Text: Simple console output showing upstream and downstream dependencies for a specific model or column.

[Image: DBT Column Lineage Demo - Concept] (Note: The demo shows the lineage concept; the interactive explorer provides an enhanced UI.)

Installation

pip install dbt-col-lineage==0.2.0

Usage

First, ensure your dbt project is compiled and you have generated the catalog:

dbt compile
dbt docs generate
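
Before running the tool, it can help to confirm that both artifacts were actually written. The following is a minimal sketch, not part of dbt-col-lineage: the helper name `missing_artifacts` is hypothetical, and it assumes the artifacts live in the default `target` directory, matching the documented defaults `target/manifest.json` and `target/catalog.json`.

```python
from pathlib import Path

# File names taken from the documented defaults of --manifest and --catalog.
REQUIRED_ARTIFACTS = ("manifest.json", "catalog.json")


def missing_artifacts(target_dir="target"):
    """Return the names of required dbt artifacts that are absent from target_dir."""
    target = Path(target_dir)
    return [name for name in REQUIRED_ARTIFACTS if not (target / name).is_file()]


# Example: report what still needs to be generated in the default directory.
missing = missing_artifacts()
if missing:
    print("Missing artifacts:", ", ".join(missing))
else:
    print("manifest.json and catalog.json found")
```

If catalog.json is reported as missing, running dbt docs generate again is the likely fix, since dbt compile alone does not produce the catalog.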

1. Interactive Exploration (Recommended)

To start the interactive lineage explorer:

dbt-col-lineage --explore \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --port 8080  # Optional port selection

This starts a local web server on the given port (8000 if --port is omitted). Open your web browser at the corresponding address (e.g., http://127.0.0.1:8080 for the command above). You can then select models and columns from the sidebar to visualize their lineage directly in the UI.

2. Static Output (Text or DOT)

To generate lineage for a specific model or column directly in the terminal or as a DOT file:

dbt-col-lineage --select '[+]model_name[.column_name][+]' \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --format [text|dot] \
    --output filename

Examples:

# Downstream lineage for stg_transactions.amount in text format
dbt-col-lineage --select stg_transactions.amount+ --format text

# Upstream lineage for stg_accounts.id as a DOT file (output name given without extension)
dbt-col-lineage --select +stg_accounts.id --format dot --output upstream_account_id

# Both directions for stg_orders in text format
dbt-col-lineage --select stg_orders --format text
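
A DOT file produced with --format dot can be turned into an image with Graphviz, which is a separate install. The sketch below is an illustration only: `graphviz_command` and `render` are hypothetical helper names, and the file name `lineage.dot` is an assumption based on the default output name `lineage`. The `dot -T<format> input -o output` invocation is standard Graphviz usage.

```python
import shutil
import subprocess
from pathlib import Path


def graphviz_command(dot_file, image_format="png"):
    """Build the Graphviz command that renders dot_file to an image beside it."""
    source = Path(dot_file)
    output = source.with_suffix("." + image_format)
    return ["dot", f"-T{image_format}", str(source), "-o", str(output)]


def render(dot_file, image_format="png"):
    """Render dot_file with Graphviz; return the image path, or None if dot is not installed."""
    if shutil.which("dot") is None:
        return None  # Graphviz is not on PATH
    command = graphviz_command(dot_file, image_format)
    subprocess.run(command, check=True)
    return Path(command[-1])


# Assumed file name; adjust to whatever --output actually produced.
print(" ".join(graphviz_command("lineage.dot")))
```

Swapping `image_format` for `"svg"` gives a scalable image, which tends to stay readable for wide lineage graphs.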

Options

  • --explore: Starts the interactive web server for exploring lineage. Cannot be used with --select or --format.
  • --select: Specify model/column for static analysis using the format [+]model_name[.column_name][+]. Cannot be used with --explore.
    • Add + suffix for downstream lineage (e.g., stg_accounts.id+)
    • Add + prefix for upstream lineage (e.g., +stg_accounts.id)
    • No + for both directions (e.g., stg_accounts.id or stg_accounts for model-level)
  • --catalog: Path to the dbt catalog file (default: target/catalog.json)
  • --manifest: Path to the dbt manifest file (default: target/manifest.json)
  • --format, -f: Output format for static analysis (text or dot). Not used with --explore. (default: text)
  • --output, -o: Output filename for dot format (without extension, default: lineage). Not used with --explore.
  • --port, -p: Port for the interactive web server when using --explore (default: 8000).
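
To make the --select format concrete, here is a small sketch that splits a selector into its parts. It is not the tool's own parser: `parse_selector` and `Selection` are hypothetical names, and treating a selector with both a leading and a trailing + as "both directions" is an assumption, since only the prefix, suffix, and no-plus cases are described above.

```python
import re
from typing import NamedTuple, Optional

# [+]model_name[.column_name][+]
SELECTOR = re.compile(
    r"^(?P<up>\+)?(?P<model>[^.+]+)(?:\.(?P<column>[^.+]+))?(?P<down>\+)?$"
)


class Selection(NamedTuple):
    model: str
    column: Optional[str]
    upstream: bool
    downstream: bool


def parse_selector(text):
    """Split a selector string into model, optional column, and lineage directions."""
    match = SELECTOR.match(text)
    if match is None:
        raise ValueError(f"not a valid selector: {text!r}")
    upstream = match["up"] is not None
    downstream = match["down"] is not None
    if not upstream and not downstream:
        # No + at all means lineage in both directions.
        upstream = downstream = True
    return Selection(match["model"], match["column"], upstream, downstream)


for example in ("stg_accounts.id+", "+stg_accounts.id", "stg_accounts"):
    print(example, "->", parse_selector(example))
```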

Limitations

  • Python models are not supported
  • Some SQL functions or syntax cannot be parsed properly, in which case the affected models are skipped
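
To see in advance which models fall under the first limitation, the manifest can be scanned for Python models. This is a rough sketch under stated assumptions: `python_models` and `load_manifest` are hypothetical helpers, and the code assumes the manifest has a top-level `nodes` mapping whose entries carry `resource_type`, `language`, and `original_file_path` keys, as in recent dbt manifest versions. Older manifests may lack `language`, so the file extension is used as a fallback.

```python
import json


def python_models(manifest):
    """Return the sorted unique_ids of model nodes that look like Python models."""
    found = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        is_python = (
            node.get("language") == "python"
            or str(node.get("original_file_path", "")).endswith(".py")
        )
        if is_python:
            found.append(unique_id)
    return sorted(found)


def load_manifest(path="target/manifest.json"):
    """Load a dbt manifest from disk (default path as documented for --manifest)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `python_models(load_manifest())` from the project root would list the models to expect no lineage for. Models skipped because of SQL parsing problems cannot be detected this way.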

Compatibility

The tool has been tested with the following dbt adapters:

  • Snowflake
  • SQLite
  • DuckDB

License

This project is licensed under the MIT License - see the LICENSE file for details.
