A simple tool to extract dbt column lineage

DBT Column Lineage

Overview

DBT Column Lineage is a simple tool that helps you visualize and understand column-level data lineage in your dbt projects. It relies on dbt artifacts (the manifest and the catalog) and on parsing the compiled SQL. For it to work, you must first compile your project and run dbt docs generate, which produces the catalog.

The tool offers several ways to view lineage:

  • Interactive Explorer: A local web server providing an interactive UI to explore model and column lineage visually. (Recommended)
  • DOT: Generates GraphViz dot files that can be rendered as static images.
  • Text: Simple console output showing upstream and downstream dependencies for a specific model or column.

[Image: DBT Column Lineage demo (concept)]

Note: the demo shows the lineage concept; the interactive explorer provides an enhanced UI.

Installation

pip install dbt-col-lineage==0.2.0

Usage

First, ensure your dbt project is compiled and you have generated the catalog:

dbt compile
dbt docs generate
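
Before launching the tool, it can help to confirm that both artifacts actually exist. The following is a minimal sketch, not part of dbt-col-lineage: the helper name missing_artifacts is hypothetical, and the paths are assumed to be the default locations listed under Options (target/manifest.json and target/catalog.json).

```python
from pathlib import Path

# Assumed default artifact locations, relative to the dbt project root.
DEFAULT_ARTIFACTS = ("target/manifest.json", "target/catalog.json")


def missing_artifacts(project_dir=".", artifacts=DEFAULT_ARTIFACTS):
    """Return the artifact paths (relative to project_dir) that do not exist."""
    root = Path(project_dir)
    return [rel for rel in artifacts if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing dbt artifacts:", ", ".join(missing))
        print("Run `dbt compile` and `dbt docs generate` first.")
    else:
        print("Both artifacts found.")
```

If your artifacts live elsewhere, pass their paths explicitly and hand the same paths to --manifest and --catalog.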

1. Interactive Exploration (Recommended)

To start the interactive lineage explorer:

dbt-col-lineage --explore \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --port 8080  # Optional port selection

This starts a local web server on the chosen port (8000 if --port is omitted). Open the server address in your browser (http://127.0.0.1:8080 for the command above), then select models and columns from the sidebar to visualize their lineage directly in the UI.

2. Static Output (Text or DOT)

To generate lineage for a specific model or column directly in the terminal or as a DOT file:

dbt-col-lineage --select '[+]model_name[.column_name][+]' \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --format [text|dot] \
    --output filename

Examples:

# Downstream lineage for stg_transactions.amount in text format
dbt-col-lineage --select stg_transactions.amount+ --format text

# Upstream lineage for stg_accounts.id as a DOT file
dbt-col-lineage --select +stg_accounts.id --format dot --output upstream_account_id

# Both directions for stg_orders in text format
dbt-col-lineage --select stg_orders --format text
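
A generated DOT file can be turned into an image with the Graphviz dot command (dot -Tpng input.dot -o output.png). The sketch below wraps that call; it is not part of dbt-col-lineage, the function names are hypothetical, and it assumes Graphviz is installed and that the DOT file is named lineage.dot.

```python
import shutil
import subprocess
from pathlib import Path


def dot_render_command(dot_file, image_format="png"):
    """Build the Graphviz command that renders dot_file next to itself."""
    src = Path(dot_file)
    out = src.with_suffix("." + image_format)
    return ["dot", f"-T{image_format}", str(src), "-o", str(out)]


def render(dot_file, image_format="png"):
    """Run Graphviz and return the path of the rendered image."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz `dot` executable not found on PATH")
    cmd = dot_render_command(dot_file, image_format)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])


if __name__ == "__main__":
    # Only prints the command; call render("lineage.dot") to execute it.
    print(" ".join(dot_render_command("lineage.dot")))
```

Passing image_format="svg" gives a scalable image, which tends to be easier to read for large lineage graphs.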

Options

  • --explore: Starts the interactive web server for exploring lineage. Cannot be used with --select or --format.
  • --select: Specify model/column for static analysis using the format [+]model_name[.column_name][+]. Cannot be used with --explore.
    • Add + suffix for downstream lineage (e.g., stg_accounts.id+)
    • Add + prefix for upstream lineage (e.g., +stg_accounts.id)
    • No + for both directions (e.g., stg_accounts.id or stg_accounts for model-level)
  • --catalog: Path to the dbt catalog file (default: target/catalog.json)
  • --manifest: Path to the dbt manifest file (default: target/manifest.json)
  • --format, -f: Output format for static analysis (text or dot). Not used with --explore. (default: text)
  • --output, -o: Output filename for dot format (without extension, default: lineage). Not used with --explore.
  • --port, -p: Port for the interactive web server when using --explore (default: 8000).
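
To make the --select syntax concrete, here is a minimal sketch of how a selector of the form [+]model_name[.column_name][+] could be interpreted. It is an illustration only, not the tool's actual parser: Selector and parse_selector are hypothetical names, and treating a selector with both a + prefix and a + suffix as "both directions" is an assumption.

```python
from typing import NamedTuple, Optional


class Selector(NamedTuple):
    model: str
    column: Optional[str]  # None means model-level lineage
    upstream: bool
    downstream: bool


def parse_selector(selector: str) -> Selector:
    """Split '[+]model[.column][+]' into its parts and directions."""
    has_prefix = selector.startswith("+")
    has_suffix = selector.endswith("+")
    body = selector[1:] if has_prefix else selector
    if has_suffix and body:
        body = body[:-1]
    # First dot separates the model name from the column name.
    model, _, column = body.partition(".")
    if not model:
        raise ValueError(f"invalid selector: {selector!r}")
    if not has_prefix and not has_suffix:
        # No '+' at all: both directions, as documented above.
        upstream = downstream = True
    else:
        # Assumption: '+' on both sides also means both directions.
        upstream, downstream = has_prefix, has_suffix
    return Selector(model, column or None, upstream, downstream)


if __name__ == "__main__":
    for example in ("stg_transactions.amount+", "+stg_accounts.id", "stg_orders"):
        print(example, "->", parse_selector(example))
```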

Limitations

  • Python models are not supported
  • Some SQL functions and syntax cannot be parsed; models that use them are skipped

Compatibility

The tool has been tested with the following dbt adapters:

  • Snowflake
  • SQLite
  • DuckDB

License

This project is licensed under the MIT License - see the LICENSE file for details.
