A simple tool to extract dbt column lineage

DBT Column Lineage

Overview

DBT Column Lineage is a simple tool that helps you visualize and understand column-level data lineage in your dbt projects. It relies on dbt artifacts (the manifest and the catalog) and on parsing the compiled SQL, so you must compile your project and run dbt docs generate (which produces the catalog) before using it.
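To give a feel for what the catalog artifact contributes, here is a minimal sketch that lists the columns recorded for each node in catalog.json. It is not part of the tool; the helper name columns_by_model is hypothetical, and the sketch only assumes the usual dbt catalog layout (a top-level "nodes" mapping whose entries carry a "columns" mapping).

```python
from __future__ import annotations

import json
from pathlib import Path


def columns_by_model(catalog: dict) -> dict[str, list[str]]:
    """Map each catalog node's unique_id to its sorted column names."""
    return {
        unique_id: sorted(node.get("columns", {}))
        for unique_id, node in catalog.get("nodes", {}).items()
    }


if __name__ == "__main__":
    # target/catalog.json is the default path listed under Options.
    catalog_path = Path("target/catalog.json")
    if catalog_path.is_file():
        catalog = json.loads(catalog_path.read_text())
        for unique_id, columns in columns_by_model(catalog).items():
            print(unique_id, columns)
    else:
        print("No catalog found; run `dbt docs generate` first.")
```

If a model is missing from this listing, the catalog was probably generated before the model was built, and its columns may not be available for lineage.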

The tool offers several ways to view lineage:

  • Interactive Explorer: A local web server providing an interactive UI to explore model and column lineage visually. (Recommended)
  • DOT: Generates GraphViz dot files that can be rendered as static images.
  • Text: Simple console output showing upstream and downstream dependencies for a specific model or column.

[Image: DBT Column Lineage Demo - Concept]

Note: the demo shows the lineage concept; the interactive explorer provides an enhanced UI.

Installation

pip install dbt-col-lineage==0.3.0

Usage

First, ensure your dbt project is compiled and you have generated the catalog:

dbt compile
dbt docs generate
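Before launching the tool, it can help to confirm that both artifacts were actually written. The following is a small sketch, not something the package provides: missing_artifacts is a hypothetical helper, and it assumes the artifacts live in dbt's default target directory.

```python
from pathlib import Path

# The two artifacts the tool reads (see --manifest and --catalog under Options).
REQUIRED_ARTIFACTS = ("manifest.json", "catalog.json")


def missing_artifacts(target_dir="target"):
    """Return the names of required artifacts not found in target_dir."""
    target = Path(target_dir)
    return [name for name in REQUIRED_ARTIFACTS if not (target / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing artifacts:", ", ".join(missing))
        print("Run `dbt compile` and `dbt docs generate` first.")
    else:
        print("Both artifacts found.")
```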

1. Interactive Exploration (Recommended)

To start the interactive lineage explorer:

dbt-col-lineage --explore \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --port 8080  # Optional port selection

This starts a local server on the port you pass with --port (8000 if the option is omitted). Open your web browser at the corresponding address (http://127.0.0.1:8080 for the command above). You can then select models and columns from the sidebar to visualize their lineage directly in the UI.
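If you start the explorer from a script, you may want to wait until the server accepts connections before opening a browser tab. The sketch below is one way to do that with the standard library; explorer_is_up is a hypothetical helper, and the host 127.0.0.1 with port 8000 simply mirrors the documented default.

```python
import socket


def explorer_is_up(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    import webbrowser

    if explorer_is_up(port=8000):
        webbrowser.open("http://127.0.0.1:8000")
    else:
        print("Explorer not reachable yet on port 8000.")
```

This only checks that the port is open; it does not verify that the process listening there is the lineage explorer.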

2. Static Output (Text or DOT)

To generate lineage for a specific model or column directly in the terminal or as a DOT file:

dbt-col-lineage --select '[+]model_name[.column_name][+]' \
    --manifest path/to/manifest.json \
    --catalog path/to/catalog.json \
    --format [text|dot] \
    --output filename

Examples:

# Downstream lineage for stg_transactions.amount in text format
dbt-col-lineage --select stg_transactions.amount+ --format text

# Upstream lineage for stg_accounts.id as a DOT file
dbt-col-lineage --select +stg_accounts.id --format dot --output upstream_account_id

# Both directions for stg_orders in text format
dbt-col-lineage --select stg_orders --format text
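A DOT file can be turned into an image with the Graphviz dot command. The sketch below builds that command and runs it only when Graphviz is installed and the file exists. The file name upstream_account_id.dot is an assumption for illustration (the exact name the tool writes may differ), and graphviz_command is a hypothetical helper.

```python
import shutil
import subprocess
from pathlib import Path


def graphviz_command(dot_file, image_format="png"):
    """Build the Graphviz command that renders dot_file next to itself."""
    src = Path(dot_file)
    out = src.with_suffix(f".{image_format}")
    return ["dot", f"-T{image_format}", str(src), "-o", str(out)]


if __name__ == "__main__":
    # Assumed file name; adjust to whatever the tool actually produced.
    cmd = graphviz_command("upstream_account_id.dot")
    if shutil.which("dot") and Path(cmd[2]).is_file():
        subprocess.run(cmd, check=True)
    else:
        print("Graphviz or the DOT file is missing; would run:", " ".join(cmd))
```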

Options

  • --explore: Starts the interactive web server for exploring lineage. Cannot be used with --select or --format.
  • --select: Specify model/column for static analysis using the format [+]model_name[.column_name][+]. Cannot be used with --explore.
    • Add + suffix for downstream lineage (e.g., stg_accounts.id+)
    • Add + prefix for upstream lineage (e.g., +stg_accounts.id)
    • No + for both directions (e.g., stg_accounts.id or stg_accounts for model-level)
  • --catalog: Path to the dbt catalog file (default: target/catalog.json)
  • --manifest: Path to the dbt manifest file (default: target/manifest.json)
  • --format, -f: Output format for static analysis (text or dot). Not used with --explore. (default: text)
  • --output, -o: Output filename for dot format (without extension, default: lineage). Not used with --explore.
  • --port, -p: Port for the interactive web server when using --explore (default: 8000).
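The --select rules above can be summarized in a few lines of code. This is an illustrative sketch of how such a selector could be interpreted, not the tool's actual parser; Selector and parse_selector are hypothetical names, and treating a selector with both a + prefix and a + suffix as "both directions" is an assumption.

```python
from typing import NamedTuple, Optional


class Selector(NamedTuple):
    model: str
    column: Optional[str]  # None means model-level lineage
    upstream: bool
    downstream: bool


def parse_selector(raw: str) -> Selector:
    """Interpret a '[+]model_name[.column_name][+]' selector string."""
    has_prefix = raw.startswith("+")
    body = raw[1:] if has_prefix else raw
    has_suffix = body.endswith("+")
    if has_suffix:
        body = body[:-1]

    # Assumes model names contain no dots: split on the first one only.
    model, _, column = body.partition(".")
    if not model:
        raise ValueError(f"No model name in selector: {raw!r}")

    # No '+' at all means both directions.
    upstream = has_prefix or not has_suffix
    downstream = has_suffix or not has_prefix
    return Selector(model, column or None, upstream, downstream)


if __name__ == "__main__":
    for text in ("stg_transactions.amount+", "+stg_accounts.id", "stg_orders"):
        print(text, "->", parse_selector(text))
```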

Limitations

  • Python models are not supported
  • Some SQL functions and syntax cannot be parsed properly, in which case the affected models are skipped
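To see in advance which models fall under the first limitation, you can scan the manifest for Python models. This is a sketch under assumptions: python_models is a hypothetical helper, and it relies on manifest nodes carrying "resource_type" and "language" fields, which recent dbt versions write for models.

```python
import json
from pathlib import Path


def python_models(manifest: dict) -> list:
    """Return the unique_ids of manifest nodes that are Python models."""
    return sorted(
        unique_id
        for unique_id, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model" and node.get("language") == "python"
    )


if __name__ == "__main__":
    # target/manifest.json is the default path listed under Options.
    manifest_path = Path("target/manifest.json")
    if manifest_path.is_file():
        manifest = json.loads(manifest_path.read_text())
        for unique_id in python_models(manifest):
            print("Python model (no lineage expected):", unique_id)
```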

Compatibility

The tool has been tested with the following dbt adapters:

  • Snowflake
  • SQLite
  • DuckDB

License

This project is licensed under the MIT License - see the LICENSE file for details.
