Personal SQL lineage generator.

SQL Lineage 🔀

Built on awesome open-source libraries, including SQLGlot.

So... what is this? 🤔

[!NOTE]

This project is still in development and currently only parses the CTEs of an SQL query since that's all I need for now.

I write a lot of SQL, and I often need to understand "lineage" in lots of different ways.

Tools like dbt and SQLMesh are great for object lineage (tables, columns, etc.), but I often need finer-grained lineage too, such as how the CTEs inside a single query depend on each other.

This project is just a personal tool to help me generate lineage diagrams (in Mermaid syntax) for SQL queries to fit that requirement.

Upcoming improvements will (hopefully) include:

  • Semantic edges (e.g. distinguish between JOIN, WHERE, UNION, etc.)
  • Support for parameterised queries (e.g. with Jinja blocks, like dbt)
  • Parsing only the SELECT part of a SQL file that includes DDL/DML commands
  • Lineage for multiple files in a single diagram
  • Column lineage to Mermaid

Installation ⬇️

Grab a copy from PyPI like usual (note the bills- prefix):

pip install bills-sql-lineage

Usage 📖

[!WARNING]

This is likely to change significantly as the project evolves.

Pass the path to a SQL file to the lineage command to generate the lineage as a Mermaid diagram:

lineage path/to/file.sql

This will write a Mermaid diagram to path/to/file.mermaid. You can control the target path with the --target argument:

lineage path/to/file.sql --target path/to/output.mermaid

By default, the SQL dialect will be inferred by SQLGlot, but you can specify a dialect with the --dialect argument:

lineage path/to/file.sql --dialect snowflake
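
As a rough illustration of how these arguments fit together, here is a minimal argparse sketch. It is not the project's actual implementation: build_parser and resolve_target are hypothetical names, and the only behaviour taken from this README is the default target (the input path with a .mermaid suffix) and the optional --dialect value.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser mirroring the documented arguments (hypothetical)."""
    parser = argparse.ArgumentParser(prog="lineage")
    parser.add_argument("path", type=Path, help="SQL file to read")
    parser.add_argument("--target", type=Path, default=None,
                        help="where to write the Mermaid diagram")
    parser.add_argument("--dialect", default=None,
                        help="SQL dialect; None means leave it to be inferred")
    return parser


def resolve_target(args):
    """Use --target when given, else swap the input's suffix for .mermaid."""
    if args.target is not None:
        return args.target
    return args.path.with_suffix(".mermaid")


# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(["path/to/file.sql", "--dialect", "snowflake"])
print(resolve_target(args))
print(args.dialect)
```

Keeping the default-target logic in its own small function makes it easy to check without touching the file system.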

Example 📝

Given the following SQL query:

with

aaa as (select 1 as aa),
bbb as (select 2 as bb),
ccc as (select 3 as cc, aa from aaa),
ddd as (select 4 as dd, aa from aaa where aa not in (select bb from bbb))

select *
from ccc
    inner join ddd using (aa)

...the following Mermaid diagram will be generated:

graph TD
    aaa
    bbb
    ccc
    ddd
    final
    aaa --> ccc
    aaa --> ddd
    bbb --> ddd
    ccc --> final
    ddd --> final

Note that the final node stands in for the closing SELECT statement, which is not a CTE and so has no name of its own.
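
To make the mapping from query to diagram concrete, here is a deliberately naive, standard-library-only sketch that reproduces the diagram above. It assumes the query has already been split into named parts (the QUERY_PARTS dict is hypothetical input), and it spots references with a simple FROM/JOIN pattern instead of a real SQL parser, so it is an illustration of the idea and not how this package works internally.

```python
import re

# Hypothetical input: each CTE name mapped to its body, plus "final" for the
# closing SELECT. A real tool would get these from a SQL parser instead.
QUERY_PARTS = {
    "aaa": "select 1 as aa",
    "bbb": "select 2 as bb",
    "ccc": "select 3 as cc, aa from aaa",
    "ddd": "select 4 as dd, aa from aaa where aa not in (select bb from bbb)",
    "final": "select * from ccc inner join ddd using (aa)",
}

# Naive pattern: an identifier directly after FROM or JOIN.
REFERENCE = re.compile(r"\b(?:from|join)\s+(\w+)", re.IGNORECASE)


def find_edges(parts):
    """Return (upstream, downstream) pairs between the known nodes."""
    edges = []
    for node, body in parts.items():
        for upstream in REFERENCE.findall(body):
            # Keep only references to known nodes, and skip duplicates.
            if upstream in parts and (upstream, node) not in edges:
                edges.append((upstream, node))
    return edges


def to_mermaid(parts):
    """Render the nodes and edges as a top-down Mermaid graph."""
    lines = ["graph TD"]
    lines += [f"    {node}" for node in parts]
    lines += [f"    {up} --> {down}" for up, down in find_edges(parts)]
    return "\n".join(lines)


print(to_mermaid(QUERY_PARTS))
```

The pattern would miss quoted identifiers, comma joins, and schema-qualified names, which is exactly why a proper parser is the better foundation for anything beyond a toy.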
