Personal SQL lineage generator.
Project description
SQL Lineage 🔀
Personal SQL lineage generator.
Built on the following awesome libraries:
So... what is this? 🤔
[!NOTE]
This project is still in development and currently only parses the CTEs of an SQL query since that's all I need for now.
I write a lot of SQL, and I often need to understand "lineage" in lots of different ways.
Things like dbt and SQLMesh are great for object lineage (tables, columns, etc.), but I often need to understand "lineage" like the CTE lineage in a query, among other things.
This project is just a personal tool to help me generate lineage diagrams (in Mermaid syntax) for SQL queries to fit that requirement.
Upcoming improvements will (hopefully) include:
- Semantic edges (e.g. distinguish between
JOIN
,WHERE
,UNION
, etc.) - Support for parameterised queries (e.g. with Jinja blocks, like dbt)
- Parsing only the
SELECT
part of a SQL file that includes DDL/DML commands - Lineage for multiple files in a single diagram
- Column lineage to Mermaid
Installation ⬇️
Grab a copy from PyPI like usual (note the bills-
prefix):
pip install bills-sql-lineage
Usage 📖
[!WARNING]
This is likely to change significantly as the project evolves.
Pass the path to a SQL file to the lineage
command to generate the lineage as a Mermaid diagram:
lineage path/to/file.sql
This will write a Mermaid diagram to path/to/file.mermaid
. You can control the target path with the --target
argument:
lineage path/to/file.sql --target path/to/output.mermaid
By default, the SQL dialect will be inferred by SQLGlot, but you can specify a dialect with the --dialect
argument:
lineage path/to/file.sql --dialect snowflake
Example 📝
Given the following SQL query:
with
aaa as (select 1 as aa),
bbb as (select 2 as bb),
ccc as (select 3 as cc, aa from aaa),
ddd as (select 4 as dd, aa from aaa where aa not in (select bb from bbb))
select *
from ccc
inner join ddd using (aa)
...the following Mermaid diagram will be generated:
graph TD
aaa
bbb
ccc
ddd
final
aaa --> ccc
aaa --> ddd
bbb --> ddd
ccc --> final
ddd --> final
Note that the final
node is an alias for the final SELECT
statement since the final SELECT
statement is not a CTE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for bills_sql_lineage-0.0.3-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 83b4d14aba263ab6e5f35cfe64e8ee01f33cd5e73289e41eb60987ae0d1f8f7f |
|
MD5 | fea25254e65b4e7ba8bcef748038d788 |
|
BLAKE2b-256 | 84276cedc75f091d5036113fb7fdeaadd9492319dccfec81774fa85f22a88256 |