Personal SQL lineage generator.
Project description
SQL Lineage 🔀
Personal SQL lineage generator.
Built on the following awesome libraries:
So... what is this? 🤔
[!NOTE]
This project is still in development and currently only parses the CTEs of an SQL query since that's all I need for now.
I write a lot of SQL, and I often need to understand "lineage" in lots of different ways.
Things like dbt and SQLMesh are great for object lineage (tables, columns, etc.), but I often need to understand "lineage" like the CTE lineage in a query, among other things.
This project is just a personal tool to help me generate lineage diagrams (in Mermaid syntax) for SQL queries to fit that requirement.
Upcoming improvements will (hopefully) include:
- Semantic edges (e.g. distinguish between
JOIN
,WHERE
,UNION
, etc.) - Support for parameterised queries (e.g. with Jinja blocks, like dbt)
- Parsing only the
SELECT
part of a SQL file that includes DDL/DML commands - Lineage for multiple files in a single diagram
- Column lineage to Mermaid
Installation ⬇️
Grab a copy from PyPI like usual (note the bills-
prefix):
pip install bills-sql-lineage
Usage 📖
[!WARNING]
This is likely to change significantly as the project evolves.
Pass the path to a SQL file to the lineage
command to generate the lineage as a Mermaid diagram:
lineage path/to/file.sql
This will write a Mermaid diagram to path/to/file.mermaid
. You can control the target path with the --target
argument:
lineage path/to/file.sql --target path/to/output.mermaid
By default, the SQL dialect will be inferred by SQLGlot, but you can specify a dialect with the --dialect
argument:
lineage path/to/file.sql --dialect snowflake
Example 📝
Given the following SQL query:
with
aaa as (select 1 as aa),
bbb as (select 2 as bb),
ccc as (select 3 as cc, aa from aaa),
ddd as (select 4 as dd, aa from aaa where aa not in (select bb from bbb))
select *
from ccc
inner join ddd using (aa)
...the following Mermaid diagram will be generated:
graph TD
aaa
bbb
ccc
ddd
final
aaa --> ccc
aaa --> ddd
bbb --> ddd
ccc --> final
ddd --> final
Note that the final
node is an alias for the final SELECT
statement since the final SELECT
statement is not a CTE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file bills_sql_lineage-0.0.3.tar.gz
.
File metadata
- Download URL: bills_sql_lineage-0.0.3.tar.gz
- Upload date:
- Size: 4.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.19
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2aa7844e6c089b88b4e37243d2c2c6d881356d8b92195c49812e3de1fefb2cb6 |
|
MD5 | 8add6859a0adc5e5549fb9347a5bd463 |
|
BLAKE2b-256 | afbc6868ed150a190fbcab5f82db323e18a565f42f03aca03b396fc7795786b8 |
File details
Details for the file bills_sql_lineage-0.0.3-py3-none-any.whl
.
File metadata
- Download URL: bills_sql_lineage-0.0.3-py3-none-any.whl
- Upload date:
- Size: 5.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.19
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 83b4d14aba263ab6e5f35cfe64e8ee01f33cd5e73289e41eb60987ae0d1f8f7f |
|
MD5 | fea25254e65b4e7ba8bcef748038d788 |
|
BLAKE2b-256 | 84276cedc75f091d5036113fb7fdeaadd9492319dccfec81774fa85f22a88256 |