SQL Lineage Analysis Tool powered by Python

SQLLineage

Never got the hang of SQL parsers? SQLLineage comes to the rescue. Given a SQL command, SQLLineage tells you its source and target tables, so you do not have to worry about Tokens, Keywords, Identifiers, and all the other jargon used by SQL parsers.

Behind the scenes, SQLLineage uses the fantastic sqlparse library to parse the SQL command and gives you a human-readable result with ease.

Demo & Documentation

Talk is cheap, show me a demo.

Documentation is hosted online by Read the Docs, where you can also check the release notes.

Quick Start

Install sqllineage via PyPI:

$ pip install sqllineage

Use the sqllineage command to parse a quoted query string:

$ sqllineage -e "insert into db1.table1 select * from db2.table2"
Statements(#): 1
Source Tables:
    db2.table2
Target Tables:
    db1.table1

Or you can parse a SQL file with the -f option:

$ sqllineage -f foo.sql
Statements(#): 1
Source Tables:
    db1.table_foo
    db1.table_bar
Target Tables:
    db2.table_baz
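
If you need the summary inside a script, the plain-text output above is easy to post-process. The following is a minimal sketch and not part of SQLLineage: `parse_summary` is a hypothetical helper, and it assumes the output keeps the layout shown above (a header line ending in a colon, followed by indented table names). It needs Python 3.9 or later for the built-in generic type hints.

```python
def parse_summary(text: str) -> dict[str, list[str]]:
    """Group indented table names under the section header above them."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith((" ", "\t")):
            # Indented line: a table name that belongs to the current section.
            if current is not None:
                sections[current].append(line.strip())
        elif line.rstrip().endswith(":"):
            # A header such as "Source Tables:" opens a new section.
            current = line.rstrip()[:-1]
            sections[current] = []
        # Any other line (e.g. "Statements(#): 1") is ignored in this sketch.
    return sections


# Sample text copied from the command output above.
sample = """Statements(#): 1
Source Tables:
    db1.table_foo
    db1.table_bar
Target Tables:
    db2.table_baz
"""
result = parse_summary(sample)
print(result)
```

The same helper should also pick up an "Intermediate Tables:" section when one is present, since it only relies on the header-plus-indentation layout.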

Advanced Usage

Multiple SQL Statements

Lineage results are combined across multiple SQL statements, and intermediate tables are identified:

$ sqllineage -e "insert into db1.table1 select * from db2.table2; insert into db3.table3 select * from db1.table1;"
Statements(#): 2
Source Tables:
    db2.table2
Target Tables:
    db3.table3
Intermediate Tables:
    db1.table1
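
One way to think about the combined result is as simple set arithmetic over what each statement reads and writes. The sketch below is an illustration under that assumption, not SQLLineage's actual implementation: `combine_lineage` is a hypothetical function, and it ignores CTEs, renames, and drops.

```python
def combine_lineage(statements):
    """Combine per-statement (reads, writes) pairs into one summary.

    A table that is only read is a source, a table that is only written
    is a target, and a table that is both written and read is intermediate.
    """
    all_reads, all_writes = set(), set()
    for reads, writes in statements:
        all_reads |= set(reads)
        all_writes |= set(writes)
    return {
        "source": all_reads - all_writes,
        "target": all_writes - all_reads,
        "intermediate": all_reads & all_writes,
    }


# The two INSERT statements from the example above.
statements = [
    ({"db2.table2"}, {"db1.table1"}),
    ({"db1.table1"}, {"db3.table3"}),
]
summary = combine_lineage(statements)
print(summary)
```

For the two statements above, this reproduces the same source, target, and intermediate tables that the command reports.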

Verbose Lineage Result

To see the lineage result for every SQL statement, turn on the verbose option (-v):

$ sqllineage -v -e "insert into db1.table1 select * from db2.table2; insert into db3.table3 select * from db1.table1;"
Statement #1: insert into db1.table1 select * from db2.table2;
    table read: [Table: db2.table2]
    table write: [Table: db1.table1]
    table cte: []
    table rename: []
    table drop: []
Statement #2: insert into db3.table3 select * from db1.table1;
    table read: [Table: db1.table1]
    table write: [Table: db3.table3]
    table cte: []
    table rename: []
    table drop: []
==========
Summary:
Statements(#): 2
Source Tables:
    db2.table2
Target Tables:
    db3.table3
Intermediate Tables:
    db1.table1

Column-Level Lineage

Column-level lineage is also supported in the command line interface. Set the level option to column, and every column lineage path will be printed.

INSERT OVERWRITE TABLE foo
SELECT a.col1,
       b.col1     AS col2,
       c.col3_sum AS col3,
       col4,
       d.*
FROM bar a
         JOIN baz b
              ON a.id = b.bar_id
         LEFT JOIN (SELECT bar_id, sum(col3) AS col3_sum
                    FROM qux
                    GROUP BY bar_id) c
                   ON a.id = c.bar_id
         CROSS JOIN quux d;

INSERT OVERWRITE TABLE corge
SELECT a.col1,
       a.col2 + b.col2 AS col2
FROM foo a
         LEFT JOIN grault b
              ON a.col1 = b.col1;

Suppose this SQL is stored in a file called foo.sql:

$ sqllineage -f foo.sql -l column
<default>.corge.col1 <- <default>.foo.col1 <- <default>.bar.col1
<default>.corge.col2 <- <default>.foo.col2 <- <default>.baz.col1
<default>.corge.col2 <- <default>.grault.col2
<default>.foo.* <- <default>.quux.*
<default>.foo.col3 <- c.col3_sum <- <default>.qux.col3
<default>.foo.col4 <- col4
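
Each printed line is a path that reads from the target column on the left back to its origin on the right. As a rough sketch of how such output could be consumed, the hypothetical helpers below (`parse_column_paths` and `upstream_of`, neither of which is part of SQLLineage) split each line on the `<-` separator and look up the origin columns of a given target. The `<-` separator is assumed from the output shown above.

```python
def parse_column_paths(text):
    """Return each lineage path as a list ordered from origin to target."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The output lists the target first; reverse so data flows left to right.
        columns = [part.strip() for part in line.split("<-")]
        paths.append(list(reversed(columns)))
    return paths


def upstream_of(paths, target):
    """Collect the origin column of every path that ends at `target`."""
    return sorted({path[0] for path in paths if path[-1] == target})


# Sample text copied from the command output above.
sample = """<default>.corge.col1 <- <default>.foo.col1 <- <default>.bar.col1
<default>.corge.col2 <- <default>.foo.col2 <- <default>.baz.col1
<default>.corge.col2 <- <default>.grault.col2
<default>.foo.* <- <default>.quux.*
<default>.foo.col3 <- c.col3_sum <- <default>.qux.col3
<default>.foo.col4 <- col4
"""
paths = parse_column_paths(sample)
origins = upstream_of(paths, "<default>.corge.col2")
print(origins)
```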

Lineage Visualization

One more cool feature: if you want a graph visualization of the lineage result, toggle the graph-visualization option (-g).

Still using the above SQL file:

$ sqllineage -g -f foo.sql

A web server will be started, showing a DAG representation of the lineage result in the browser:

  • Table-Level Lineage
  • Column-Level Lineage
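
To make the DAG idea concrete without a browser, the table-level lineage of foo.sql can be written down as a small dependency graph by hand. This is only an illustrative sketch: the `table_lineage` dictionary is read off the two INSERT statements above rather than produced by SQLLineage, and it uses the standard-library `graphlib` module (Python 3.9 or later) to derive an order in which every table appears after the tables it reads from.

```python
from graphlib import TopologicalSorter

# Maps each target table to the set of tables it reads from.
# Hand-written from foo.sql for illustration; not generated by SQLLineage.
table_lineage = {
    "foo": {"bar", "baz", "qux", "quux"},
    "corge": {"foo", "grault"},
}

# static_order() yields every node so that predecessors come first.
order = list(TopologicalSorter(table_lineage).static_order())
print(order)
```

The exact order among unrelated tables may vary, but foo always comes after bar, baz, qux, and quux, and corge always comes after foo and grault.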
