Skip to main content

A column lineage tool

Project description

Introduction

A Column Level Lineage Graph for SQL.

Have you ever wondered what is the column level relationship among your SQL scripts and base tables? Don't worry, this tool is intended to help you by creating an interactive graph on a webpage to explore the column level lineage among them(Currently only supports Postgres, other connection types or dialects are under development).

How to run

Here is a live demo with the mimic-iv concepts_postgres files(navigation instructions) and that is created with one line of code:

from lineagex.lineagex import lineagex
  
lineagex(sql=path/to/sql, target_schema="schema1", conn_string="postgresql://username:password@server:port/database", search_path_schema="schema1, public")

Check out more detailed usage and examples here.

What does it output

The input can be a path to a SQL file, a path to a folder containing SQL files, a list of SQLs or a list of view names and/or schemas. Optionally, you can provide less information with only the SQLs, but providing the schema information and database connection is highly recommended for the best result. The output would be a output.json and a index.html file in the folder. Start a local http server and you would be able to see the interactive graph. Check out more detailed navigation instructions here.

Why use LineageX

A general introduction of the project can be found in this blog post.

  • Automatic dependency creation: When there are dependency among the SQL files, and those tables are not yet in the database, LineageX will automatically tries to find the dependency table and creates it.
  • Clean and simple but very interactive user interface: The user interface is very simple to use with minimal clutters on the page while showing all of the necessary information.
  • Variety of SQL statements: LineageX supports a variety of SQL statements, aside from the typical SELECT statement, it also supports CREATE TABLE/VIEW [IF NOT EXISTS] statement as well as the INSERT and DELETE statement.
  • dbt support: LineageX also implemented in the dbt-LineageX, it is added into a dbt project and by using the dbt library fal, it is able to reuse the Python core and create the similar output from the dbt project.

Supported JSON format:

You can upload JSON files into the HTML produced and draw its lineage graph. Here is the supported format:

{
    table_name: {
        tables:[],
        columns:{
            column1: [[], []], // The first element is the list of columns that contribute directly to column1, 
                               // The second element is the list of columns that are referenced, such as columns from WHERE/GROUP BY
            column2: [[], []]
        },
        table_name: "",
        sql: "",
    }, 
}

As an example:

{
  table1: {
    tables: [schema1.other_table], 
    columns: {
      column1: [[schema1.other_table.columns1], [schema1.other_table.columns3]], 
      column2: [[schema1.other_table.columns2], [schema1.other_table.columns3]]
    }, 
    table_name: schema1.table1,
    sql: SELECT column1, column2 FROM schema1.other_table WHERE column3 IS NOT NULL;
  }, 
}

Supported Database Connection Types

When entering the conn_string parameter, only supported databases' connection types can be parsed successfully, or the lineage graph would be created as if no conn_string parameter is given.

Database Connection Types

  • Postgres
  • dbt-Postgres
  • Mysql
  • Sqlite
  • SQL Server
  • Oracle
  • ...

Documentation

Doc: https://sfu-db.github.io/lineagex/intro.html or just here

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lineagex-0.0.26.tar.gz (1.1 MB view details)

Uploaded Source

Built Distribution

lineagex-0.0.26-py3-none-any.whl (1.3 MB view details)

Uploaded Python 3

File details

Details for the file lineagex-0.0.26.tar.gz.

File metadata

  • Download URL: lineagex-0.0.26.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.8.16 Windows/10

File hashes

Hashes for lineagex-0.0.26.tar.gz
Algorithm Hash digest
SHA256 6ed766192ce8cfef6590ea22db6c2b039cabdee95114adfa42859d44ce083ab8
MD5 26e9a46c4cacb9bf2b3152cc5125e5e4
BLAKE2b-256 d93d393c4d2796af94f1933a006662998e4d7721bab41ca8040dce93826c81af

See more details on using hashes here.

File details

Details for the file lineagex-0.0.26-py3-none-any.whl.

File metadata

  • Download URL: lineagex-0.0.26-py3-none-any.whl
  • Upload date:
  • Size: 1.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.8.16 Windows/10

File hashes

Hashes for lineagex-0.0.26-py3-none-any.whl
Algorithm Hash digest
SHA256 8fc3cf0bce75af5b547f71932e2d68bf71d7b2fd8666a3b94a33b0c573e4ec67
MD5 52ab3db2e61032c6482a22860db03935
BLAKE2b-256 7ed3a674cab0df175fdbe5785cc569dd1a42a63c1de74b06cd9c5f049043b4b9

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page