Skip to main content

Package to get SQL query metadata(first level lineage), column -> database.table.actual_column mapping along with table alias mapping(alias -> database.table)

Project description

SQL Metadata Lineage

This package helps you to find first level lineage / column level dependency in a given query

Get SQL metadata (first level lineage) for most type of sql queries with inner sub-queries and other complex joins

Get column level base logic after analyzing query

Sample input.sql
    SELECT investments.month_nm AS month_nm,
           acquisitions.companies_acquired,
           investments.companies_rec_investment
      FROM (
            SELECT acq.acquired_month_nm AS month_nm,
                   COUNT(DISTINCT acq.company_permalink) AS companies_acquired
              FROM tutorial.crunchbase_acquisitions acq
             GROUP BY 1
           ) acquisitions

      FULL JOIN (
            SELECT invst.funded_month_nm AS month_nm,
                   COUNT(DISTINCT invst.company_permalink) AS companies_rec_investment
              FROM tutorial.crunchbase_investments invst
             GROUP BY 1
           ) investments

        ON acquisitions.month_nm = investments.month_nm

table_map, column_map = sql_metadata_lineage.get_metadata("input.sql")

Output text

**** Database.Table alias mapping ****

Subquery mapping alias: acquisitions acq -> tutorial.crunchbase_acquisitions Subquery mapping alias: investments invst -> tutorial.crunchbase_investments

**** Column, Database and Table mapping ****

month_nm -> tutorial.crunchbase_investments.funded_month_nm companies_acquired -> count(DISTINCT tutorial.crunchbase_acquisitions.company_permalink) companies_rec_investment -> count(DISTINCT tutorial.crunchbase_investments.company_permalink)

table_map dictionary output

{'acquisitions': {'acq': 'tutorial.crunchbase_acquisitions'}, 'investments': {'invst': 'tutorial.crunchbase_investments'}}

column_map dictionary output

{'month_nm': 'tutorial.crunchbase_investments.funded_month_nm', 'companies_acquired': 'count(DISTINCT tutorial.crunchbase_acquisitions.company_permalink)', 'companies_rec_investment': 'count(DISTINCT tutorial.crunchbase_investments.company_permalink)'}

Can directly provide sql query as input

table_map, column_map = sql_metadata.get_metadata('''

 SELECT investments.month_nm AS month_nm,
           acquisitions.companies_acquired,
           investments.companies_rec_investment
      FROM (
            SELECT acq.acquired_month_nm AS month_nm,
                   COUNT(DISTINCT acq.company_permalink) AS companies_acquired
              FROM tutorial.crunchbase_acquisitions acq
             GROUP BY 1
           ) acquisitions

      FULL JOIN (
            SELECT invst.funded_month_nm AS month_nm,
                   COUNT(DISTINCT invst.company_permalink) AS companies_rec_investment
              FROM tutorial.crunchbase_investments invst
             GROUP BY 1
           ) investments

        ON acquisitions.month_nm = investments.month_nm

''')

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sql_metadata_lineage-0.0.1.tar.gz (4.8 kB view details)

Uploaded Source

Built Distribution

sql_metadata_lineage-0.0.1-py3-none-any.whl (5.9 kB view details)

Uploaded Python 3

File details

Details for the file sql_metadata_lineage-0.0.1.tar.gz.

File metadata

  • Download URL: sql_metadata_lineage-0.0.1.tar.gz
  • Upload date:
  • Size: 4.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for sql_metadata_lineage-0.0.1.tar.gz
Algorithm Hash digest
SHA256 d2dd6b39cad716bc74f9266c2c15db5cc3acdee2380c6493a06401c781f6c2f4
MD5 33fa0c07021fb67422440ea840a5d23c
BLAKE2b-256 dec422f7972a7439b8955218ad4a076fa95ee0ddda7b3865553078935f08c290

See more details on using hashes here.

File details

Details for the file sql_metadata_lineage-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: sql_metadata_lineage-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 5.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for sql_metadata_lineage-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1812ae773c0e41f7d14d24fa8ba2f82ff3c9b48ec6693ea51a90fe50b2bf4247
MD5 fc92a1ee467a6ecf373d6375a61e617f
BLAKE2b-256 39d94a1d7067035817309349374976a3d9da5eec72dd1c6944541c6534f4bcb8

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page