Skip to main content

Package to get SQL query metadata(first level lineage), column -> database.table.actual_column mapping along with table alias mapping(alias -> database.table)

Project description

SQL Metadata Lineage

This package helps you to find first level lineage / column level dependency in a given query

Get SQL metadata (first level lineage) for most type of sql queries with inner sub-queries and other complex joins

Get column level base logic after analyzing query

Sample input.sql
    SELECT investments.month_nm AS month_nm,
           acquisitions.companies_acquired,
           investments.companies_rec_investment
      FROM (
            SELECT acq.acquired_month_nm AS month_nm,
                   COUNT(DISTINCT acq.company_permalink) AS companies_acquired
              FROM tutorial.crunchbase_acquisitions acq
             GROUP BY 1
           ) acquisitions

      FULL JOIN (
            SELECT invst.funded_month_nm AS month_nm,
                   COUNT(DISTINCT invst.company_permalink) AS companies_rec_investment
              FROM tutorial.crunchbase_investments invst
             GROUP BY 1
           ) investments

        ON acquisitions.month_nm = investments.month_nm

table_map, column_map = sql_metadata_lineage.get_metadata("input.sql")

Output text

**** Database.Table alias mapping ****

Subquery mapping alias: acquisitions acq -> tutorial.crunchbase_acquisitions Subquery mapping alias: investments invst -> tutorial.crunchbase_investments

**** Column, Database and Table mapping ****

month_nm -> tutorial.crunchbase_investments.funded_month_nm companies_acquired -> count(DISTINCT tutorial.crunchbase_acquisitions.company_permalink) companies_rec_investment -> count(DISTINCT tutorial.crunchbase_investments.company_permalink)

table_map dictionary output

{'acquisitions': {'acq': 'tutorial.crunchbase_acquisitions'}, 'investments': {'invst': 'tutorial.crunchbase_investments'}}

column_map dictionary output

{'month_nm': 'tutorial.crunchbase_investments.funded_month_nm', 'companies_acquired': 'count(DISTINCT tutorial.crunchbase_acquisitions.company_permalink)', 'companies_rec_investment': 'count(DISTINCT tutorial.crunchbase_investments.company_permalink)'}

Can directly provide sql query as input

table_map, column_map = sql_metadata.get_metadata('''

 SELECT investments.month_nm AS month_nm,
           acquisitions.companies_acquired,
           investments.companies_rec_investment
      FROM (
            SELECT acq.acquired_month_nm AS month_nm,
                   COUNT(DISTINCT acq.company_permalink) AS companies_acquired
              FROM tutorial.crunchbase_acquisitions acq
             GROUP BY 1
           ) acquisitions

      FULL JOIN (
            SELECT invst.funded_month_nm AS month_nm,
                   COUNT(DISTINCT invst.company_permalink) AS companies_rec_investment
              FROM tutorial.crunchbase_investments invst
             GROUP BY 1
           ) investments

        ON acquisitions.month_nm = investments.month_nm

''')

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sql_metadata_lineage-0.0.1.tar.gz (4.8 kB view hashes)

Uploaded Source

Built Distribution

sql_metadata_lineage-0.0.1-py3-none-any.whl (5.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page