Package to get SQL query metadata(first level lineage), column -> database.table.actual_column mapping along with table alias mapping(alias -> database.table)
Project description
SQL Metadata Lineage
This package helps you to find first level lineage / column level dependency in a given query
Get SQL metadata (first level lineage) for most type of sql queries with inner sub-queries and other complex joins
Get column level base logic after analyzing query
Sample input.sql
SELECT investments.month_nm AS month_nm,
acquisitions.companies_acquired,
investments.companies_rec_investment
FROM (
SELECT acq.acquired_month_nm AS month_nm,
COUNT(DISTINCT acq.company_permalink) AS companies_acquired
FROM tutorial.crunchbase_acquisitions acq
GROUP BY 1
) acquisitions
FULL JOIN (
SELECT invst.funded_month_nm AS month_nm,
COUNT(DISTINCT invst.company_permalink) AS companies_rec_investment
FROM tutorial.crunchbase_investments invst
GROUP BY 1
) investments
ON acquisitions.month_nm = investments.month_nm
table_map, column_map = sql_metadata_lineage.get_metadata("input.sql")
Output text
**** Database.Table alias mapping ****
Subquery mapping alias: acquisitions acq -> tutorial.crunchbase_acquisitions Subquery mapping alias: investments invst -> tutorial.crunchbase_investments
**** Column, Database and Table mapping ****
month_nm -> tutorial.crunchbase_investments.funded_month_nm companies_acquired -> count(DISTINCT tutorial.crunchbase_acquisitions.company_permalink) companies_rec_investment -> count(DISTINCT tutorial.crunchbase_investments.company_permalink)
table_map dictionary output
{'acquisitions': {'acq': 'tutorial.crunchbase_acquisitions'}, 'investments': {'invst': 'tutorial.crunchbase_investments'}}
column_map dictionary output
{'month_nm': 'tutorial.crunchbase_investments.funded_month_nm', 'companies_acquired': 'count(DISTINCT tutorial.crunchbase_acquisitions.company_permalink)', 'companies_rec_investment': 'count(DISTINCT tutorial.crunchbase_investments.company_permalink)'}
Can directly provide sql query as input
table_map, column_map = sql_metadata.get_metadata('''
SELECT investments.month_nm AS month_nm,
acquisitions.companies_acquired,
investments.companies_rec_investment
FROM (
SELECT acq.acquired_month_nm AS month_nm,
COUNT(DISTINCT acq.company_permalink) AS companies_acquired
FROM tutorial.crunchbase_acquisitions acq
GROUP BY 1
) acquisitions
FULL JOIN (
SELECT invst.funded_month_nm AS month_nm,
COUNT(DISTINCT invst.company_permalink) AS companies_rec_investment
FROM tutorial.crunchbase_investments invst
GROUP BY 1
) investments
ON acquisitions.month_nm = investments.month_nm
''')
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for sql_metadata_lineage-0.0.1.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | d2dd6b39cad716bc74f9266c2c15db5cc3acdee2380c6493a06401c781f6c2f4 |
|
MD5 | 33fa0c07021fb67422440ea840a5d23c |
|
BLAKE2b-256 | dec422f7972a7439b8955218ad4a076fa95ee0ddda7b3865553078935f08c290 |
Hashes for sql_metadata_lineage-0.0.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1812ae773c0e41f7d14d24fa8ba2f82ff3c9b48ec6693ea51a90fe50b2bf4247 |
|
MD5 | fc92a1ee467a6ecf373d6375a61e617f |
|
BLAKE2b-256 | 39d94a1d7067035817309349374976a3d9da5eec72dd1c6944541c6534f4bcb8 |