Package to parse SQL scripts for information
Project description
SQL Express
Introduction
How often do you find yourself in the following situation:
- Having tons of SQL files to maintain
- E.g. trust & safety teams with many rules written in SQL
- E.g. data engineering teams with many ETL jobs written in SQL
- No time to read & understand the logic of every single SQL
- E.g. each SQL could be up to 1000s of lines long
- E.g. logic could be complicated & not well documented
- Struggle with data quality issues
- E.g. output has issue because source tables have issue / missing data
- Struggle with query optimization
- E.g. SQLs may contain repeated logic that can be combined to save cost & money
Wouldn't it be nice if there is a package that can:
- Summarize the logic of your SQLs
- Perform data sanity checks on the required source tables
- Identify repeated logics across your SQLs
Given a single SQL file, this package is designed to:
- Parse query structure
- Extract source tables
- < more to come >
Given multiple SQL files, this package is further designed to:
- Bulk perform all functionalities listed above
- Visualise relationship between source & output tables across SQLs
- < more to come >
Getting Started
Installation
pip install sqlexpress
Single SQL File
Using command line:
# print query structure
python3 -m sqlexpress structure -f tests/data/example1.sql
# print source tables
python3 -m sqlexpress sources -f tests/data/example1.sql
Using python:
from sqlexpress.parsers import QueryParser
query = open('tests/data/example1.sql', 'r').read()
parser = QueryParser(query)
# print query structure
parser.print()
# get source tables
source_tables = parser.extract_sources() # ['`project.dataset.raw1`', ...]
Multiple SQL Files
Firstly, create a yaml file containing details of the SQL files (example).
Using command line:
# process yaml file
python3 -m sqlexpress bulk -f tests/data/bulk1.yaml
# visualize relationships between tables
export WEBSERVER_FOLDER=tests/data # folder containing yaml files
python3 -m sqlexpress webserver # start web server
Extracted details are stored in a json file (example).
Web Server
Installation
pip install sqlexpress[web]
Home page
Shows all yaml files inside WEBSERVER_FOLDER
.
- Click on
Extract
button to process yaml file. - Click on file name to jump to project page.
Project page
Shows the following details:
- Input - Contents of yaml file
- Output - Contents of json file
- Graph - Visualisation of source & output tables
- Table: List all tables extracted from SQLs
Under The Hood
Want to identify the structure of the query first so that it is easier to identify other things (e.g. source tables). This is done in 2 steps:
With the parsed structure, we can construct a directed table graph (below) even with long, complicated queries. The source tables can be easily identified from the table graph (nodes with only outward arrows).
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file sqlexpress-0.2.4.tar.gz
.
File metadata
- Download URL: sqlexpress-0.2.4.tar.gz
- Upload date:
- Size: 14.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.8.2
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | aa6bc9286d371f5c7b829c2bb8066bd7834a64217d256efe25875ad53e990e7e |
|
MD5 | 790bcdaaaba5c7f2c1d0d17fdb935ecf |
|
BLAKE2b-256 | 860b83db58a6603470154d138657a7340b8f34d31e6de6767dc8b03b5e9cb98f |
File details
Details for the file sqlexpress-0.2.4-py3-none-any.whl
.
File metadata
- Download URL: sqlexpress-0.2.4-py3-none-any.whl
- Upload date:
- Size: 19.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.8.2
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5edfd3cb7c3467e6cf49674d2dfc4c0eb7f1c58f1eade21b257fb3537831f412 |
|
MD5 | e22f6d671e8e0447565f078a53112d50 |
|
BLAKE2b-256 | 0a6ed3856fbf935af85652580830c0bb10e34548c9e7167f8b39c67b48c60692 |