SQL Query Tools Library
Overview
The SQL Query Tools Library is a Python package designed for parsing and interpreting complex SQL queries. It was developed with a focus on BigQuery but is adaptable to other SQL dialects: its parsing strategy does not depend on the specific function being called and instead works from the innermost parentheses outward. The parser is specifically tuned to handle intricate query structures that go beyond the capabilities of standard SQL parsers.
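To illustrate the idea of working from the innermost parentheses, here is a minimal sketch. It is not the library's implementation; the helper name `innermost_groups` is hypothetical, and the sketch assumes that string literals in the query contain no parentheses.

```python
import re

# Hypothetical helper for illustration only; not part of the library's API.
# Matches a parenthesis pair whose content holds no other parentheses.
INNERMOST = re.compile(r"\(([^()]*)\)")


def innermost_groups(sql: str) -> list[str]:
    """Return the contents of every innermost parenthesis pair in `sql`."""
    return [match.group(1).strip() for match in INNERMOST.finditer(sql)]


query = "SELECT ROUND(SUM(price * qty), 2) AS total FROM orders"
print(innermost_groups(query))  # ['price * qty']
```

Because the pattern only looks at parentheses, it behaves the same whether the enclosing function is `SUM`, `ROUND`, or a dialect-specific function, which is the property the overview describes.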
Features
- Select Clause Parsing: Handles a wide range of SELECT statements, from simple queries to those with nested statements, functions, and placeholders.
- From Clause Analysis: Identifies table names and associated aliases in medium-complexity SQL queries, suitable for LeetCode-level challenges.
- Unnest Transformations: Extracts details from UNNEST operations, such as the type of join, aliases, and unique values, which are crucial for building data pipelines.
Capabilities
Select Function
The select function outperforms typical SQL parsers by accurately parsing column names in queries that include:
- Nested SELECT statements
- Functions within columns
- Use of placeholders and complex syntax
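As a rough illustration of why nested statements and functions make column parsing hard, the sketch below splits a SELECT list only at commas that sit outside any parentheses. It is an assumption-laden example, not the library's select function: `split_select_columns` is a hypothetical name, and commas inside string literals that are not themselves inside parentheses are not handled.

```python
def split_select_columns(select_body: str) -> list[str]:
    """Split the text between SELECT and FROM at top-level commas.

    Hypothetical helper for illustration; not the library's API.
    """
    columns: list[str] = []
    current: list[str] = []
    depth = 0  # current parenthesis nesting level
    for ch in select_body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma outside all parentheses ends the current column.
            columns.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        columns.append(tail)
    return columns


body = "id, COALESCE(name, 'n/a') AS name, (SELECT MAX(x) FROM t) AS mx"
for column in split_select_columns(body):
    print(column)
```

A naive `body.split(",")` would cut `COALESCE(name, 'n/a')` in half; tracking depth keeps each function call and nested SELECT in one piece.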
From Function
The from function is optimized for medium-complexity queries. It can accurately identify table names and their aliases within a query. Future updates aim to extend its capabilities to handle more complex scenarios.
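The kind of result described here can be sketched with a regular expression that pairs each table after FROM or JOIN with its optional alias. This is only an approximation under stated assumptions, not the library's from function: `tables_and_aliases` and `TABLE_REF` are hypothetical names, subqueries are not handled, and an `UNNEST(...)` call after JOIN would be reported as if it were a table.

```python
import re
from typing import Optional

# Hypothetical pattern for illustration; not the library's API.
# Group 1: table name (optionally backtick-quoted, dotted, or hyphenated).
# Group 2: alias, unless the next word is a SQL keyword.
TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+`?([\w.-]+)`?"
    r"(?:\s+(?:AS\s+)?"
    r"(?!(?:WHERE|JOIN|ON|LEFT|RIGHT|INNER|CROSS|FULL|GROUP|ORDER|LIMIT|HAVING|UNION)\b)"
    r"(\w+))?",
    re.IGNORECASE,
)


def tables_and_aliases(sql: str) -> list[tuple[str, Optional[str]]]:
    """Return (table, alias) pairs; alias is None when the query gives none."""
    return [(m.group(1), m.group(2)) for m in TABLE_REF.finditer(sql)]


query = (
    "SELECT o.id FROM `proj.ds.orders` AS o "
    "LEFT JOIN customers c ON o.cid = c.id WHERE o.x > 1"
)
print(tables_and_aliases(query))
# [('proj.ds.orders', 'o'), ('customers', 'c')]
```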
Unnest Function
The unnest function is crucial for understanding complex joins in queries. It returns:
- The type of join used
- The alias of the join, for easy reference in SELECT statements
- Unique values of columns involved in the UNNEST operation
This function is particularly useful for those developing data pipelines where understanding the flow of data transformation is critical.
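The sketch below shows one way the join type and alias of an UNNEST operation could be extracted. It is an illustration under assumptions rather than the library's unnest function: `unnest_details` is a hypothetical name, the alias is only detected when introduced with AS, the unnested expression must not contain nested parentheses, and the "unique values" output listed above is not covered.

```python
import re
from typing import Optional

# Hypothetical pattern for illustration; not the library's API.
UNNEST_REF = re.compile(
    r"(?:(?:\b(?P<kind>LEFT|RIGHT|INNER|CROSS|FULL)(?:\s+OUTER)?\s+)?\bJOIN\s+"
    r"|(?P<comma>,)\s*)"
    r"UNNEST\s*\((?P<expr>[^()]*)\)"
    r"(?:\s+AS\s+(?P<alias>\w+))?",
    re.IGNORECASE,
)


def unnest_details(sql: str) -> list[dict[str, Optional[str]]]:
    """Describe each UNNEST in `sql`: join type, unnested expression, alias."""
    details = []
    for m in UNNEST_REF.finditer(sql):
        if m.group("comma"):
            join_type = "COMMA"  # comma-separated UNNEST in the FROM clause
        else:
            # A bare JOIN with no keyword in front is treated as INNER.
            join_type = (m.group("kind") or "INNER").upper() + " JOIN"
        details.append(
            {
                "join_type": join_type,
                "expression": m.group("expr").strip(),
                "alias": m.group("alias"),
            }
        )
    return details


query = (
    "SELECT o.id, item FROM orders AS o "
    "LEFT JOIN UNNEST(o.items) AS item, UNNEST(o.tags) AS tag"
)
for detail in unnest_details(query):
    print(detail)
```

Knowing that `item` comes from a LEFT JOIN on `o.items` lets a pipeline trace each selected column back to the array it was expanded from.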
Getting Started
To get started with SQL Query Tools, install the package using pip:
pip install SQLParserDataPipeline
Usage
An example of how to call the functions and use the library is provided in Usage.py. Feel free to adapt it to your own queries.
Query Examples
Example_Query.SQL provides a few examples that demonstrate how the library performs on progressively more complex queries. Each example exercises a potential issue that other parsers cannot handle.