The SQLParserDataPipeline Library is a Python package for parsing and interpreting complex SQL queries. It was developed with a focus on BigQuery but can be adapted to other SQL dialects, because its parsing strategy keys on the innermost parentheses rather than on the functions themselves.

SQL Query Tools Library

Overview

The SQL Query Tools Library is a Python package for parsing and interpreting complex SQL queries. It was developed with a focus on BigQuery but can be adapted to other SQL dialects, because its parsing strategy keys on the innermost parentheses rather than on the functions themselves. The parser is specifically tuned to handle intricate query structures that go beyond the capabilities of standard SQL parsers.
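To make the innermost-parentheses idea concrete, here is a minimal sketch of one way such a strategy could work. It is not the library's actual implementation: the name `collapse_innermost` and the `__G0__`-style placeholders are hypothetical, chosen only for illustration. The sketch repeatedly finds a parenthesised group that contains no other parentheses and replaces it with a placeholder, so nested function calls are resolved without knowing what each function does.

```python
import re

# Matches a parenthesised group that contains no other parentheses,
# i.e. an innermost group. (Hypothetical helper, not the library's API.)
INNERMOST = re.compile(r"\(([^()]*)\)")


def collapse_innermost(sql):
    """Replace innermost groups with numbered placeholders until none remain.

    Returns the collapsed string and a dict mapping placeholder -> contents.
    """
    groups = {}
    while True:
        match = INNERMOST.search(sql)
        if match is None:
            break
        key = f"__G{len(groups)}__"
        groups[key] = match.group(1)
        # Splice the placeholder in place of the whole "( ... )" group.
        sql = sql[:match.start()] + key + sql[match.end():]
    return sql, groups


collapsed, groups = collapse_innermost(
    "SELECT ROUND(SUM(a) / COUNT(b), 2) AS r FROM t"
)
print(collapsed)
print(groups)
```

Because the function names are never interpreted, the same loop applies to any dialect that uses parentheses for calls and subqueries, which is presumably what makes the approach portable.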

Features

  • Select Clause Parsing: Handles a wide range of SELECT statements, from simple queries to those with nested statements, functions, and placeholders.
  • From Clause Analysis: Identifies table names and their aliases in medium-complexity SQL queries (roughly LeetCode-level challenges).
  • Unnest Transformations: Extracts details from UNNEST operations, such as the join type, aliases, and unique values, which are useful when building data pipelines.

Capabilities

Select Function

The select function is designed to parse column names accurately in queries that typical SQL parsers often struggle with, including:

  • Nested SELECT statements
  • Functions within columns
  • Use of placeholders and complex syntax
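As an illustration of why nesting makes column extraction hard, the following sketch splits a SELECT list on commas only when they are not inside parentheses. It is an assumption-laden example, not the library's select function: `split_top_level` is a hypothetical name, and the sketch does not handle commas inside string literals that sit outside parentheses.

```python
def split_top_level(select_list):
    """Split a SELECT list on commas that are not inside parentheses."""
    columns, depth, current = [], 0, []
    for char in select_list:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == "," and depth == 0:
            # A top-level comma ends the current column expression.
            columns.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    tail = "".join(current).strip()
    if tail:
        columns.append(tail)
    return columns


cols = split_top_level(
    "id, COALESCE(name, 'n/a') AS name, "
    "(SELECT MAX(x) FROM u) AS mx, {placeholder}"
)
for col in cols:
    print(col)
```

A naive `str.split(",")` would cut `COALESCE(name, 'n/a')` in two; tracking the parenthesis depth keeps function arguments and nested SELECT statements together.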

From Function

The from function is optimized for medium-complexity queries. It identifies table names and their aliases within a query. Future updates aim to extend it to more complex scenarios.
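The kind of output described above can be sketched with a single regular expression. This is a simplified stand-in and not the library's from function: `TABLE_REF`, `KEYWORDS`, and `tables_and_aliases` are hypothetical names, and the pattern covers only plain FROM and JOIN references (it would, for example, mistake `FROM UNNEST(...)` for a table).

```python
import re

# Words that may follow a table name but are not aliases (assumed list).
KEYWORDS = (
    "ON|WHERE|JOIN|LEFT|RIGHT|INNER|CROSS|FULL|"
    "GROUP|ORDER|LIMIT|USING|UNION|HAVING"
)

TABLE_REF = re.compile(
    # Table name after FROM or JOIN, optionally backtick-quoted.
    r"\b(?:FROM|JOIN)\s+`?([\w.\-]+)`?"
    # Optional alias, with or without AS, that is not a keyword.
    r"(?:\s+(?:AS\s+)?(?!(?:" + KEYWORDS + r")\b)(\w+))?",
    re.IGNORECASE,
)


def tables_and_aliases(sql):
    """Return (table, alias) pairs; alias is None when absent."""
    return [(m.group(1), m.group(2)) for m in TABLE_REF.finditer(sql)]


pairs = tables_and_aliases(
    "SELECT o.id FROM `proj.ds.orders` AS o "
    "LEFT JOIN customers c ON o.cid = c.id WHERE c.active"
)
print(pairs)
```

The negative lookahead is what prevents a following keyword such as WHERE or ON from being reported as an alias when the table has none.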

Unnest Function

The unnest function is crucial for understanding complex joins in queries. It returns:

  • The type of join used
  • The alias of the join, for easy reference in SELECT statements
  • Unique values of columns involved in the UNNEST operation

This function is particularly useful for those developing data pipelines where understanding the flow of data transformation is critical.
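The three pieces of information listed above can be sketched as follows. This is not the library's unnest function: `UNNEST_REF` and `unnest_details` are hypothetical names, "unique values" is interpreted here as the distinct column references inside `UNNEST(...)`, and the pattern assumes the UNNEST argument contains no nested parentheses and that the UNNEST follows a JOIN or a comma.

```python
import re

UNNEST_REF = re.compile(
    # Join keyword(s), or a comma (BigQuery's implicit cross join).
    r"((?:LEFT|RIGHT|INNER|CROSS|FULL)?(?:\s+OUTER)?\s*JOIN|,)\s*"
    # UNNEST argument without nested parentheses.
    r"UNNEST\s*\(([^()]*)\)"
    # Optional alias, with or without AS, that is not a keyword.
    r"(?:\s+(?:AS\s+)?"
    r"(?!(?:ON|WHERE|JOIN|LEFT|RIGHT|INNER|CROSS|FULL|GROUP|ORDER|LIMIT)\b)"
    r"(\w+))?",
    re.IGNORECASE,
)


def unnest_details(sql):
    """Return join type, alias, and distinct columns for each UNNEST."""
    results = []
    for match in UNNEST_REF.finditer(sql):
        join = " ".join(match.group(1).upper().split())
        if join == ",":
            join = "CROSS JOIN (implicit comma)"
        # Distinct column references, in order of first appearance.
        names = re.findall(r"[A-Za-z_][\w.]*", match.group(2))
        results.append(
            {
                "join": join,
                "alias": match.group(3),
                "columns": list(dict.fromkeys(names)),
            }
        )
    return results


details = unnest_details(
    "SELECT i.sku, tag FROM orders o "
    "LEFT JOIN UNNEST(o.items) AS i, UNNEST(o.tags) tag WHERE i.qty > 1"
)
for entry in details:
    print(entry)
```

Knowing the alias matters for pipelines because columns in the SELECT list (such as `i.sku`) refer to the unnested array through that alias rather than through the source table.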

Getting Started

To get started with SQL Query Tools, install the package using pip:

pip install SQLParserDataPipeline

Usage

An example of how to call the functions and use the library is provided in Usage.py. Feel free to adapt it to your own queries.

Queries Examples

Example_Query.SQL contains a few examples that demonstrate how the library performs on progressively more complex queries. Each example exercises a potential issue that other parsers cannot handle.

Source Distribution

SQLParserDataPipeline-0.5.tar.gz (4.8 kB)

Uploaded Source

Built Distribution

SQLParserDataPipeline-0.5-py3-none-any.whl (8.4 kB)

Uploaded Python 3

File details

Details for the file SQLParserDataPipeline-0.5.tar.gz.

File metadata

  • Download URL: SQLParserDataPipeline-0.5.tar.gz
  • Upload date:
  • Size: 4.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for SQLParserDataPipeline-0.5.tar.gz
Algorithm Hash digest
SHA256 161228d6f6568021dac6c04733a4c22fd9fa11a605260166612500747cf9ef15
MD5 6274ad4dd6612ecfd490f930db1b3217
BLAKE2b-256 5ad6b8495399f92c9aaef4686b208a76ad0087b77493bca41825cc12713d82c5
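
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do so; `sha256_of` and `matches_published_hash` are hypothetical helper names, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest published above for SQLParserDataPipeline-0.5.tar.gz.
EXPECTED_SHA256 = (
    "161228d6f6568021dac6c04733a4c22fd9fa11a605260166612500747cf9ef15"
)


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_published_hash("SQLParserDataPipeline-0.5.tar.gz")` on the downloaded file returns True only if its contents match the published digest.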

File details

Details for the file SQLParserDataPipeline-0.5-py3-none-any.whl.

File metadata

File hashes

Hashes for SQLParserDataPipeline-0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 bb63398aaee9685415860d8626fea1b6731cdb93fbb4773218a38939f5a091e3
MD5 1b06c1c3be4dcba7a81c85ec3e8f4f64
BLAKE2b-256 72de2d9fbf8390032cf89b01f13b57608048bffe93d0b6f5ee22f9f11a8e4b80