A Python library for intelligent SQL predicate manipulation using SQLGlot. This library provides tools to safely remove specific predicates from `SQL WHERE` clauses and filter SQL statements based on column availability.
Project description
Query.Farm SQL Manipulation
A Python library for intelligent SQL predicate manipulation using SQLGlot
Column Filtering with Complex Expressions
import sqlglot
from query_farm_sql_manipulation import transforms
sql = '''
SELECT * FROM users
WHERE age > 18
AND (status = 'active' OR role = 'admin')
AND department IN ('engineering', 'sales')
'''
# Parse the statement first
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Only keep predicates involving 'age' and 'role'
allowed_columns = {'age', 'role'}
result = transforms.filter_column_references(
statement=statement,
selector=lambda col: col.name in allowed_columns,
)
# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'
print(result.sql())
Features
- Predicate Removal: Safely remove specific predicates from complex
SQL WHEREclauses while preserving logical structure - Column Filtering: Filter SQL statements to only include predicates referencing allowed columns
- Intelligent Logic Handling: Properly handles
AND/ORlogic, nested expressions,CASEstatements, and parentheses - SQLGlot Integration: Built on top of SQLGlot for robust SQL parsing and manipulation
- Multiple Dialect Support: Works with various SQL dialects (default: DuckDB)
Installation
pip install query-farm-sql-manipulation
Requirements
- Python >= 3.12
- SQLGlot >= 26.33.0
Quick Start
Basic Predicate Removal
import sqlglot
from query_farm_sql_manipulation import transforms
# Parse a SQL statement
sql = 'SELECT * FROM data WHERE x = 1 AND y = 2'
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Find the predicate you want to remove
predicates = list(statement.find_all(sqlglot.expressions.Predicate))
target_predicate = predicates[0] # x = 1
# Remove the predicate
transforms.remove_expression_part(target_predicate)
# Result: SELECT * FROM data WHERE y = 2
print(statement.sql())
Column-Name Based Filtering
import sqlglot
from query_farm_sql_manipulation import transforms
# Parse SQL statement first
sql = 'SELECT * FROM data WHERE color = "red" AND size > 10 AND type = "car"'
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Filter to only include predicates with allowed columns
allowed_columns = {"color", "type"}
filtered = transforms.filter_column_references(
statement=statement,
selector=lambda col: col.name in allowed_columns,
)
# Result: SELECT * FROM data WHERE color = "red" AND type = "car"
print(filtered.sql())
API Reference
remove_expression_part(child: sqlglot.Expression) -> None
Removes the specified SQLGlot expression from its parent, respecting logical structure.
Parameters:
child: The SQLGlot expression to remove
Raises:
ValueError: If the expression cannot be safely removed
Supported Parent Types:
AND/ORexpressions: Replaces parent with the remaining operandWHEREclauses: Removes the entire WHERE clause if it becomes emptyParentheses: Recursively removes the parentNOTexpressions: Removes the entire NOT expressionCASEstatements: Removes conditional branches
filter_column_references(*, statement: sqlglot.Expression, selector: Callable[[sqlglot.expressions.Column], bool]) -> sqlglot.Expression
Filters a SQL statement to remove predicates containing columns that don't match the selector criteria.
Parameters:
statement: The SQLGlot expression to filterselector: A callable that takes a Column and returns True if it should be preserved, False if it should be removed
Returns:
- Filtered SQLGlot expression with non-matching columns removed
Raises:
ValueError: If a column can't be cleanly removed due to interactions with allowed columns
where_clause_contents(statement: sqlglot.expressions.Expression) -> sqlglot.expressions.Expression | None
Extracts the contents of the WHERE clause from a SQLGlot expression.
Parameters:
statement: The SQLGlot expression to extract from
Returns:
- The contents of the WHERE clause, or None if no WHERE clause exists
filter_predicates_with_right_side_column_references(statement: sqlglot.expressions.Expression) -> sqlglot.Expression
Filters out predicates that have column references on the right side of comparisons.
Parameters:
statement: The SQLGlot expression to filter
Returns:
- Filtered SQLGlot expression with right-side column reference predicates removed
Examples
Complex Logic Handling
The library intelligently handles complex logical expressions:
# Original: (x = 1 AND y = 2) OR z = 3
# Remove y = 2: x = 1 OR z = 3
# Original: NOT (x = 1 AND y = 2)
# Remove x = 1: NOT y = 2 (which becomes y <> 2)
# Original: CASE WHEN x = 1 THEN 'yes' WHEN x = 2 THEN 'maybe' ELSE 'no' END
# Remove x = 1: CASE WHEN x = 2 THEN 'maybe' ELSE 'no' END
Column Filtering with Complex Expressions
sql = '''
SELECT * FROM users
WHERE age > 18
AND (status = 'active' OR role = 'admin')
AND department IN ('engineering', 'sales')
'''
# Only keep predicates involving 'age' and 'role'
allowed_columns = {'age', 'role'}
result = transforms.filter_column_references_statement(
sql=sql,
selector=lambda col: col.name in allowed_columns,
)
# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'
Error Handling
The library will raise ValueError when predicates cannot be safely removed:
import sqlglot
from query_farm_sql_manipulation import transforms
# This will raise ValueError because x = 1 is part of a larger expression
sql = "SELECT * FROM data WHERE result = (x = 1)"
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Cannot remove x = 1 because it's used as a value, not a predicate
# This would raise ValueError if attempted
Supported SQL Constructs
- Logical Operators:
AND,OR,NOT - Comparison Operators:
=,<>,<,>,<=,>=,LIKE,IN,IS NULL, etc. - Complex Expressions:
CASEstatements, subqueries, function calls - Nested Logic: Parentheses and nested boolean expressions
- Multiple Dialects: DuckDB, PostgreSQL, MySQL, SQLite, and more via SQLGlot
Testing
Run the test suite:
pytest src/query_farm_sql_manipulation/test_transforms.py
The test suite includes comprehensive examples of:
- Basic predicate removal scenarios
- Complex logical expression handling
- Error cases and edge conditions
- Column filtering with various SQL constructs
Contributing
This project uses:
- Rye for dependency management
- pytest for testing
- mypy for type checking
- ruff for linting
Author
This Python module was created by Query.Farm.
License
MIT Licensed.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file query_farm_sql_manipulation-0.1.7.tar.gz.
File metadata
- Download URL: query_farm_sql_manipulation-0.1.7.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6a31857fed4e507d94591b941bcd0b8aa397a46883d165b57f5e7577255bd1d3
|
|
| MD5 |
bccbaba254efde48cfcc30dcc7d72276
|
|
| BLAKE2b-256 |
915fd0536c2af7b6fa5ada660ff6a63e6d0c0673e63de0be021bda6f31fea025
|
File details
Details for the file query_farm_sql_manipulation-0.1.7-py3-none-any.whl.
File metadata
- Download URL: query_farm_sql_manipulation-0.1.7-py3-none-any.whl
- Upload date:
- Size: 9.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b1c7046ff506ff118fa4cc86524a778976507e6d853ee146a6d784097e5620fa
|
|
| MD5 |
a6a0babef23a4be3b602099108ae6ea1
|
|
| BLAKE2b-256 |
0b98a09a5d90ea27cb7b84537cb90fdf9ae2628f78ab0b965e0c5207840656a9
|