ANTLR4-based SQL rewriting tool for permission management in LLM-generated SQL
Project description
SQL Rewriter - ANTLR4-based SQL Rewriting Tool
A Python package for SQL rewriting based on ANTLR4 grammar parsing. Currently provides the add_where_condition function, designed for permission management in LLM-generated SQL. Uses syntax parsing instead of regex matching, making it easier to handle various SQL formats generated by large language models.
How It Works
Uses ANTLR4 to parse SQL statements into syntax trees, then traverses the tree using the Visitor pattern to locate target table queries and intelligently add or merge WHERE conditions. If the original SQL already has a WHERE clause, it wraps the existing condition in parentheses before adding the new permission condition with AND (to prevent LLM SQL injection).
Installation
pip install sql-rewriter
Or install from source (if you want to modify the code):
git clone https://github.com/wangyang377/sql-rewriter.git
cd sql-rewriter
./scripts/generate_parser.sh # Requires ANTLR4, see Development section below
pip install -e .
Usage
Basic Usage
from sql_rewriter import add_where_condition
# Add condition when there's no WHERE clause
sql = "SELECT * FROM users;"
new_sql = add_where_condition(sql, "age > 18", "users")
# Result: SELECT * FROM users WHERE age > 18;
# Append condition when WHERE clause already exists (wraps existing condition in parentheses)
sql = "SELECT * FROM users WHERE age > 18;"
new_sql = add_where_condition(sql, "status = 'active'", "users")
# Result: SELECT * FROM users WHERE (age > 18) AND status = 'active';
# JOIN queries - add condition only for specific table
sql = "SELECT * FROM users JOIN orders ON users.id = orders.user_id;"
new_sql = add_where_condition(sql, "users.status = 'active'", "users")
# Result: SELECT * FROM users JOIN orders ON users.id = orders.user_id WHERE users.status = 'active';
# Nested queries work too - precisely locates target table
sql = "SELECT * FROM orders WHERE status = 'pending' AND EXISTS (SELECT 1 FROM users WHERE users.id = orders.user_id);"
new_sql = add_where_condition(sql, "users.status = 'active'", "users")
# Result: SELECT * FROM orders WHERE status = 'pending' AND EXISTS (SELECT 1 FROM users WHERE users.id = orders.user_id AND users.status = 'active');
API Reference
add_where_condition(sql_text, new_condition, table_name=None)
Parameters:
sql_text: Original SQL statementnew_condition: WHERE condition to add (without the WHERE keyword)table_name: Target table name. Condition is only added if the FROM clause contains this table. IfNone, no processing is performed
Returns:
- Modified SQL statement (string)
Raises:
ValueError: If SQL parsing fails
Development
If you clone the project from Git, you need to generate ANTLR parser code first:
# Install ANTLR4 (macOS)
brew install antlr
# Linux (Ubuntu/Debian)
sudo apt-get install antlr4
# Then generate code
./scripts/generate_parser.sh
Run tests:
cd tests
python test_parser.py
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sql_rewriter-0.1.2.tar.gz.
File metadata
- Download URL: sql_rewriter-0.1.2.tar.gz
- Upload date:
- Size: 220.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
83dae4a89246d02314b0e958bae759476e1a1fcb9609e791a95f12b069978f6f
|
|
| MD5 |
909e1e337a06a490c2a5b8d2a5557776
|
|
| BLAKE2b-256 |
10e626923ac2c5771a20a89086707ca68f5736d11e5bc39e65f329a0b36a4d44
|
File details
Details for the file sql_rewriter-0.1.2-py3-none-any.whl.
File metadata
- Download URL: sql_rewriter-0.1.2-py3-none-any.whl
- Upload date:
- Size: 222.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b5257bf99f5ea3ef35ac8c8c18330b79a85344448265e9359007869425b07481
|
|
| MD5 |
7331234ed2cfb54fc579a417ec4574d5
|
|
| BLAKE2b-256 |
e92c12e3481db230f563997eff83125180f995efaac947d02599167359f4938d
|