ChatWithSQL: Secure, Schema-Validated Text-to-SQL, Eliminating Arbitrary Query Risks from LLMs
Project description
ChatWithSQL
ChatWithSQL is a Python library that bridges the gap between natural language queries and SQL databases. Designed for reliability, security, and performance, ChatWithSQL allows developers to leverage advanced Language Learning Models (LLMs) like OpenAI, Gemini, and more to retrieve database data using simple, intuitive natural language prompts.
🎯 Why ChatWithSQL?
A major risk of Text-to-SQL systems is the potential execution of arbitrary SQL queries, which can result in unauthorized data access, security vulnerabilities, inefficient query performance, or incorrect query results. Common mitigations include using restricted roles, read-only databases, and sandboxed environments. However, ChatWithSQL takes this one step further.
ChatWithSQL has implemented a schema-based validation approach to ensure that only SQL queries adhering to a predefined schema are generated and executed. This mechanism restricts the scope of data retrieval strictly within the defined parameters, effectively mitigating the risks of arbitrary or malicious queries. Each query is validated against the schema before execution, guaranteeing compliance and eliminating unauthorized access.
This unique approach positions ChatWithSQL as a leader in secure and reliable, natural language-driven SQL data retrieval.
🚀 Key Features
- Natural Language to SQL: Translate human-readable prompts into actionable SQL queries.
- Schema-Validated Queries: Ensures only schema-defined queries are executed, mitigating arbitrary query risks.
- LLM Flexibility: Seamless integration with multiple LLMs (Gemini, OpenAI, Ollama, LlamaAPI).
- Secure Execution: Parameter validation and query sanitization to prevent SQL injection and unauthorized access.
- Dynamic Query Parameter Handling: Automatically extracts, validates, and maps parameters to SQL placeholders.
- Database-Agnostic: Compatible with any database supported by the URI connection format.
- Comprehensive Logging: Provides detailed logs for easier debugging and monitoring.
📦 Installation
You can install ChatWithSQL using pip:
pip install chatwithsql
🛠️ Setup and Usage
1. Initialization
from chat_with_sql import ChatWithSQL
# Configuration
database_url = "your_database_url"
llm = "openai" # Supported: gemini, openai, llama_api, ollama
model_name = "gpt-3.5-turbo"
api_key = "your_llm_api_key"
query_schema = [
{
"description": "Retrieve user data by user ID",
"name": "get_user_data",
"sql": "SELECT * FROM users WHERE id = ?",
"params": {"id": {"type": "int", "default": None}},
},
]
# Initialize ChatWithSQL
chat_with_sql = ChatWithSQL(
database_url=database_url,
llm=llm,
model=model_name,
llm_api_key=api_key,
query_schema=query_schema,
)
2. Executing Queries
Use the load_data method to process a natural language prompt and retrieve data:
prompt = "Show me the details of the user with ID 5."
response = chat_with_sql.load_data(prompt)
print(response)
🛡️ Security by Design
ChatWithSQL mitigates one of the largest risks of Text-to-SQL systems: arbitrary query execution. It employs schema-based validation to restrict query generation within pre-defined parameters. SQL queries are dynamically constructed and validated, ensuring:
- Queries are limited to schema-defined structures.
- Parameters are sanitized and validated against expected types.
- Arbitrary query execution by LLMs is entirely eliminated.
🌐 Supported LLMs
- OpenAI (e.g., GPT-4, GPT-3.5)
- Gemini
- LlamaAPI
- Ollama
🧰 API Reference
ChatWithSQL
Constructor
ChatWithSQL(
database_url: str,
llm: str,
model: Optional[str] = "",
llm_api_key: Optional[str] = None,
query_schema: Optional[List[Dict[str, Any]]] = None,
)
database_url: Connection URI for the database.llm: Type of LLM to use (gemini,openai,llama_api,ollama).model: LLM model name.llm_api_key: API key for accessing the LLM.query_schema: List of schema definitions, each withdescription,name,sql, andparams.
Method: load_data(prompt: str) -> Dict[str, Any]
Executes a natural language query and retrieves data.
prompt: Natural language request.- Returns: Query results as a dictionary.
📝 Query Schema Format
The query_schema parameter ensures secure and structured interactions. Each schema item includes:
description: Human-readable description of the query.name: Unique name of the query.sql: SQL query template with placeholders (?) for parameters.params: Dictionary defining parameters with:type: Data type (str,int,float,date,datetime).default: Specify the default value for the parameter or if it can be any value, indicate it as null.
Example:
[
{
"description": "Fetch user details by ID",
"name": "get_user_details",
"sql": "SELECT * FROM users WHERE id = ?",
"params": {"id": {"type": "int", "default": null}}
},
{
"description": "Fetch user details by DOB",
"name": "get_user_details_by_dob",
"sql": "SELECT * FROM users WHERE dob = ?",
"params": {"dob": {"type": "date", "default": null}}
}
]
🗺️ Architecture
🐛 Logging and Debugging
ChatWithSQL includes extensive logging for better observability:
- Logs parameter validation errors.
- Logs malformed prompts or unexpected results from the LLM.
- Tracks query construction and database execution.
Enable logging by configuring Python’s logging module.
🏗️ Contributing
Contributions are welcome! Please submit a pull request or open an issue on our GitHub Repository.
📜 License
ChatWithSQL is open-source software licensed under the MIT License.
🤝 Support
If you have any questions or issues, feel free to contact us at hello@bysatha.com or open a GitHub issue.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file chatwithsql-0.1.3.tar.gz.
File metadata
- Download URL: chatwithsql-0.1.3.tar.gz
- Upload date:
- Size: 7.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1eac1a8ee873bfb9effc83f2cff8394804d163cdf77bd0645797dc895c1d3cd7
|
|
| MD5 |
6f3a5edbf4a85748e493ec073e971931
|
|
| BLAKE2b-256 |
ff92c36ee9ba78aea3bd1a4ad206771eeebc1c10df071e083175092fccaccfa2
|
File details
Details for the file chatwithsql-0.1.3-py3-none-any.whl.
File metadata
- Download URL: chatwithsql-0.1.3-py3-none-any.whl
- Upload date:
- Size: 8.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1362cbceff961ef95a86b18e6630f7eff8c626847863aef6a65886b4823ffa0d
|
|
| MD5 |
05ae2f041e57acf575df241a9fec5494
|
|
| BLAKE2b-256 |
350ee769d71a2664ca59881c5573d72cd952c7a732e3f03bd204831dbe98c73f
|