Natural Language to SQL — open-source library with RAG, self-correction, and federation
Project description
AaizaQL — Natural Language to SQL
Query any database in plain English. AaizaQL is an open-source Python library that converts natural language questions into SQL, executes them, and returns results with charts and insights. It fixes the key limitations of Vanna AI: better security, context memory, self-correction, and a plugin architecture.
from aaizaql import QueryEngine
engine = QueryEngine(llm="groq", database="sqlite", dsn="sqlite:///sales.db")
engine.ingest_schema()
result = engine.query("Show top 5 customers by revenue last quarter")
print(result.sql) # Generated SQL
print(result.data) # pandas DataFrame
result.chart.show() # Interactive Plotly chart
print(result.summary) # "The top customer was Acme Corp with $1.2M revenue..."
Why AaizaQL over Vanna AI?
| Feature | AaizaQL | Vanna AI |
|---|---|---|
| SQL security layer (whitelist + injection detection) | ✅ | ⚠️ Partial |
| Self-correction loop (auto-fix broken SQL) | ✅ | ⚠️ Partial |
| Context memory (multi-turn conversations) | ✅ | ⚠️ Limited |
| Enum/code mapping (always injected, no miss) | ✅ | ❌ |
| Per-user credential delegation | ✅ | ❌ (CVE-2024-5565) |
| Plugin architecture (zero core changes) | ✅ | ❌ |
| Groq support (free, fast LLM) | ✅ | ❌ |
| Local LLM via Ollama | ✅ | ✅ |
Installation
pip install aaizaql
Install with your LLM provider and database driver:
# Groq (free, fast — recommended for getting started)
pip install "aaizaql[groq]"
# Anthropic Claude
pip install "aaizaql[claude]"
# OpenAI
pip install "aaizaql[openai]"
# PostgreSQL
pip install "aaizaql[postgres]"
# Everything
pip install "aaizaql[all]"
Quick Start
1. Get a free Groq API key
Sign up at console.groq.com — it is free.
export AAIZAQL_GROQ_API_KEY="gsk_your_key_here"
2. Query your database
from aaizaql import QueryEngine
engine = QueryEngine(
llm="groq",
database="sqlite",
dsn="sqlite:///mydata.db",
)
engine.ingest_schema()
result = engine.query("How many orders were placed last month?")
print(result.sql)
print(result.data)
3. CLI usage
# Interactive REPL
aaizaql query --db sqlite:///mydata.db --llm groq
# Single question
aaizaql query --db sqlite:///mydata.db --llm groq -q "Total revenue by region"
Core Features
Multi-turn memory
engine.query("Show me the top 10 customers by revenue")
engine.query("Now filter those to only US customers") # remembers context
engine.query("Which of those signed up in 2024?") # still remembers
Train with business knowledge
# Free-text business context (retrieved via RAG)
engine.train(documentation="""
employees.status: 1=Active, 2=On Leave, 3=Resigned, 4=Terminated
Use strftime('%Y-%m', created_at) for SQLite month grouping.
business_unit_id: 4=ACCL, 8=APFIL, 12=IBOS
""")
# Enum mappings — ALWAYS injected, never missed by RAG
engine.define_enum("employees", "status", {
1: "Active", 2: "On Leave", 3: "Resigned", 4: "Terminated"
})
# Sample Q→SQL pairs for few-shot learning
engine.train(
question="Top 5 employees by total sales",
sql="SELECT e.name, SUM(s.total) FROM employees e JOIN sales s ON e.id = s.emp_id GROUP BY e.name ORDER BY 2 DESC LIMIT 5",
)
Self-correction loop
When the generated SQL fails, AaizaQL automatically sends the error back to the LLM and retries (up to 3 times by default):
attempt 1: SELECT * FROM employes → DatabaseError: no such table
attempt 2: SELECT * FROM employees → ✅ success
SQL security layer
Every SQL passes through a security gate before execution:
- Whitelist enforcement — only
SELECTandWITHare allowed - Prompt injection detection — scans user questions for manipulation attempts
- Structural parsing — uses
sqlglotto catch disguised dangerous statements - Multi-statement blocking —
SELECT 1; DROP TABLE xis rejected
Add a new database connector
from aaizaql.connectors.base import DatabaseConnector
from aaizaql.connectors import REGISTRY
class BigQueryConnector(DatabaseConnector):
name = "bigquery"
def connect(self, dsn): ...
def execute(self, sql): ...
def get_schema(self): ...
REGISTRY["bigquery"] = BigQueryConnector
Configuration
All settings can be set via environment variables (prefixed AAIZAQL_) or passed directly to QueryEngine:
| Setting | Env var | Default | Description |
|---|---|---|---|
| LLM provider | AAIZAQL_LLM_PROVIDER |
groq |
groq, claude, openai, ollama |
| Groq API key | AAIZAQL_GROQ_API_KEY |
— | Get free key at console.groq.com |
| Groq model | AAIZAQL_GROQ_MODEL |
llama3-70b-8192 |
Any Groq-supported model |
| Anthropic key | AAIZAQL_ANTHROPIC_API_KEY |
— | For llm="claude" |
| OpenAI key | AAIZAQL_OPENAI_API_KEY |
— | For llm="openai" |
| Ollama URL | AAIZAQL_OLLAMA_BASE_URL |
http://localhost:11434 |
For local models |
| Vector store | AAIZAQL_VECTOR_STORE |
chroma |
chroma or qdrant |
| Max retries | AAIZAQL_MAX_SELF_CORRECTION_RETRIES |
3 |
Self-correction attempts |
| Session history | AAIZAQL_SESSION_HISTORY_LIMIT |
10 |
Turns kept in context |
Supported Databases
| Database | Connector name | Install |
|---|---|---|
| SQLite | sqlite |
Built-in |
| PostgreSQL | postgresql / postgres |
pip install "aaizaql[postgres]" |
| MySQL | mysql |
pip install pymysql |
| Snowflake | snowflake |
pip install "aaizaql[snowflake]" |
| DuckDB | duckdb |
pip install "aaizaql[duckdb]" |
Supported LLM Providers
| Provider | Key | Notes |
|---|---|---|
| Groq | groq |
Free tier available. Fastest inference. Recommended. |
| Anthropic Claude | claude |
Best accuracy on complex schemas. |
| OpenAI | openai |
GPT-4o and others. |
| Ollama | ollama |
Local, private, no API key. |
Roadmap
- Phase 1: Core library (RAG, self-correction, security, memory, connectors)
- Phase 2: SaaS web UI (FastAPI + Next.js)
- Phase 3: Federated cross-database queries (DuckDB workspace)
- Phase 4: Enterprise (SSO, RBAC, audit log, SOC2)
Contributing
Contributions are welcome. See CONTRIBUTING.md for guidelines.
git clone https://github.com/ibrahimkhalilCorp/aaizaql
cd aaizaql
pip install -e ".[dev]"
pytest tests/
License
MIT — see LICENSE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file aaizaql-0.1.2.tar.gz.
File metadata
- Download URL: aaizaql-0.1.2.tar.gz
- Upload date:
- Size: 66.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5b2cf158fb62e74005f62e47139a94936042ea1602e7e39a75cc05391eb4bd28
|
|
| MD5 |
e881adb69a289d156237a0acca08e629
|
|
| BLAKE2b-256 |
0071ee39cf2ecd9e073a846389de881fe6563ca0a534148b49f1f16b0d77df02
|
File details
Details for the file aaizaql-0.1.2-py3-none-any.whl.
File metadata
- Download URL: aaizaql-0.1.2-py3-none-any.whl
- Upload date:
- Size: 50.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
26bd0598447628bbdbd87923d32b157bb50356e27437ebb0b23a80a71b7aad11
|
|
| MD5 |
c248538841841a2710d52105d6b8a2a3
|
|
| BLAKE2b-256 |
a5d2b1c82b0af845bd21116ca095d8b3292e91cfe3560c219cc6b7af3fd2ca56
|