SQL Agent Accuracy Evaluation Framework
Project description
SQL Agent Evaluation SDK
A Python SDK for evaluating SQL Agent accuracy with comprehensive test datasets and intelligent LLM-based scoring.
中文文档 | English
Features
- 🎯 Simple Interface: Easy-to-use SQLAgent interface
- 📊 44 Test Questions: Covering 6 major SQL optimization scenarios
- 🤖 LLM Scoring: Intelligent evaluation with detailed explanations
- 📈 Auto Reports: Automatic result summary and detailed reports
- 🔧 Flexible Config: Custom database and LLM configuration
Quick Start
1. Install & Configure
pip install -r requirements.txt
Create config/.env:
DATABASE_HOST=localhost
DATABASE_PORT=3306
DATABASE_USER=root
DATABASE_PASSWORD=your_password
DATABASE_NAME=sql_eval_test
EVAL_LLM_API_KEY=your_api_key
EVAL_LLM_BASE_URL=https://api.openai.com/v1
EVAL_LLM_MODEL=gpt-4
2. Create & Evaluate Agent
import asyncio
from sqleval import SQLAgent, SQLEvaluator
class MyAgent(SQLAgent):
def optimize(self, sql_query: str) -> str:
# Your optimization logic here
return "Your optimization suggestions"
def get_name(self) -> str:
return "My Agent"
async def main():
evaluator = SQLEvaluator()
agent = MyAgent()
# Evaluate agent (automatically displays results)
dataset_results = await evaluator.evaluate(agent, datasets=['example'])
# Extract result and save report
result = dataset_results['example'][0]
result.save_report("reports/my_agent_report.md")
asyncio.run(main())
Evaluation Datasets
| Dataset | Questions | Weight | Description |
|---|---|---|---|
| index_invalidation | 10 | 20% | Index usage issues (functions, type conversion, wildcards) |
| rule_based_traps | 9 | 20% | Scenarios that trap rule-based agents |
| inefficient_join | 10 | 15% | JOIN and subquery optimization |
| execution_plan_issues | 3 | 15% | Execution plan problems (statistics, parameter sniffing) |
| access_path_issues | 6 | 15% | Data access optimization (SELECT *, filtering) |
| resource_config_issues | 6 | 15% | Memory, partitioning, and concurrency issues |
Examples
examples/quick_start.py- Minimal example to get startedexamples/custom_agent_example.py- How to create custom agentsexamples/batch_evaluation_example.py- Compare multiple agentsexamples/validate_config.py- Validate your configuration
API Reference
SQLAgent Interface
class MyAgent(SQLAgent):
def optimize(self, sql_query: str) -> str:
"""Return optimization suggestions for the SQL query"""
return "Your suggestions"
def get_name(self) -> str:
"""Return agent name"""
return "My Agent"
SQLEvaluator
evaluator = SQLEvaluator()
# Single agent evaluation
dataset_results = await evaluator.evaluate(agent, datasets=['example'])
# Batch evaluation
dataset_results = await evaluator.evaluate([agent1, agent2, agent3])
Note: The evaluate method automatically displays comparison results and saves combined reports (*_combined_report.md). For individual dataset reports, call result.save_report().
Project Structure
sqleval/
├── sqleval/ # Core SDK
│ ├── core/
│ │ ├── agent.py # SQLAgent base class
│ │ ├── evaluator.py # SQLEvaluator main class
│ │ └── result.py # EvaluationResult class
├── datasets/ # Evaluation datasets
│ ├── index_invalidation/ # Index usage scenarios
│ ├── rule_based_traps/ # Rule-based agent traps
│ ├── inefficient_join/ # JOIN optimization
│ ├── execution_plan_issues/ # Execution plan problems
│ ├── access_path_issues/ # Data access optimization
│ ├── resource_config_issues/ # Resource & config issues
│ └── meta.txt # Dataset weights
├── examples/ # Usage examples
└── reports/ # Generated reports
License
MIT License - see LICENSE file for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file david_sqleval-0.1.0.tar.gz.
File metadata
- Download URL: david_sqleval-0.1.0.tar.gz
- Upload date:
- Size: 16.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5c4e6b15c5adbe2b061c3213e165069f903c50c14f7df4c817af909546ff8a8d
|
|
| MD5 |
adfc53eef7f8a53ba18e54441adf1aae
|
|
| BLAKE2b-256 |
7563e73baeddedb428054321deb80602a00478f8ae718a7838a83dd0ab2bcdae
|
Provenance
The following attestation bundles were made for david_sqleval-0.1.0.tar.gz:
Publisher:
publish.yml on davidzhangbj/sqleval
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
david_sqleval-0.1.0.tar.gz -
Subject digest:
5c4e6b15c5adbe2b061c3213e165069f903c50c14f7df4c817af909546ff8a8d - Sigstore transparency entry: 600845140
- Sigstore integration time:
-
Permalink:
davidzhangbj/sqleval@f9c264bd528319366c92c907f1857372dd51ac2c -
Branch / Tag:
refs/heads/main - Owner: https://github.com/davidzhangbj
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@f9c264bd528319366c92c907f1857372dd51ac2c -
Trigger Event:
workflow_dispatch
-
Statement type:
File details
Details for the file david_sqleval-0.1.0-py3-none-any.whl.
File metadata
- Download URL: david_sqleval-0.1.0-py3-none-any.whl
- Upload date:
- Size: 20.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
90e455372176a2d259777987f42722c6e913634206d86ab5526debd56322080c
|
|
| MD5 |
d62cdf9f99caf89cbfa2d8d79536a02a
|
|
| BLAKE2b-256 |
417626c68ce3ace2c61770e4a4f513242149c4c024eb9c6eb97d7c8b30734805
|
Provenance
The following attestation bundles were made for david_sqleval-0.1.0-py3-none-any.whl:
Publisher:
publish.yml on davidzhangbj/sqleval
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
david_sqleval-0.1.0-py3-none-any.whl -
Subject digest:
90e455372176a2d259777987f42722c6e913634206d86ab5526debd56322080c - Sigstore transparency entry: 600845141
- Sigstore integration time:
-
Permalink:
davidzhangbj/sqleval@f9c264bd528319366c92c907f1857372dd51ac2c -
Branch / Tag:
refs/heads/main - Owner: https://github.com/davidzhangbj
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@f9c264bd528319366c92c907f1857372dd51ac2c -
Trigger Event:
workflow_dispatch
-
Statement type: