A Text2SQL benchmark for evaluation of Large Language Models
LLMSQL
A patched and improved version of WikiSQL, the original large crowd-sourced dataset for developing natural language interfaces for relational databases.
Our datasets are available for different scenarios on our HuggingFace page.
Overview
Install
pip3 install llmsql
This repository provides the LLMSQL Benchmark — a modernized, cleaned, and extended version of WikiSQL, designed for evaluating and fine-tuning large language models (LLMs) on Text-to-SQL tasks.
Note
The package does not include the dataset; it is stored on our HuggingFace page.
This package contains
- Support for modern LLMs.
- Tools for evaluation, inference, and finetuning.
- Support for Hugging Face models out-of-the-box.
- Structured for reproducibility and benchmarking.
Usage Recommendations
Modern LLMs are already strong at producing SQL queries without finetuning.
We therefore recommend that most users:
- Run inference directly on the full benchmark:
  - Use llmsql.LLMSQLVLLMInference (the main inference class) to generate SQL predictions with your LLM from HF.
  - Evaluate results against the benchmark with the llmsql.LLMSQLEvaluator evaluator class.
- Optional finetuning:
  - For research or domain adaptation, we provide a finetuning script for HF models. Use llmsql finetune --help or read the Finetune Readme to learn more about finetuning.
Tip
You can find additional manuals in the README files of each folder (Inference Readme, Evaluation Readme, Finetune Readme).
Repository Structure
llmsql/
├── evaluation/ # Scripts for downloading DB + evaluating predictions
├── inference/ # Generate SQL queries with your LLM
└── finetune/ # Fine-tuning with TRL's SFTTrainer
Quickstart
Install
Make sure you have the package installed (we used Python 3.11):
pip3 install llmsql
1. Run Inference
from llmsql import LLMSQLVLLMInference

# Initialize inference engine
inference = LLMSQLVLLMInference(
    model_name="Qwen/Qwen2.5-1.5B-Instruct",  # or any Hugging Face causal LM
    tensor_parallel_size=1,
)

# Run generation
results = inference.generate(
    output_file="path_to_your_outputs.jsonl",
    questions_path="data/questions.jsonl",
    tables_path="data/tables.jsonl",
    shots=5,
    batch_size=8,
    max_new_tokens=256,
    temperature=0.7,
)
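To make the shots parameter concrete: few-shot prompting prepends a handful of solved question/SQL pairs to the prompt before the new question. The sketch below illustrates that general idea only. The prompt template LLMSQL actually uses is not shown in this README, and the function name build_few_shot_prompt, the example dictionaries, and the text layout are all hypothetical.

```python
from typing import Sequence


def build_few_shot_prompt(
    examples: Sequence[dict],
    question: str,
    columns: Sequence[str],
    shots: int = 5,
) -> str:
    """Assemble a few-shot Text-to-SQL prompt (hypothetical layout, not LLMSQL's)."""
    parts = []
    # Use at most `shots` solved examples as demonstrations.
    for ex in examples[:shots]:
        parts.append(
            f"Columns: {', '.join(ex['columns'])}\n"
            f"Question: {ex['question']}\n"
            f"SQL: {ex['sql']}"
        )
    # The new question goes last, with the SQL left for the model to complete.
    parts.append(
        f"Columns: {', '.join(columns)}\n"
        f"Question: {question}\n"
        f"SQL:"
    )
    return "\n\n".join(parts)


# Made-up demonstrations; real ones would come from the benchmark data.
demo = [
    {"columns": ["player", "team"],
     "question": "Which team does Ann play for?",
     "sql": "SELECT team FROM t WHERE player = 'Ann'"},
    {"columns": ["city", "population"],
     "question": "How many cities are listed?",
     "sql": "SELECT COUNT(city) FROM t"},
]
prompt = build_few_shot_prompt(
    demo, "What is the population of Oslo?", ["city", "population"], shots=1
)
print(prompt)
```

With shots=1 only the first demonstration is included, so the prompt holds one solved pair followed by the open question. Setting shots=0 in this sketch would yield a zero-shot prompt.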
2. Evaluate Results
from llmsql import LLMSQLEvaluator
evaluator = LLMSQLEvaluator(workdir_path="llmsql_workdir")
report = evaluator.evaluate(outputs_path="path_to_your_outputs.jsonl")
print(report)
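A common way to score Text-to-SQL predictions is execution matching: run the predicted query and the reference query against the same database and compare the returned rows. The sketch below shows that idea with Python's built-in sqlite3 module. Whether LLMSQLEvaluator applies exactly this criterion is not stated here, so treat it as an illustration; execution_match and the toy players table are hypothetical.

```python
import sqlite3


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """Return True if both queries run and yield the same rows (order ignored)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as sorted lists so row order does not matter but duplicates do.
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


# Toy in-memory table standing in for a benchmark database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Red", 12), ("Bob", "Blue", 7), ("Cy", "Red", 3)],
)

gold = "SELECT name FROM players WHERE team = 'Red'"
predictions = [
    "SELECT name FROM players WHERE team = 'Red' ORDER BY points",  # same rows
    "SELECT name FROM players WHERE points > 5",                    # different rows
    "SELECT nme FROM players",                                      # invalid column
]
matches = [execution_match(conn, p, gold) for p in predictions]
accuracy = sum(matches) / len(matches)
print(matches, accuracy)
```

Comparing result rows rather than query strings means two differently written queries can both count as correct, which matters when an LLM phrases the SQL differently from the reference.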
Finetuning (Optional)
If you want to adapt a base model on LLMSQL:
llmsql finetune --config_file examples/example_finetune_args.yaml
This trains a model on the train/val splits with the parameters provided in the config file. You can find an example config file here.
Suggested Workflow
- Primary: Run inference on dataset/questions.jsonl → Evaluate with evaluation/.
- Secondary (optional): Fine-tune on train/val → Test on test_questions.jsonl.
License & Citation
Please cite LLMSQL if you use it in your work:
@inproceedings{llmsql_bench,
  title={LLMSQL: Upgrading WikiSQL for the LLM Era of Text-to-SQL},
  author={Pihulski, Dzmitry and Charchut, Karol and Novogrodskaia, Viktoria and Koco{\'n}, Jan},
  booktitle={2025 IEEE International Conference on Data Mining Workshops (ICDMW)},
  year={2025},
  organization={IEEE}
}
Download files
File details
Details for the file llmsql-0.1.3.tar.gz.
File metadata
- Download URL: llmsql-0.1.3.tar.gz
- Upload date:
- Size: 19.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 05bbbfc57299002345c650d652d5407717bb27cda6c7ad6dbaa315b963392d84 |
| MD5 | d2936495ee7fe7a5d5ab264d48dd9a20 |
| BLAKE2b-256 | 29b7628410e7e7c2e514d4b279ddd33c43e2b4fbb0b5096d0714e8345c7e7cb9 |
Provenance
The following attestation bundles were made for llmsql-0.1.3.tar.gz:
Publisher: publish.yml on LLMSQL/llmsql-benchmark
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmsql-0.1.3.tar.gz
- Subject digest: 05bbbfc57299002345c650d652d5407717bb27cda6c7ad6dbaa315b963392d84
- Sigstore transparency entry: 560612850
- Sigstore integration time:
- Permalink: LLMSQL/llmsql-benchmark@852fdacd5e1aa0e9afd9437a91cc170c9ce0c43b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/LLMSQL
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@852fdacd5e1aa0e9afd9437a91cc170c9ce0c43b
- Trigger Event: workflow_dispatch
File details
Details for the file llmsql-0.1.3-py3-none-any.whl.
File metadata
- Download URL: llmsql-0.1.3-py3-none-any.whl
- Upload date:
- Size: 24.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 493455094ce14da1f8dd9e5ccae8dd9f9d74b5cd18f9565413d5d50f192d351d |
| MD5 | 92b8ac1fef90abe8e12f8611f3baa565 |
| BLAKE2b-256 | 14fd204aff9dbe51e28df82215909bf96c69ba41f0ced98932cda6fa41268bad |
Provenance
The following attestation bundles were made for llmsql-0.1.3-py3-none-any.whl:
Publisher: publish.yml on LLMSQL/llmsql-benchmark
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmsql-0.1.3-py3-none-any.whl
- Subject digest: 493455094ce14da1f8dd9e5ccae8dd9f9d74b5cd18f9565413d5d50f192d351d
- Sigstore transparency entry: 560612860
- Sigstore integration time:
- Permalink: LLMSQL/llmsql-benchmark@852fdacd5e1aa0e9afd9437a91cc170c9ce0c43b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/LLMSQL
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@852fdacd5e1aa0e9afd9437a91cc170c9ce0c43b
- Trigger Event: workflow_dispatch