A Python library for managing, processing, and benchmarking datasets in SQLite databases for AI pipelines and LLM prompt engineering.
Project description
BenchLoop
BenchLoop is a Python library for managing, processing, and benchmarking datasets in SQLite databases—designed for AI pipelines, LLM prompt engineering, and dataset curation.
🚀 Features
- Load & Update Data: Ingest CSV, JSON, or Python dicts into SQLite with automatic table/column creation.
- Flexible Filtering: Query and filter rows with powerful, SQL-like conditions.
- Prompt Execution: Run prompts row-by-row, substitute variables, call LLMs (OpenAI), and store responses.
- Dataset Export: Export training datasets in JSONL (OpenAI "messages" or input/output format).
- Benchmarking: Compare AI responses vs. ground truth with exact and fuzzy match metrics.
- Reproducible Loops: Build scalable, iterative data workflows for AI/ML.
📦 Installation
BenchLoop is available on PyPI. Install it with:
pip install benchloop
Note: You will also need the
openaipackage for LLM prompt execution:pip install openai
🏁 Quickstart
from benchloop.loader import load_table
from benchloop.prompt_runner import execute_prompt_on_table
from benchloop.dataset_exporter import export_training_dataset
from benchloop.benchmarker import benchmark_responses
# 1. Load data
load_table(
table_name="products",
data_source=[{"id": 1, "name": "Zapato", "price": "50"}],
db_path="mydb.sqlite"
)
# 2. Run prompts and store LLM responses
execute_prompt_on_table(
table_name="products",
prompt_template="Describe the product {name} that costs {price} dollars.",
columns_variables=["name", "price"],
result_mapping={"response": "llm_response"},
db_path="mydb.sqlite",
model="gpt-4o",
api_key="sk-...",
)
# 3. Export dataset for training
export_training_dataset(
table_name="products",
prompt_template="Describe the product {name} that costs {price} dollars.",
response_column="llm_response",
output_file="dataset.jsonl",
db_path="mydb.sqlite",
format="messages"
)
# 4. Benchmark responses
benchmark_responses(
table_name="products",
column_ai="llm_response",
column_ground_truth="ground_truth",
db_path="mydb.sqlite",
benchmark_tag=None
)
📚 Documentation
- See the full documentation for API reference, advanced usage, and troubleshooting.
- Example/test script: main.py
🤝 Contributing
Contributions, issues, and feature requests are welcome!
Open an issue or submit a pull request on GitHub.
📝 License
BenchLoop makes dataset curation, prompt engineering, and benchmarking fast, reproducible, and robust for modern AI workflows.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file benchloop-0.1.1.tar.gz.
File metadata
- Download URL: benchloop-0.1.1.tar.gz
- Upload date:
- Size: 78.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
319387f9c25e4bd7fb4959a2e3eb46718766fafe3787d950a10450d34112e972
|
|
| MD5 |
67dc409ccbe96b8bdc0f8264e786eed2
|
|
| BLAKE2b-256 |
a7f7cf899df1d6711e3480a74c45be8869c2e0de5c7244f2843361c0ed2c9d9b
|
File details
Details for the file benchloop-0.1.1-py3-none-any.whl.
File metadata
- Download URL: benchloop-0.1.1-py3-none-any.whl
- Upload date:
- Size: 69.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5bc52c32b4ef01ebd3770a46e67e5a578c37ef9d179377025f8a8fd7c65cb91c
|
|
| MD5 |
16e95e031c23cd707e645a61e9230b62
|
|
| BLAKE2b-256 |
be27a14da2dad9c32f716f98f882acf7e3f3b4090840f006a83a39d339fa46a6
|