
BenchLoop

BenchLoop is a Python library for managing, processing, and benchmarking datasets in SQLite databases, designed for AI pipelines, LLM prompt engineering, and dataset curation.


🚀 Features

  • Load & Update Data: Ingest CSV, JSON, or Python dicts into SQLite with automatic table/column creation.
  • Flexible Filtering: Query and filter rows with powerful, SQL-like conditions.
  • Prompt Execution: Run prompts row-by-row, substitute variables, call LLMs (OpenAI), and store responses.
  • Dataset Export: Export training datasets in JSONL (OpenAI "messages" or input/output format).
  • Benchmarking: Compare AI responses vs. ground truth with exact and fuzzy match metrics.
  • Reproducible Loops: Build scalable, iterative data workflows for AI/ML.

📦 Installation

BenchLoop is available on PyPI. Install it with:

pip install benchloop

Note: You will also need the openai package for LLM prompt execution:

pip install openai

🏁 Quickstart

from benchloop.loader import load_table
from benchloop.prompt_runner import execute_prompt_on_table
from benchloop.dataset_exporter import export_training_dataset
from benchloop.benchmarker import benchmark_responses

# 1. Load data
load_table(
    table_name="products",
    data_source=[{"id": 1, "name": "Zapato", "price": "50"}],
    db_path="mydb.sqlite"
)

# 2. Run prompts and store LLM responses
execute_prompt_on_table(
    table_name="products",
    prompt_template="Describe the product {name} that costs {price} dollars.",
    columns_variables=["name", "price"],
    result_mapping={"response": "llm_response"},
    db_path="mydb.sqlite",
    model="gpt-4o",
    api_key="sk-...",
)

# 3. Export dataset for training
export_training_dataset(
    table_name="products",
    prompt_template="Describe the product {name} that costs {price} dollars.",
    response_column="llm_response",
    output_file="dataset.jsonl",
    db_path="mydb.sqlite",
    format="messages"
)

# 4. Benchmark responses
benchmark_responses(
    table_name="products",
    column_ai="llm_response",
    column_ground_truth="ground_truth",
    db_path="mydb.sqlite",
    benchmark_tag=None
)
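The exact and fuzzy match metrics that `benchmark_responses` reports can be approximated with the standard library's difflib. This is an illustrative sketch under that assumption (the `score` helper is hypothetical, not BenchLoop's scoring code):

```python
from difflib import SequenceMatcher

def score(ai: str, truth: str) -> dict:
    """Exact match (case/whitespace-insensitive) plus a fuzzy ratio in [0, 1]."""
    exact = ai.strip().lower() == truth.strip().lower()
    fuzzy = SequenceMatcher(None, ai.lower(), truth.lower()).ratio()
    return {"exact": exact, "fuzzy": round(fuzzy, 3)}
```

A fuzzy threshold (e.g. ratio ≥ 0.8) is useful when LLM responses paraphrase the ground truth rather than reproduce it verbatim.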

📚 Documentation


🤝 Contributing

Contributions, issues, and feature requests are welcome!
Open an issue or submit a pull request on GitHub.


📝 License

MIT License


BenchLoop makes dataset curation, prompt engineering, and benchmarking fast, reproducible, and robust for modern AI workflows.
