
BenchLoop

BenchLoop is a Python library for managing, processing, and benchmarking datasets in SQLite databases—designed for AI pipelines, LLM prompt engineering, and dataset curation.


🚀 Features

  • Load & Update Data: Ingest CSV, JSON, or Python dicts into SQLite with automatic table/column creation.
  • Flexible Filtering: Query and filter rows with SQL-like conditions.
  • Prompt Execution: Run prompts row-by-row, substitute variables, call LLMs (OpenAI), and store responses.
  • Dataset Export: Export training datasets in JSONL (OpenAI "messages" or input/output format).
  • Benchmarking: Compare AI responses vs. ground truth with exact and fuzzy match metrics.
  • Reproducible Loops: Build scalable, iterative data workflows for AI/ML.
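The exact and fuzzy match metrics mentioned above can be sketched with the standard library. This is only an illustration of what such metrics compute; BenchLoop's actual scoring functions may normalize or weight differently.

```python
import difflib

def exact_match(ai: str, truth: str) -> bool:
    # Case-insensitive, whitespace-trimmed equality.
    return ai.strip().lower() == truth.strip().lower()

def fuzzy_score(ai: str, truth: str) -> float:
    # Similarity ratio in [0, 1] based on longest matching subsequences.
    return difflib.SequenceMatcher(None, ai.lower(), truth.lower()).ratio()

print(exact_match("Zapato", " zapato "))          # True
print(fuzzy_score("red shoe", "a red shoe"))      # high, but below 1.0
```

A fuzzy threshold (e.g. accept scores above 0.8) lets benchmarks tolerate phrasing differences that an exact comparison would count as failures.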

📦 Installation

BenchLoop is available on PyPI. Install it with:

pip install benchloop

Note: You will also need the openai package for LLM prompt execution:

pip install openai

🏁 Quickstart

from benchloop.loader import load_table
from benchloop.prompt_runner import execute_prompt_on_table
from benchloop.dataset_exporter import export_training_dataset
from benchloop.benchmarker import benchmark_responses

# 1. Load data
load_table(
    table_name="products",
    data_source=[{"id": 1, "name": "Zapato", "price": "50"}],
    db_path="mydb.sqlite"
)

# 2. Run prompts and store LLM responses
execute_prompt_on_table(
    table_name="products",
    prompt_template="Describe the product {name} that costs {price} dollars.",
    columns_variables=["name", "price"],
    result_mapping={"response": "llm_response"},
    db_path="mydb.sqlite",
    model="gpt-4o",
    api_key="sk-...",
)

# 3. Export dataset for training
export_training_dataset(
    table_name="products",
    prompt_template="Describe the product {name} that costs {price} dollars.",
    response_column="llm_response",
    output_file="dataset.jsonl",
    db_path="mydb.sqlite",
    format="messages"
)

# 4. Benchmark responses
benchmark_responses(
    table_name="products",
    column_ai="llm_response",
    column_ground_truth="ground_truth",
    db_path="mydb.sqlite",
    benchmark_tag=None
)
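With format="messages", each line of dataset.jsonl is presumably a record in the OpenAI fine-tuning "messages" layout, built from the rendered prompt and the stored response. The exact field layout below is an assumption based on that standard format, not a guarantee of BenchLoop's output:

```python
import json

# A hypothetical row after step 2 has filled in llm_response.
row = {"name": "Zapato", "price": "50", "llm_response": "A sturdy leather shoe."}

# Render the same prompt template used at export time.
prompt = "Describe the product {name} that costs {price} dollars.".format(**row)

# One JSONL record in OpenAI "messages" fine-tuning format (assumed layout).
record = {
    "messages": [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": row["llm_response"]},
    ]
}
line = json.dumps(record)
print(line)
```

The alternative input/output format would instead pair the rendered prompt and the response as two top-level fields, one JSON object per line.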

📚 Documentation


🤝 Contributing

Contributions, issues, and feature requests are welcome!
Open an issue or submit a pull request on GitHub.


📝 License

MIT License


BenchLoop makes dataset curation, prompt engineering, and benchmarking fast, reproducible, and robust for modern AI workflows.
