Hyperband-optimized parallelized prompt and model parameter tuning for evaluating LLMs
HyperEvals
Hyperband-optimized parallelized prompt and model parameter tuning for evaluating LLMs.
Motivation
Evaluating LLMs is notoriously challenging, yet it is critical before deploying them to production with confidence. Seemingly small tweaks to a prompt or upgrades to the model can significantly affect performance across tasks, hence the need for carefully crafted evaluations.
HyperEvals addresses this by tuning prompts and model parameters in parallel and using Hyperband to stop poorly performing configurations early. It is inspired by W&B's sweeps combined with hyperband optimization.
Installation
pip install hyperevals
For development installation:
git clone https://github.com/griffintarpenning/hyperevals.git
cd hyperevals
pip install -e ".[dev]"
Quick Start
# Install the package
pip install hyperevals
# Run with a configuration file
hyperevals run config.yaml
# Show version
hyperevals --version
Usage
MVP Flow
- Create a CSV dataset
- Create a prompt template
- Create an executable Model file
- Create executable scorers
- Create a config file
- Run the evaluation
- Iterate on prompt and model parameters
- Hyperband stops poorly performing configurations early
- The final prompt is reported with its accuracy
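The early-stopping step in the flow above can be illustrated with successive halving, the pruning loop at the core of Hyperband. This is a minimal sketch and not HyperEvals' actual implementation: the function names, the `keep_fraction` parameter, the toy accuracies, and the reading of `bands` as per-round example counts are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def successive_halving(
    candidates: List[str],
    evaluate: Callable[[str, int], float],
    bands: List[int],
    keep_fraction: float = 0.5,
) -> Dict[str, float]:
    """Score candidates on growing example budgets, dropping the weakest each round."""
    survivors = list(candidates)
    scores: Dict[str, float] = {}
    for n_examples in bands:
        # Score every surviving candidate on n_examples examples.
        scores = {c: evaluate(c, n_examples) for c in survivors}
        # Keep the top fraction (at least one) for the next, larger band.
        keep = max(1, int(len(survivors) * keep_fraction))
        survivors = sorted(survivors, key=lambda c: scores[c], reverse=True)[:keep]
    return {c: scores[c] for c in survivors}


# Hypothetical stand-in for running a prompt against a model and scorer.
TOY_ACCURACY = {"prompt_a": 0.55, "prompt_b": 0.80, "prompt_c": 0.65, "prompt_d": 0.40}


def toy_evaluate(candidate: str, n_examples: int) -> float:
    # A real evaluator would score the candidate on n_examples dataset rows.
    return TOY_ACCURACY[candidate]


winner = successive_halving(list(TOY_ACCURACY), toy_evaluate, bands=[10, 20, 30, 40, 50])
print(winner)  # {'prompt_b': 0.8}
```

Full Hyperband runs several such brackets with different trade-offs between the number of candidates and the budget per candidate. The single bracket shown here is only meant to convey why weak prompts stop consuming examples after the first rounds.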
Sample Configuration
dataset: /data/test.csv
prompt_template: /prompts/test.txt
model: /models/test.py
scorer: /scorers/scorer.py
max_parallelism: 2
hyperband:
  min_examples: 10
  bands: [10, 20, 30, 40, 50]
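Before launching a run, it can help to sanity-check a configuration like the one above. The sketch below works on the already-parsed configuration as a plain dictionary. The `check_config` helper is hypothetical and not part of HyperEvals, and the rules it applies (non-empty paths, positive integers, strictly increasing bands) are assumptions about what a reasonable configuration looks like, not documented requirements.

```python
from typing import Any, Dict, List

# Keys from the sample configuration that point at files.
REQUIRED_PATH_KEYS = ("dataset", "prompt_template", "model", "scorer")


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems: List[str] = []
    for key in REQUIRED_PATH_KEYS:
        if not isinstance(config.get(key), str) or not config[key]:
            problems.append(f"{key}: expected a non-empty path string")
    parallelism = config.get("max_parallelism", 1)
    if not isinstance(parallelism, int) or parallelism < 1:
        problems.append("max_parallelism: expected an integer >= 1")
    # A missing or empty hyperband section is treated as an empty mapping.
    hyperband = config.get("hyperband") or {}
    min_examples = hyperband.get("min_examples")
    if not isinstance(min_examples, int) or min_examples < 1:
        problems.append("hyperband.min_examples: expected an integer >= 1")
    bands = hyperband.get("bands") or []
    if not bands or any(b >= n for b, n in zip(bands, bands[1:])):
        problems.append("hyperband.bands: expected a non-empty, strictly increasing list")
    return problems


sample = {
    "dataset": "/data/test.csv",
    "prompt_template": "/prompts/test.txt",
    "model": "/models/test.py",
    "scorer": "/scorers/scorer.py",
    "max_parallelism": 2,
    "hyperband": {"min_examples": 10, "bands": [10, 20, 30, 40, 50]},
}
print(check_config(sample))  # []
```

An empty list means the sample passed every check. If the `hyperband` section is missing or empty, the helper reports both `min_examples` and `bands` as problems.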