RedLite
LLM testing on steroids
An opinionated toolset for testing Language Models for safety.
User experience
- Install: pip install redlite[all]
- Generate several runs (using Python scripting, see below)
- Browse the runs: redlite server --port <PORT>
Python API
import os
from redlite import run, load_dataset
from redlite.openai import OpenAIModel
from redlite.metric import PrefixMetric

model = OpenAIModel(api_key=os.environ["OPENAI_API_KEY"])
dataset = load_dataset("hf:innodatalabs/rt-gaia")
metric = PrefixMetric(ignore_case=True, ignore_punct=True, strip=True)

run(model=model, dataset=dataset, metric=metric)
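PrefixMetric above is one of redlite's built-in metrics. Assuming a metric is essentially a callable that scores the model's actual response against the expected one and returns a float in [0, 1] (check the redlite docs for the exact interface), a custom metric might be sketched like this; the name and normalization choices here are illustrative, not part of redlite:

```python
import string


def exact_match_metric(expected: str, actual: str) -> float:
    """Hypothetical custom metric: 1.0 if normalized responses match, else 0.0.

    Assumes a redlite metric is a callable scoring (expected, actual) -> float;
    verify the actual interface against the redlite documentation.
    """

    def normalize(text: str) -> str:
        # Mirror the PrefixMetric options used above: strip, lowercase,
        # and drop punctuation before comparing.
        text = text.strip().lower()
        return text.translate(str.maketrans("", "", string.punctuation))

    return 1.0 if normalize(expected) == normalize(actual) else 0.0
```

If the callable interface holds, such a function could be passed directly as metric=exact_match_metric to run(...).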
Develop
python -m venv .venv
. .venv/bin/activate
pip install -e .[dev,all]
Make commands:
- test
- test-server
- lint
- wheel
- docs
TODO
- deps cleanup (randomname!)
- review/improve module structure
- automate CI/CD
- write docs
- publish docs automatically (CI/CD)
- web UI styling
- better test server
- tests
- Integrations (HF, OpenAI, Anthropic, vLLM)
- Fix data format in HF datasets (innodatalabs/rt-* ones) to match standard
- more robust backend API (future-proof)
- better error handling for missing deps
- document which deps we need when
- export to CSV