LLM testing on steroids

RedLite

An opinionated toolset for testing Conversational Language Models.

Docs

Usage

  1. Install required dependencies

    pip install "redlite[all]"
    
  2. Generate several runs with a Python script (see the examples and the Python API section below)

  3. Review and compare runs

    redlite server --port <PORT>
    

Python API

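import os
from redlite import run, load_dataset
from redlite.openai import OpenAIModel
from redlite.metric import PrefixMetric


# Model under test; the API key is read from the OPENAI_API_KEY environment variable
model = OpenAIModel(api_key=os.environ["OPENAI_API_KEY"])
# Dataset to evaluate on; the "hf:" prefix refers to a Hugging Face dataset
dataset = load_dataset("hf:innodatalabs/rt-gaia")
# Metric used to score each model response
metric = PrefixMetric(ignore_case=True, ignore_punct=True, strip=True)

# Execute one run; review it later with `redlite server`
run(model=model, dataset=dataset, metric=metric)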
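
The example above configures PrefixMetric with ignore_case, ignore_punct and strip. As a rough illustration of what a prefix-style metric with such options might compute, here is a standalone sketch. It is not redlite's implementation: the function name prefix_match, the order of the normalization steps, and the 1.0/0.0 scoring are all assumptions made for illustration.

```python
import string


def prefix_match(expected: str, actual: str, *, ignore_case: bool = False,
                 ignore_punct: bool = False, strip: bool = False) -> float:
    """Hypothetical prefix scorer: 1.0 if `actual` starts with `expected`, else 0.0."""

    def normalize(text: str) -> str:
        # Assumed order: drop punctuation, then lowercase, then trim whitespace.
        if ignore_punct:
            text = text.translate(str.maketrans("", "", string.punctuation))
        if ignore_case:
            text = text.lower()
        if strip:
            text = text.strip()
        return text

    return 1.0 if normalize(actual).startswith(normalize(expected)) else 0.0


# Lenient settings accept a response that differs in case, punctuation and padding.
lenient = prefix_match("Paris", "  paris, France.",
                       ignore_case=True, ignore_punct=True, strip=True)
# Strict settings (all options off) reject the same response.
strict = prefix_match("Paris", "  paris, France.")
```

With all three options enabled, a response only needs to begin with the expected answer after normalization, which tolerates models that add trailing explanations or vary capitalization.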

Goals

  • simple, easy-to-learn API
  • lightweight
  • only necessary dependencies
  • framework-agnostic (PyTorch, TensorFlow, Keras, Flax, JAX)
  • basic analytic tools included

Develop

python -m venv .venv
. .venv/bin/activate
pip install -e ".[dev,all]"

Make commands:

  • test
  • test-server
  • lint
  • wheel
  • docs
  • black

TODO

  • deps cleanup (randomname!)
  • review/improve module structure
  • automate CI/CD
  • write docs
  • publish docs automatically (CI/CD)
  • web UI styling
  • better test server
  • tests
  • Integrations (HF, OpenAI, Anthropic, vLLM)
  • Fix data format in HF datasets (innodatalabs/rt-* ones) to match standard
  • more robust backend API (future-proof)
  • better error handling for missing deps
  • document which deps we need when
  • export to CSV
  • Upload to Zeno

Built Distribution

redlite-0.0.23-py3-none-any.whl (456.0 kB)
