AutoArena

AutoArena helps you stack rank LLM outputs against one another using automated judge evaluation.

Install from PyPI and run with:

pip install autoarena
python -m autoarena

Usage

Getting started with AutoArena is simple:

  1. Run AutoArena via python -m autoarena and visit localhost:8899 in your browser.
  2. Create a project via the UI.
  3. Add responses from a model by selecting a CSV file with prompt and response columns.
  4. Configure an automated judge via the UI. Note that most judges require credentials, e.g. X_API_KEY in the environment where you're running AutoArena.
  5. Add responses from a second model. This kicks off an automated judging task in which the judges you configured in the previous step decide which of the uploaded models gave the better response to each prompt.

That's it! After these steps you're fully set up for automated evaluation on AutoArena.
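
As an illustration of step 3, here is a minimal sketch of producing an upload file with the prompt and response columns using only the Python standard library. The helper name write_responses_csv, the file name model_a.csv, and the example rows are hypothetical and not part of AutoArena.

```python
import csv
from pathlib import Path


def write_responses_csv(path, rows):
    """Write (prompt, response) pairs to a CSV with `prompt` and `response` columns.

    Hypothetical helper for illustration; AutoArena only needs the resulting file.
    """
    path = Path(path)
    # newline="" lets the csv module handle line endings and quoting itself.
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["prompt", "response"])
        writer.writeheader()
        for prompt, response in rows:
            writer.writerow({"prompt": prompt, "response": response})
    return path


# Made-up example data; replace with your model's real outputs.
example_rows = [
    ("What is the capital of France?", "Paris."),
    ("Add 2 and 3.", "2 + 3 = 5"),
]

if __name__ == "__main__":
    out = write_responses_csv("model_a.csv", example_rows)
    print(f"Wrote {len(example_rows)} rows to {out}")
```

Writing one such file per model, with the same prompts in each, gives the judges matching pairs of responses to compare.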

Data Storage

Data is stored in ./data/<project>.duckdb files in the directory where you invoked AutoArena. See data/README.md for more details on data storage in AutoArena.
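
To see which projects exist on disk, the ./data/<project>.duckdb layout described above can be listed with a short script. This is a sketch that only inspects file names; the function name list_project_databases is hypothetical, and it does not open or interpret the database contents.

```python
from pathlib import Path


def list_project_databases(working_dir="."):
    """Return {project_name: file_path} for each ./data/<project>.duckdb file.

    Hypothetical helper; it relies only on the file layout described above.
    """
    data_dir = Path(working_dir) / "data"
    if not data_dir.is_dir():
        # AutoArena has not been run from this directory yet.
        return {}
    return {p.stem: p for p in sorted(data_dir.glob("*.duckdb"))}


if __name__ == "__main__":
    for name, path in list_project_databases().items():
        print(f"{name}: {path} ({path.stat().st_size} bytes)")
```

Run it from the same directory where you invoked AutoArena, since the data directory is relative to that location.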

Development

AutoArena uses uv to manage dependencies. To set up this repository for development, run:

uv venv && source .venv/bin/activate
uv pip install --all-extras -r pyproject.toml
uv tool run pre-commit install
uv run python3 -m autoarena --dev

To run AutoArena for development, you will need to run both the backend and frontend services:

  • Backend: uv run python3 -m autoarena --dev (the --dev/-d flag enables automatic service reloading when source files change)
  • Frontend: see ui/README.md

To build a release tarball in the ./dist directory:

./scripts/build.sh

Source Distribution

autoarena-0.1.0b5.tar.gz (1.2 MB)

File details

Details for the file autoarena-0.1.0b5.tar.gz.

File metadata

  • Download URL: autoarena-0.1.0b5.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for autoarena-0.1.0b5.tar.gz:

  • SHA256: fa2a7a58a1a73a5cf9b1826bc9dd6d7e7a6da63d7edcdd6d14acf97a787caad8
  • MD5: 0d92b8cf004d938a78957279d0e89b7f
  • BLAKE2b-256: 7c4862e1b2b6bb530582f28946d814bf53111a1913edd1cb99a7b3102967e6a8
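
If you download the source distribution manually, you can check it against the SHA256 digest listed above. The following is a minimal sketch using Python's hashlib; the function names and the command-line usage are hypothetical, and the expected digest is copied from the list above.

```python
import hashlib
import sys

# SHA256 digest published for autoarena-0.1.0b5.tar.gz (copied from the list above).
EXPECTED_SHA256 = "fa2a7a58a1a73a5cf9b1826bc9dd6d7e7a6da63d7edcdd6d14acf97a787caad8"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical usage: python verify_hash.py autoarena-0.1.0b5.tar.gz
    if len(sys.argv) == 2:
        ok = matches_published_hash(sys.argv[1])
        print("OK" if ok else "MISMATCH")
```

A mismatch usually means the download is incomplete or is a different file than the one described here.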
