
Framework for large language model evaluations

Welcome to Inspect, a framework for large language model evaluations created by the UK AI Security Institute.

Inspect provides many built-in components, including facilities for prompt engineering, tool usage, multi-turn dialog, and model-graded evaluations. Other Python packages can extend Inspect, for example to support new elicitation and scoring techniques.

To get started with Inspect, please see the documentation at https://inspect.aisi.org.uk/.

Inspect also includes a collection of over 200 pre-built evaluations ready to run on any model (learn more at https://inspect.aisi.org.uk/evals/).

Coding agents: a structured index of the docs is published at https://inspect.aisi.org.uk/llms.txt. The user guide is concatenated as Markdown at https://inspect.aisi.org.uk/llms-guide.txt, and https://inspect.aisi.org.uk/llms-full.txt additionally bundles the API and CLI reference. Individual pages are available as Markdown by appending .md to the .html path (e.g. /extensions/index.html.md).
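
The Markdown URL convention described above can be sketched in a few lines of Python. This is only an illustration, assuming the page URL ends in .html; markdown_url is a hypothetical helper name and is not part of Inspect.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the Markdown variant of a docs page by appending .md to its .html path.

    Hypothetical helper for illustration only; not part of the Inspect API.
    """
    parts = urlsplit(page_url)
    if not parts.path.endswith(".html"):
        raise ValueError(f"expected a path ending in .html, got {parts.path!r}")
    # Append .md to the path and drop any fragment, which has no meaning
    # for the raw Markdown file.
    return urlunsplit(parts._replace(path=parts.path + ".md", fragment=""))


print(markdown_url("https://inspect.aisi.org.uk/extensions/index.html"))
```

With the example path from the paragraph above, the helper yields the /extensions/index.html.md form of the URL.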


To work on development of Inspect, clone the repository and install with the -e flag and [dev] optional dependencies:

git clone https://github.com/UKGovernmentBEIS/inspect_ai.git
cd inspect_ai
pip install -e ".[dev]"

Alternatively, if you use uv, sync the development environment from the checked-in lockfile:

uv sync --extra dev

The uv workflow is supported but not required. The uv.lock file records a reproducible development resolution; project dependencies are still declared in requirements*.txt and exposed through pyproject.toml. When changing dependencies, update the appropriate requirements file and refresh the lockfile rather than relying on uv add.

Optionally, install the pre-commit hooks with:

make hooks

Run linting, formatting, and tests with:

make check
make test

When working in a uv-managed environment, prefix those commands with uv run (for example, uv run make check).
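
If you drive these commands from a script, the prefixing rule can be sketched as follows. This is a minimal illustration under stated assumptions: with_uv and its uv_managed flag are hypothetical names introduced here, and the function only builds the argument list rather than running anything.

```python
import shlex


def with_uv(command: str, uv_managed: bool) -> list[str]:
    """Split a command line into argv, prefixed with `uv run` when uv_managed is true.

    Hypothetical helper for illustration only; not part of Inspect or uv.
    """
    argv = shlex.split(command)
    return ["uv", "run", *argv] if uv_managed else argv


# In a uv-managed environment, `make check` becomes `uv run make check`.
print(with_uv("make check", uv_managed=True))
# In a pip-based environment, the command is left unchanged.
print(with_uv("make test", uv_managed=False))
```

The resulting list can be passed to subprocess.run if you want to execute the command.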

If you use VS Code, make sure the recommended extensions (Python, Ruff, and MyPy) are installed. VS Code prompts you to install them when you open the project.

Frontend development (TypeScript)

The web UI lives in a git submodule at src/inspect_ai/_view/ts-mono/. The steps below are needed only if you plan to work on the TypeScript/React frontend; Python-only contributors can skip them entirely.

To initialize the submodule and install its dependencies, see the one-time setup guide.

Documentation

To work on the Inspect documentation, install the optional [doc] dependencies with the -e flag and build the docs:

pip install -e ".[doc]"
cd docs
quarto render # or 'quarto preview'

If you intend to work on the docs iteratively, you'll want to install the Quarto extension in VS Code.


