Gage support for Inspect AI
Gage Inspect
Gage Inspect extends Inspect AI to support general LLM app development and running tasks in production endpoints. It's designed for programmers who want to build LLM applications that leverage Inspect AI for evaluations.
Inspect AI is open source software used by the AI safety community, AI labs, and the general community for defining and running evaluations.
Gage Inspect works with Gage CLI, a set of command line tools that enable programmer workflows for building and improving Inspect AI tasks.
Gage Inspect is available as open source software under the MIT license.
Visit Gage documentation for a more complete guide to using Gage.
Motivation
Gage integrates with Inspect AI to enable eval-driven development. Evaluation support is built into your code from day one. Measure in development and test to improve your application and establish baselines. Measure in production to catch regressions and outliers.
Quick start
To use this library, install it using pip.
pip install gage-inspect
Here's a simple Inspect task that can be run from the command line.
from inspect_ai import Task, task
from inspect_ai.solver import generate, prompt_template

from gage_inspect.task import run_task


@task
def funny():
    return Task(
        solver=[
            prompt_template("Say something funny about {prompt} in 5 words or less"),
            generate(),
        ]
    )


if __name__ == "__main__":
    import sys

    resp = run_task(
        funny(),
        input=sys.argv[1],
        model=sys.argv[2],
    )
    print(resp.completion)
To run this task from the command line, save the code to a file named funny.py.
For OpenAI models, install the openai Python package.
pip install openai
Specify your API key for OpenAI using OPENAI_API_KEY.
export OPENAI_API_KEY='*****'
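A missing API key typically surfaces only when the model provider is first called. As a minimal sketch, the check below fails early with a clear message instead. The helper name require_env is hypothetical and is not part of Gage Inspect or Inspect AI.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Fail before any model call is attempted.
        raise RuntimeError(f"{name} is not set; export it before running the task")
    return value
```

Calling require_env("OPENAI_API_KEY") at the top of the `__main__` block in funny.py is one way to use it.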
Run the task from the command line.
python funny.py cats openai/gpt-4.1
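The script above reads sys.argv directly, so a missing argument raises an IndexError. The sketch below shows one possible alternative using argparse, with the model made optional. It assumes the code is saved as a separate, hypothetical cli.py next to funny.py; parse_args and main are illustrative names, and the run_task call mirrors the one shown above.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the topic and an optional model name from command line arguments."""
    parser = argparse.ArgumentParser(description="Say something funny about a topic")
    parser.add_argument("topic", help="what the joke is about")
    parser.add_argument(
        "model",
        nargs="?",
        default="openai/gpt-4.1",
        help="model to use (default: %(default)s)",
    )
    return parser.parse_args(argv)


def main(argv: list[str]) -> None:
    args = parse_args(argv)
    # Imported after parsing so usage errors and --help work without loading the task.
    from gage_inspect.task import run_task
    from funny import funny

    resp = run_task(funny(), input=args.topic, model=args.model)
    print(resp.completion)
```

To run it as a script, call main(sys.argv[1:]) from a `__main__` guard at the bottom of the file.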
Task endpoint
Use FastAPI to create an HTTP endpoint for the task.
Save this code to a file named serve.py:
from fastapi import FastAPI

from gage_inspect.task import run_task

from funny import funny

app = FastAPI()


@app.get("/funny/{topic}")
def get_funny(topic, model="openai/gpt-4.1"):
    resp = run_task(funny(), topic, model=model)
    return resp.completion
This code requires the fastapi[standard] package.
pip install "fastapi[standard]"
Start an endpoint using the fastapi command.
fastapi run serve.py
Call the task using curl:
curl localhost:8000/funny/cats
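The same call can be made from Python. Because model is a plain function parameter in serve.py rather than part of the path, FastAPI treats it as a query parameter. The sketch below uses only the standard library; funny_url and fetch_funny are hypothetical helper names, and the base URL assumes the endpoint is listening on localhost:8000 as in the curl example.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def funny_url(topic, model=None, base="http://localhost:8000"):
    """Build the endpoint URL, escaping the topic and the optional model."""
    path = "/funny/" + quote(topic, safe="")
    query = "?" + urlencode({"model": model}) if model else ""
    return base + path + query


def fetch_funny(topic, model=None):
    """Call the endpoint and decode the JSON-encoded string it returns."""
    with urlopen(funny_url(topic, model)) as resp:
        return json.load(resp)
```

With the server running, print(fetch_funny("cats")) should print the completion for that topic.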
For a more detailed example of serving a task, see examples/add.
Evaluate the task
Modify funny.py to add a scorer and a dataset of sample inputs.
from inspect_ai import Task, task
from inspect_ai.solver import generate, prompt_template

from gage_inspect.dataset import dataset
from gage_inspect.scorer import llm_judge


@task
def funny():
    return Task(
        solver=[
            prompt_template("Say something funny about {prompt} in 5 words or less"),
            generate(),
        ],
        scorer=llm_judge(),
    )


@dataset
def samples():
    return ["birds", "cows", "cats", "corn", "barns"]
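Before spending model calls on an eval, it can help to preview the prompts the samples will produce. The sketch below does not call Inspect AI or Gage Inspect. It assumes each sample input replaces the {prompt} placeholder in the template, and it approximates that with plain str.format. preview_prompts is a hypothetical helper name.

```python
TEMPLATE = "Say something funny about {prompt} in 5 words or less"
SAMPLES = ["birds", "cows", "cats", "corn", "barns"]


def preview_prompts(template, inputs):
    """Render the template once per sample input (an approximation of the solver)."""
    return [template.format(prompt=text) for text in inputs]


for line in preview_prompts(TEMPLATE, SAMPLES):
    print(line)
```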
Evaluate this task using Inspect AI.
INSPECT_EVAL_MODEL=openai/gpt-4.1 inspect eval funny.py
Alternatively, use the Gage CLI.
Install gage-cli.
pip install gage-cli
Use gage eval to run the task. Gage asks for input and calls Inspect AI to run the eval.
gage eval funny
Use either Inspect View or Gage Review to examine the eval logs.
Inspect View is a web app that runs locally.
inspect view
Visit http://127.0.0.1:7575 to view the Inspect logs.
Alternatively, use Gage Review, a terminal-based application that provides another interface to Inspect logs.
gage review
For more information on Gage CLI, see the gage-cli project.
- Use Inspect AI commands for advanced applications or where Gage's simplified interfaces are insufficient.
- Use Gage CLI for dialog-based commands and terminal-based log reviews.
Contributing
See our contribution policy.