Gage support for Inspect AI

Gage Inspect

Gage Inspect extends Inspect AI to support general LLM app development and running tasks in production endpoints. It's designed for programmers who want to build LLM applications that leverage Inspect AI for evaluations.

Inspect AI is open source software used by the AI safety community, AI labs, and the general community for defining and running evaluations.

Gage Inspect works with Gage CLI, a set of command line tools that enable programmer workflows for building and improving Inspect AI tasks.

Gage Inspect is available as open source software under the MIT license.

Visit Gage documentation for a more complete guide to using Gage.

Motivation

Gage integrates with Inspect AI to enable eval-driven development. Evaluation support is built into your code from day one. Measure in development and test to improve your application and establish baselines. Measure in production to catch regressions and outliers.

Quick start

To use this library, install it using pip.

pip install gage-inspect

Here's a simple Inspect task that can be run from the command line.

from inspect_ai import Task, task
from inspect_ai.solver import generate, prompt_template
from gage_inspect.task import run_task

@task
def funny():
    return Task(
        solver=[
            prompt_template("Say something funny about {prompt} in 5 words or less"),
            generate(),
        ]
    )

if __name__ == "__main__":
    import sys
    resp = run_task(
        funny(),
        input=sys.argv[1],
        model=sys.argv[2],
    )
    print(resp.completion)

To run this task from the command line, save the code to a file named funny.py.

For OpenAI models, install the openai Python package.

pip install openai

Specify your API key for OpenAI using OPENAI_API_KEY.

export OPENAI_API_KEY='*****'
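Because a missing key only surfaces once the model is called, a small pre-flight check can fail earlier with a clearer message. The sketch below is an illustration, not part of Gage Inspect: the `missing_api_key` helper and the `PROVIDER_KEYS` mapping are hypothetical names, and only the `openai` to `OPENAI_API_KEY` pairing comes from this README.

```python
import os

# Hypothetical mapping from model provider prefix to the environment
# variable it needs. Only the OpenAI entry is taken from this README.
PROVIDER_KEYS = {"openai": "OPENAI_API_KEY"}


def missing_api_key(model, env=None):
    """Return the name of the unset key variable for `model`, or None.

    `model` uses the provider/name form shown in this README,
    e.g. "openai/gpt-4.1". Unknown providers are not checked.
    """
    env = os.environ if env is None else env
    provider = model.split("/", 1)[0]
    key = PROVIDER_KEYS.get(provider)
    if key is not None and not env.get(key):
        return key
    return None


# Demonstration with an explicit, empty environment.
print(missing_api_key("openai/gpt-4.1", env={}))
```

Calling this before `run_task` lets the script exit with a message naming the missing variable.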

Run the task from the command line.

python funny.py cats openai/gpt-4.1
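The script above reads `sys.argv` by index, so leaving out an argument raises an `IndexError`. A minimal sketch of an alternative using the standard library's `argparse` follows. The `parse_args` helper, the argument names, and the default model are assumptions made for illustration; they are not part of Gage Inspect.

```python
import argparse


def parse_args(argv=None):
    """Parse the topic and an optional model from the command line.

    Passing argv=None makes argparse read sys.argv[1:].
    """
    parser = argparse.ArgumentParser(description="Run the funny task once.")
    parser.add_argument("topic", help="what to joke about, e.g. cats")
    parser.add_argument(
        "model",
        nargs="?",  # optional positional argument
        default="openai/gpt-4.1",  # assumed default, mirrors the example command
        help="model in provider/name form",
    )
    return parser.parse_args(argv)


# Demonstration with an explicit argument list instead of sys.argv.
args = parse_args(["cats"])
print(args.topic, args.model)
```

In funny.py, the values would then be passed on as `run_task(funny(), input=args.topic, model=args.model)`, matching the call shown earlier.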

Task endpoint

Use FastAPI to create an HTTP endpoint for the task.

Save this code to a file named serve.py:

from fastapi import FastAPI
from gage_inspect.task import run_task
from funny import funny

app = FastAPI()

@app.get("/funny/{topic}")
def get_funny(topic, model="openai/gpt-4.1"):
    resp = run_task(funny(), topic, model=model)
    return resp.completion

This code requires the fastapi[standard] package.

pip install "fastapi[standard]"

Start an endpoint using the fastapi command.

fastapi run serve.py

Call the task using curl:

curl localhost:8000/funny/cats
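The same call can be made from Python with only the standard library. This is a sketch under a few assumptions: the server from serve.py is listening on localhost:8000, and the helper names `funny_url` and `get_funny` are hypothetical. Because `model` is a plain function parameter in serve.py, FastAPI treats it as a query parameter, and the returned string arrives JSON-encoded.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def funny_url(topic, model=None, base="http://localhost:8000"):
    """Build the endpoint URL, escaping the topic and the optional model."""
    url = f"{base}/funny/{quote(topic, safe='')}"
    if model is not None:
        url += "?" + urlencode({"model": model})
    return url


def get_funny(topic, model=None):
    """Call the endpoint and decode the JSON string it returns.

    Requires the serve.py endpoint to be running.
    """
    with urlopen(funny_url(topic, model)) as resp:
        return json.loads(resp.read())


# Building a URL needs no running server.
print(funny_url("barn owls", model="openai/gpt-4.1"))
```

Escaping the topic matters for inputs with spaces or other reserved characters, which a bare curl URL would not handle.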

For a more detailed example of serving a task, see examples/add.

Evaluate the task

Modify funny.py to add a scorer and a dataset of samples.

from inspect_ai import Task, task
from inspect_ai.solver import generate, prompt_template
from gage_inspect.dataset import dataset
from gage_inspect.scorer import llm_judge

@task
def funny():
    return Task(
        solver=[
            prompt_template("Say something funny about {prompt} in 5 words or less"),
            generate(),
        ],
        scorer=[llm_judge()],
    )

@dataset
def samples():
    return ["birds", "cows", "cats", "corn", "barns"]

Evaluate this task using Inspect AI.

INSPECT_EVAL_MODEL=openai/gpt-4.1 inspect eval funny.py

Alternatively, use the Gage CLI.

Install gage-cli.

pip install gage-cli

Use gage eval to run the task. Gage asks for input and calls Inspect AI to run the eval.

gage eval funny

Use Inspect View to examine the eval logs.

Inspect View is a web app that runs locally.

inspect view

Visit http://127.0.0.1:7575 to view the Inspect logs.

Alternatively, use Gage Review, a terminal-based application that provides another interface to Inspect logs.

gage review

For more information on Gage CLI, see the gage-cli project.

  • Use Inspect AI commands for advanced applications or where Gage's simplified interfaces are insufficient.

  • Use Gage CLI for dialog-based commands and terminal-based log reviews.

Contributing

See our contribution policy.
