LLM testing framework for validating agent behavior and tool usage
LLM Goose 🪿
LLM-powered testing, traces, and live chat for agent workflows
Goose is a Python library, CLI, and web dashboard for testing and debugging LLM agents.
Scaffold a gooseapp/, point Goose at your real query function, write
goose.case(...) tests, then expose tools and live chat when you want the full dashboard loop.
Why Goose?
- Natural-language expectations – Describe the behavior you want and let Goose validate it.
- Tool call assertions – Check what your agent actually did, not just what it said.
- Full execution traces – Inspect messages, tool calls, tool outputs, and validation results.
- Live chat for iteration – Try agents in the dashboard while you develop.
- Hot reload – Re-run against updated code without restarting the app.
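To make the tool-call point concrete, here is a minimal sketch in plain Python of why checking a trace catches problems that checking the final text does not. The `ToolCall` and `Trace` classes and the `missing_tool_calls` helper are hypothetical illustrations, not part of the Goose API; Goose's own trace and assertion types are described in docs/testing.md.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCall:
    """Hypothetical record of one tool invocation (not a Goose class)."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


@dataclass
class Trace:
    """Hypothetical trace: the final answer plus the tools the agent invoked."""
    final_text: str
    tool_calls: list[ToolCall] = field(default_factory=list)


def missing_tool_calls(trace: Trace, expected: list[str]) -> list[str]:
    """Return the expected tool names that never appear in the trace."""
    called = {call.name for call in trace.tool_calls}
    return [name for name in expected if name not in called]


# The agent says it checked the weather, but the trace shows a different tool.
trace = Trace(
    final_text="I checked the weather: it is sunny.",
    tool_calls=[ToolCall(name="search_docs", arguments={"q": "weather"})],
)
print(missing_tool_calls(trace, ["get_weather"]))
```

A text-only check would accept this answer; the trace check reports that `get_weather` was never called.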
Choose your path
- Framework-agnostic quickstart – integrate Goose into an existing Python app, regardless of framework: docs/getting-started.md
- LangChain / LangGraph integration – keep your existing LangChain-style agent and add Goose around it: docs/integrations/langchain.md
If you are starting from zero, the framework-agnostic quickstart is the default path.
Core workflow
- Scaffold the app with goose init
- Point Goose at query(...) -> AgentResponse in gooseapp/conftest.py
- Write cases in gooseapp/tests/ with goose.case(...)
- Run the first loop with goose test list and goose test run
- Expand into tools, chat, and hot reload through gooseapp/app.py
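The second step is essentially an adapter: one function that calls your existing agent and returns a single response type. The sketch below shows that shape in plain Python under stated assumptions. `StubAgentResponse`, its fields, and `my_agent` are hypothetical stand-ins; the real `AgentResponse` class and the fixture wiring are defined by Goose and documented in docs/getting-started.md.

```python
from dataclasses import dataclass, field


@dataclass
class StubAgentResponse:
    """Hypothetical stand-in for Goose's AgentResponse; the real fields may differ."""
    text: str
    tool_names: list[str] = field(default_factory=list)


def my_agent(question: str) -> dict:
    """Placeholder for your existing agent; returns a raw, app-specific result."""
    return {"answer": f"You asked: {question}", "tools": ["lookup"]}


def query(question: str) -> StubAgentResponse:
    """Adapter: call the real agent, then normalize its output into one type."""
    raw = my_agent(question)
    return StubAgentResponse(text=raw["answer"], tool_names=list(raw["tools"]))


response = query("Hello")
print(response.text)
```

Keeping the adapter this thin means the tests exercise the same code path your application uses, rather than a test-only copy of the agent.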
Install
Required for the first test run:
pip install llm-goose
Optional for the browser UI:
npm install -g @llm-goose/dashboard-cli
goose init creates the path
goose init
gooseapp/
├── README.md
├── __init__.py
├── app.py
├── conftest.py
└── tests/
    ├── __init__.py
    └── test_example.py
- app.py configures tools, live chat agents, and hot reload
- conftest.py wires the Goose fixture to your real query function
- tests/ holds the cases you run from the CLI or dashboard
See docs/goose-init.md for the full scaffold contract.
Minimal first test
from goose.testing import Goose


def test_agent_responds(goose: Goose) -> None:
    # Send one query and validate the reply against natural-language expectations.
    goose.case(
        query="Hello, what can you help me with?",
        expectations=[
            "Agent responds with a greeting or acknowledgment",
            "Agent describes its capabilities or offers assistance",
        ],
    )
The goose argument is injected from the fixture you register in gooseapp/conftest.py. The full query -> fixture -> test path is
documented in docs/getting-started.md.
Key commands
goose init # scaffold gooseapp/
goose test list gooseapp.tests
goose test run gooseapp.tests
goose api
goose-dashboard
Where to go next
- Framework-agnostic quickstart: docs/getting-started.md
- LangChain / LangGraph integration: docs/integrations/langchain.md
- Scaffold details: docs/goose-init.md
- Writing tests: docs/testing.md
- Running the API and CLI loop: docs/running-goose.md
- Using the dashboard: docs/dashboard.md
License
MIT License – see LICENSE for full text.