End-to-end agent testing — assert an agent's tools fired, via Revefi LLM observability
revefi-agent-test
A small Python harness that asserts an AI agent actually called the tools it was supposed to —
verified against Revefi's LLM observability spans, not the answer text. Works for any agent
instrumented with revefi-llm-sdk.
Per case it: generates a per-run test_prompt_id → POSTs the case body to the agent's url with a
test-prompt-id header → polls Revefi's test-span-data API by that id → checks the run's span for the
required_tools → prints the result. Exit code is non-zero if any case fails.
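The per-case flow above can be sketched roughly as follows. This is not the package's actual implementation: `run_case`, `post_to_agent`, `fetch_span`, and `tools_in_span` are hypothetical names, and the HTTP and Revefi calls are passed in as callables so the sketch makes no assumption about the real endpoints or response shapes.

```python
import time
import uuid
from typing import Callable, Iterable, Optional


def run_case(
    case: dict,
    post_to_agent: Callable[[str, dict, dict], None],
    fetch_span: Callable[[str], Optional[dict]],
    tools_in_span: Callable[[dict], Iterable[str]],
    max_polls: int = 5,
    poll_interval_s: float = 0.0,
) -> dict:
    """Drive one case and report which required tools were missing."""
    # 1. Fresh id per run, so spans from earlier runs cannot match.
    test_prompt_id = uuid.uuid4().hex

    # 2. POST the body unchanged; the id travels only in the header.
    post_to_agent(case["url"], case["body"], {"test-prompt-id": test_prompt_id})

    # 3. Poll until a span tagged with this id shows up, or give up.
    span = None
    for _ in range(max_polls):
        span = fetch_span(test_prompt_id)
        if span is not None:
            break
        time.sleep(poll_interval_s)

    # 4. Compare the tools seen in the span with the required ones.
    seen = set(tools_in_span(span)) if span is not None else set()
    missing = [t for t in case["required_tools"] if t not in seen]
    return {
        "name": case["name"],
        "passed": span is not None and not missing,
        "missing": missing,
    }
```

A caller would then exit non-zero if any returned result has `passed` set to `False`, which matches the exit-code behaviour described above.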
Layout
revefi_agent_test/__init__.py   # all the logic + the CLI (run_test, AgentTestCase, RevefiConfig, main)
revefi_agent_test/__main__.py   # lets you do `python -m revefi_agent_test`
examples/raden.yaml             # a worked example config you copy and edit
pyproject.toml                  # package metadata (the `revefi-agent-test` command)
Run it
pip install -e .                 # from this repo (editable; no PyPI needed)
cp examples/raden.yaml my.yaml   # copy the example, then edit the config block + cases
export REVEFI_API_KEY=…          # or put api_key in the YAML (env wins over the file)
revefi-agent-test --config my.yaml
The config (one generic format)
A single YAML file: a config block (how to reach Revefi to read spans back) plus cases, each
POSTing a body to a url.
config:
  base_url: https://your-revefi-instance.com/v1   # Revefi public API base, used to read spans back
  # api_key: "…"                                  # or set REVEFI_API_KEY (env wins); never commit secrets
cases:
  - name: web search
    url: https://your-agent.example.com/run       # the agent endpoint to drive
    body:                                         # POSTed to `url` verbatim; shape is up to the agent
      input: "Who won the 2024 IPL final? Use web search."
    required_tools:
      - web_search_tool
body is opaque: it is whatever the agent under test expects. required_tools is the whole point of
the test: the tool names that must show up in the run's span for the case to pass.
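The "env wins over the file" rule for the API key could look something like the sketch below. `resolve_api_key` is a hypothetical helper, not a function the package is known to export, and treating an empty environment variable as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    config_block: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key, preferring REVEFI_API_KEY over the YAML value."""
    env = os.environ if env is None else env
    # Assumption: an empty env var counts as "not set" and falls through
    # to the api_key from the config block.
    key = env.get("REVEFI_API_KEY") or config_block.get("api_key")
    if not key:
        raise ValueError("no API key: set REVEFI_API_KEY or config.api_key")
    return key
```

Passing `env` explicitly keeps the precedence rule easy to check without touching the real process environment.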
As a library
from revefi_agent_test import RevefiConfig, AgentTestCase, run_test

cfg = RevefiConfig(base_url="https://your-revefi-instance.com/v1", api_key="…")
cases = [
    AgentTestCase(
        name="web search",
        url="https://your-agent.example.com/run",
        body={"input": "Who won the 2024 IPL final? Use web search."},
        required_tools=["web_search_tool"],
    ),
]
assert all(r.passed for r in run_test(cfg, cases))
How test_prompt_id reaches the agent
The harness attaches a per-run test-prompt-id header to every agent call. The agent reads it
off its inbound request and forwards it to revefi_llm_sdk.set_request_test_prompt_id(...) so the
run's spans get tagged — in the agent's own request handler, or in whatever gateway fronts it. The
harness never touches the body, so adopting this needs no change to your agent's request schema.
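On the agent side, the forwarding step might be sketched like this. The setter is injected so the example runs without the SDK installed; in a real agent it would presumably be `revefi_llm_sdk.set_request_test_prompt_id`, and the assumption that it takes the id as a single string argument is not confirmed by this README. `forward_test_prompt_id` and `HEADER_NAME` are hypothetical names.

```python
from typing import Callable, Mapping, Optional

HEADER_NAME = "test-prompt-id"


def forward_test_prompt_id(
    headers: Mapping[str, str],
    set_id: Callable[[str], None],
) -> Optional[str]:
    """Tag the current request's spans if the harness header is present."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    test_prompt_id = lowered.get(HEADER_NAME)
    if test_prompt_id:
        # In a real agent, set_id would be
        # revefi_llm_sdk.set_request_test_prompt_id (signature assumed).
        set_id(test_prompt_id)
    return test_prompt_id
```

Calling this at the top of the request handler (or in gateway middleware) leaves ordinary traffic untouched, since requests without the header are simply not tagged.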
How tools are detected
Verification reads the run's latest SPAN_KIND_CLIENT span from Revefi's test-span-data API and collects
tool names from extractedData.promptsList[*].toolCallsList[*].name (and completionsList[*]),
comparing them against required_tools.
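A minimal sketch of that extraction, assuming the span arrives as a plain dict and that entries in completionsList carry a toolCallsList in the same way prompt entries do (the README is terse on that point). `collect_tool_names` and `missing_tools` are hypothetical names.

```python
from typing import Any, Dict, List, Set


def collect_tool_names(span: Dict[str, Any]) -> Set[str]:
    """Gather tool-call names from a span's extractedData."""
    extracted = span.get("extractedData") or {}
    names: Set[str] = set()
    # Assumption: both lists hold entries with an optional toolCallsList.
    for list_key in ("promptsList", "completionsList"):
        for entry in extracted.get(list_key) or []:
            for call in entry.get("toolCallsList") or []:
                name = call.get("name")
                if name:
                    names.add(name)
    return names


def missing_tools(span: Dict[str, Any], required_tools: List[str]) -> List[str]:
    """Return the required tools that never appear in the span."""
    seen = collect_tool_names(span)
    return [tool for tool in required_tools if tool not in seen]
```

Under these assumptions a case passes exactly when `missing_tools(span, required_tools)` comes back empty; missing or null lists are treated as "no tool calls" rather than as errors.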
CI
Run it in CI by pointing --config at a YAML committed to your repo and injecting the API token via
a secret (the REVEFI_API_KEY env var) — never commit the token.