The testing framework for AI agents. Fast, framework-agnostic, CI-ready.
Vigil makes testing AI agents and LLM applications as easy as writing pytest tests. Write tests in plain Python, run them in CI, and catch regressions before they reach production.
pip install vigil-ai
Quick Start
from vigil import test, FunctionAgent, assert_contains, assert_cost_under

# Wrap any function as an agent under test
agent = FunctionAgent(my_chatbot)

@test()
def test_greeting():
    result = agent.run("Hello!")
    assert_contains(result, "hello")
    assert_cost_under(result, 0.01)

@test()
def test_knowledge():
    result = agent.run("What is Python?")
    assert_contains(result, "programming")
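The example above refers to a `my_chatbot` callable without defining it. As a minimal sketch, assuming `FunctionAgent` accepts a function that takes a prompt string and returns the reply as a string (the README does not spell out the expected signature), a stand-in could look like this. The function body is hypothetical and not part of Vigil:

```python
def my_chatbot(prompt: str) -> str:
    """Hypothetical stand-in for a real chatbot; not part of Vigil."""
    text = prompt.strip().lower()
    if text.startswith(("hello", "hi")):
        # Lowercase "hello" so a case-sensitive substring check would also pass.
        return "hello! How can I help you today?"
    if "python" in text:
        return "Python is a high-level programming language."
    return "Sorry, I don't know about that yet."


if __name__ == "__main__":
    print(my_chatbot("Hello!"))
    print(my_chatbot("What is Python?"))
```

A real agent would call a model or service here; a deterministic stub like this is mainly useful for trying out the test runner without incurring API costs.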
Run your tests:
vigil run
# or
pytest --vigil
Features
- Framework-agnostic — test any agent: Python functions, HTTP APIs, CLI tools
- Rich assertions — semantic matching, hallucination detection, cost/latency checks
- Snapshot testing — save golden outputs, detect regressions automatically
- CI-ready — exit codes, JSON reports, GitHub Actions integration
- Zero config — works out of the box, configure when you need to
- Built on pytest — use everything you already know
Assertions
| Assertion | What it checks |
|---|---|
| assert_contains(result, text) | Output contains expected text |
| assert_not_contains(result, text) | Output does not contain text |
| assert_json_valid(result) | Output is valid JSON |
| assert_matches_regex(result, pattern) | Output matches regex pattern |
| assert_cost_under(result, max_dollars) | API cost below threshold |
| assert_tokens_under(result, max_tokens) | Token usage below threshold |
| assert_latency_under(result, max_seconds) | Response time below threshold |
| assert_semantic_match(result, ref, threshold) | Semantically similar to reference |
| assert_no_hallucination(result, context) | Output grounded in provided context |
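To make the table concrete, here is a rough plain-Python sketch of what a few of the simpler checks amount to. It does not use Vigil and is not its implementation. The helper names (`contains`, `json_valid`, `matches_regex`, `timed`) are hypothetical, and details such as case sensitivity are assumptions:

```python
import json
import re
import time


def contains(output: str, text: str) -> bool:
    """True if `text` occurs in `output` (case-sensitive in this sketch)."""
    return text in output


def json_valid(output: str) -> bool:
    """True if `output` parses as JSON."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return False
    return True


def matches_regex(output: str, pattern: str) -> bool:
    """True if `pattern` matches anywhere in `output`."""
    return re.search(pattern, output) is not None


def timed(fn, *args):
    """Call `fn` and return (value, elapsed_seconds)."""
    start = time.perf_counter()
    value = fn(*args)
    return value, time.perf_counter() - start


# A fake agent that always answers with the same JSON document.
reply, seconds = timed(lambda prompt: '{"answer": "Paris"}', "Capital of France?")
assert contains(reply, "Paris")
assert json_valid(reply)
assert matches_regex(reply, r'"answer":\s*"\w+"')
assert seconds < 5.0  # latency budget in seconds
```

The semantic-match and hallucination checks need an embedding model or an LLM judge and are deliberately left out of this sketch.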
Agent Types
from vigil import FunctionAgent, HTTPAgent, CLIAgent
# Test a Python function
agent = FunctionAgent(my_function)
# Test an HTTP endpoint
agent = HTTPAgent("http://localhost:8000/chat")
# Test a CLI tool
agent = CLIAgent("python my_agent.py")
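The idea behind the agent types is that different backends sit behind one `run(prompt)` call. The sketch below illustrates that pattern in plain Python. It is not Vigil's code: `Runnable`, `CallableAdapter`, and `CommandAdapter` are hypothetical names, and passing the prompt to the command on standard input is an assumption about how a CLI wrapper could work.

```python
import subprocess
import sys
from typing import Callable, Protocol


class Runnable(Protocol):
    """Anything that turns a prompt into a reply."""

    def run(self, prompt: str) -> str: ...


class CallableAdapter:
    """Wraps an in-process Python function."""

    def __init__(self, fn: Callable[[str], str]) -> None:
        self.fn = fn

    def run(self, prompt: str) -> str:
        return self.fn(prompt)


class CommandAdapter:
    """Wraps a command-line program; the prompt is sent on stdin."""

    def __init__(self, argv: list[str]) -> None:
        self.argv = argv

    def run(self, prompt: str) -> str:
        done = subprocess.run(
            self.argv, input=prompt, capture_output=True, text=True, check=True
        )
        return done.stdout.strip()


def shout(prompt: str) -> str:
    return prompt.upper()


# The same "agent" exposed two ways: as a function and as a child process.
agents: list[Runnable] = [
    CallableAdapter(shout),
    CommandAdapter([sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]),
]
for agent in agents:
    assert agent.run("hello") == "HELLO"
```

Because every adapter exposes the same method, a test written against `run` does not need to know which kind of agent it is exercising.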
Snapshot Testing
from vigil import test, FunctionAgent
from vigil.snapshots import snapshot

agent = FunctionAgent(my_agent)

@test()
def test_output_stable():
    result = agent.run("Summarize this document")
    snapshot(result, name="summary_output")
Update snapshots when outputs intentionally change:
vigil snapshot update
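Conceptually, snapshot testing records a golden output on the first run and compares later runs against it. The following is a minimal sketch of that idea using only the standard library. `check_snapshot`, the `.txt` file layout, and exact string comparison are all assumptions made for illustration; the README does not describe how Vigil stores or compares snapshots.

```python
import tempfile
from pathlib import Path


def check_snapshot(output: str, name: str, directory: Path, update: bool = False) -> bool:
    """Return True if `output` matches the stored snapshot.

    The first call for a name (or any call with update=True) writes the
    golden file and passes.
    """
    path = directory / f"{name}.txt"
    if update or not path.exists():
        path.write_text(output, encoding="utf-8")
        return True
    return path.read_text(encoding="utf-8") == output


with tempfile.TemporaryDirectory() as tmp:
    snapshots = Path(tmp)
    # First run: nothing stored yet, so the output is recorded.
    assert check_snapshot("A short summary.", "summary_output", snapshots)
    # Same output again: matches the golden file.
    assert check_snapshot("A short summary.", "summary_output", snapshots)
    # Changed output: flagged as a regression.
    assert not check_snapshot("Something else.", "summary_output", snapshots)
    # Intentional change: overwrite the golden file.
    assert check_snapshot("Something else.", "summary_output", snapshots, update=True)
```

Exact matching only works for deterministic outputs; for model responses that vary between runs, a similarity threshold would be a more forgiving comparison.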
CI Integration
# .github/workflows/ai-tests.yml
name: AI Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install vigil-ai
      - run: vigil run --report json > results.json
Configuration
Create a vigil.yaml in your project root:
defaults:
  cost_threshold: 0.05
  latency_threshold: 5.0
  semantic_threshold: 0.85
reporting:
  format: terminal
  verbose: true
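One way to read the `defaults` block is as project-wide limits that individual checks can tighten. The sketch below mirrors those three keys in a small dataclass. It is illustrative only: `Thresholds` and `within_limits` are hypothetical names, the units (dollars and seconds) are inferred from the assertion signatures, and nothing here reflects how Vigil actually loads or applies `vigil.yaml`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Thresholds:
    """Mirror of the `defaults` keys shown in the config above."""

    cost_threshold: float = 0.05      # assumed unit: dollars
    latency_threshold: float = 5.0    # assumed unit: seconds
    semantic_threshold: float = 0.85  # assumed: similarity score in [0, 1]


def within_limits(cost: float, latency: float, limits: Thresholds) -> bool:
    """True if a single run stays inside the cost and latency budgets."""
    return cost <= limits.cost_threshold and latency <= limits.latency_threshold


defaults = Thresholds()
# A stricter budget for one test, leaving the other defaults untouched.
strict = replace(defaults, cost_threshold=0.01)

assert within_limits(cost=0.02, latency=1.2, limits=defaults)
assert not within_limits(cost=0.02, latency=1.2, limits=strict)
```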
License
Apache 2.0