The testing framework for AI agents. Fast, framework-agnostic, CI-ready.

Vigil

Quick Start · Features · Assertions · Agent Types · CI Integration


Vigil makes testing AI agents and LLM applications as easy as writing pytest tests. Write tests in plain Python, run them in CI, and catch regressions before they reach production.

pip install vigil-ai

Quick Start

from vigil import test, FunctionAgent, assert_contains, assert_cost_under

# Wrap any function as an agent under test
# (my_chatbot is your own callable, defined elsewhere in your project)
agent = FunctionAgent(my_chatbot)

@test()
def test_greeting():
    result = agent.run("Hello!")
    assert_contains(result, "hello")
    assert_cost_under(result, 0.01)

@test()
def test_knowledge():
    result = agent.run("What is Python?")
    assert_contains(result, "programming")
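
The snippet above leaves `my_chatbot` undefined. A minimal stand-in is sketched below, assuming `FunctionAgent` accepts any callable that takes a prompt string and returns the reply as a string; the source does not confirm that signature, and the canned replies are purely illustrative.

```python
def my_chatbot(prompt: str) -> str:
    """Hypothetical stand-in for a real agent; replace with your own callable."""
    text = prompt.lower()
    if "hello" in text:
        return "hello! How can I help you today?"
    if "python" in text:
        return "Python is a general-purpose programming language."
    return "Sorry, I don't know about that yet."
```

A deterministic stub like this is handy for checking that the test harness itself is wired up before pointing it at a real model.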

Run your tests:

vigil run
# or
pytest --vigil

Features

  • Framework-agnostic — test any agent: Python functions, HTTP APIs, CLI tools
  • Rich assertions — semantic matching, hallucination detection, cost/latency checks
  • Snapshot testing — save golden outputs, detect regressions automatically
  • CI-ready — exit codes, JSON reports, GitHub Actions integration
  • Zero config — works out of the box, configure when you need to
  • Built on pytest — use everything you already know

Assertions

Assertion                                      What it checks
assert_contains(result, text)                  Output contains expected text
assert_not_contains(result, text)              Output does not contain text
assert_json_valid(result)                      Output is valid JSON
assert_matches_regex(result, pattern)          Output matches regex pattern
assert_cost_under(result, max_dollars)         API cost below threshold
assert_tokens_under(result, max_tokens)        Token usage below threshold
assert_latency_under(result, max_seconds)      Response time below threshold
assert_semantic_match(result, ref, threshold)  Semantically similar to reference
assert_no_hallucination(result, context)       Output grounded in provided context
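
To make the first few rows concrete, here is a plain-Python sketch of what the text-based checks amount to. It is not Vigil's implementation: the helper names are hypothetical, and the source does not say whether Vigil's versions are case-sensitive or whether the regex check searches or requires a full match. Both choices below are assumptions.

```python
import json
import re


def contains(output: str, text: str) -> bool:
    """True if `text` occurs anywhere in `output` (case-sensitive here)."""
    return text in output


def is_valid_json(output: str) -> bool:
    """True if `output` parses as a JSON document."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return False
    return True


def matches_regex(output: str, pattern: str) -> bool:
    """True if `pattern` matches somewhere in `output` (search, not full match)."""
    return re.search(pattern, output) is not None
```

The cost, token, latency, semantic, and hallucination checks need metadata or a model in addition to the output text, so they are not sketched here.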

Agent Types

from vigil import FunctionAgent, HTTPAgent, CLIAgent

# Test a Python function
agent = FunctionAgent(my_function)

# Test an HTTP endpoint
agent = HTTPAgent("http://localhost:8000/chat")

# Test a CLI tool
agent = CLIAgent("python my_agent.py")
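
Conceptually, testing a CLI tool means running the command, handing it the prompt, and capturing what it prints. The sketch below shows that idea with the standard library only. It assumes the prompt is sent on stdin, which the source does not state (a real tool might take it as an argument instead), and `run_cli` and `echo_agent` are hypothetical names, not part of Vigil.

```python
import subprocess
import sys


def run_cli(command: list[str], prompt: str, timeout: float = 30.0) -> str:
    """Run `command`, send `prompt` on stdin, and return stripped stdout."""
    completed = subprocess.run(
        command,
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.strip()


# Stand-in "agent": a one-line Python program that echoes stdin in upper case.
echo_agent = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
reply = run_cli(echo_agent, "hello")
```

Passing the command as a list avoids shell quoting issues; a string such as "python my_agent.py" would need to be split first.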

Snapshot Testing

from vigil import test, FunctionAgent
from vigil.snapshots import snapshot

agent = FunctionAgent(my_agent)

@test()
def test_output_stable():
    result = agent.run("Summarize this document")
    snapshot(result, name="summary_output")

Update snapshots when outputs intentionally change:

vigil snapshot update
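
The underlying idea is a golden-file comparison: record the output once, then compare later runs against the stored copy. The sketch below illustrates that workflow under stated assumptions. `check_snapshot`, the `.txt` storage format, and the exact string comparison are all hypothetical; the source does not describe how Vigil stores or compares snapshots.

```python
import tempfile
from pathlib import Path


def check_snapshot(output: str, name: str, directory: Path, update: bool = False) -> bool:
    """Compare `output` with the stored golden copy; store it if missing or if `update`."""
    path = directory / f"{name}.txt"
    if update or not path.exists():
        path.write_text(output, encoding="utf-8")
        return True
    return path.read_text(encoding="utf-8") == output


with tempfile.TemporaryDirectory() as tmp:
    snap_dir = Path(tmp)
    # First run records the golden output.
    first = check_snapshot("A short summary.", "summary_output", snap_dir)
    # Identical output on a later run matches the stored copy.
    same = check_snapshot("A short summary.", "summary_output", snap_dir)
    # Changed output is reported as a mismatch.
    drifted = check_snapshot("A different summary.", "summary_output", snap_dir)
    # An explicit update overwrites the golden copy.
    updated = check_snapshot("A different summary.", "summary_output", snap_dir, update=True)
```

Exact comparison only works when the agent's output is deterministic, so for sampled LLM output a looser comparison may be needed.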

CI Integration

# .github/workflows/ai-tests.yml
name: AI Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install vigil-ai
      - run: vigil run --report json > results.json

Configuration

Create a vigil.yaml in your project root:

defaults:
  cost_threshold: 0.05
  latency_threshold: 5.0
  semantic_threshold: 0.85

reporting:
  format: terminal
  verbose: true
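
One plausible reading of the `defaults` block, consistent with the assertion names above, is that cost and latency are upper bounds (in dollars and seconds) while the semantic threshold is a lower bound on a similarity score. The sketch below encodes that reading; the `Thresholds` class and `violations` helper are hypothetical, and whether Vigil treats the boundary values as passing is not stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Mirrors the `defaults` keys from the example vigil.yaml."""

    cost_threshold: float = 0.05      # assumed: maximum cost in dollars
    latency_threshold: float = 5.0    # assumed: maximum latency in seconds
    semantic_threshold: float = 0.85  # assumed: minimum similarity score


def violations(cost: float, latency: float, similarity: float, limits: Thresholds) -> list[str]:
    """Return the names of the thresholds that a single run violates."""
    failed = []
    if cost > limits.cost_threshold:
        failed.append("cost_threshold")
    if latency > limits.latency_threshold:
        failed.append("latency_threshold")
    if similarity < limits.semantic_threshold:
        failed.append("semantic_threshold")
    return failed


limits = Thresholds()
ok = violations(cost=0.01, latency=1.2, similarity=0.91, limits=limits)
bad = violations(cost=0.08, latency=1.2, similarity=0.60, limits=limits)
```

Here `ok` is empty, while `bad` lists the cost and semantic thresholds.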

License

Apache 2.0
