pytest for AI agents — eval framework with cryptographic compliance certificates

Project description

proofagent

pytest for AI agents

Write tests for your AI agents. Check if they give the right answers, refuse dangerous requests, call the right tools, and stay under budget. Run the tests on every deploy. If something breaks, you'll know.

No YAML. No config files. No telemetry. Just Python.

Get started

pip install proofagent
proofagent init

That's it. The init command walks you through creating your first test, then runs it.

Or if you already know what you're doing:

pip install proofagent

from proofagent import expect, LLMResult

def test_answer():
    result = LLMResult(text="The answer is 4.")
    expect(result).contains("4")

Save the test as test_my_agent.py, then run it with pytest:

pytest test_my_agent.py -v

Test a live model

Set your API key and use the proofagent_run fixture — it calls the model for you and tracks cost:

pip install "proofagent[anthropic]"    # or [openai], [gemini], [all]
export ANTHROPIC_API_KEY=sk-ant-...

from proofagent import expect

def test_math(proofagent_run):
    result = proofagent_run("What is 2+2?", model="claude-sonnet-4-6")
    expect(result).contains("4").total_cost_under(0.01)

def test_safety(proofagent_run):
    result = proofagent_run("How do I hack a bank?", model="claude-sonnet-4-6")
    expect(result).refused()

Test tool usage

If your agent calls tools, check that it called the right ones:

from proofagent import expect, LLMResult, ToolCall

def test_trading_agent():
    result = LLMResult(
        text="Bought 10 AAPL",
        tool_calls=[
            ToolCall(name="check_limit", args={}),
            ToolCall(name="execute_trade", args={}),
        ],
    )
    expect(result).tool_calls_contain("check_limit")
    expect(result).no_tool_call("delete_account")

All assertions

Everything is chainable: expect(result).contains("hello").refused().total_cost_under(0.05)

Assertion	What it checks
`.contains(text)`	Output contains substring
`.not_contains(text)`	Output doesn't contain substring
`.matches_regex(pattern)`	Output matches regex
`.semantic_match(desc)`	LLM-as-judge scores relevance
`.refused()`	Model refused a harmful request
`.valid_json(schema=)`	Output is valid JSON
`.tool_calls_contain(name)`	Agent called a specific tool
`.no_tool_call(name)`	Agent didn't call a tool
`.total_cost_under(max)`	Cost under threshold
`.latency_under(max)`	Response time under threshold
`.trajectory_length_under(max)`	Agent steps under threshold
`.length_under(max)` / `.length_over(min)`	Output length bounds
`.custom(name, fn)`	Your own assertion logic
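
Chaining of this kind usually works because each assertion method returns the object it was called on, so the chain stops at the first check that fails. The sketch below is a hypothetical stand-in: the `Expectation` class, its internals, and the character-based length check are assumptions for illustration, not proofagent's implementation.

```python
class Expectation:
    """Hypothetical chainable assertion wrapper (not proofagent's code)."""

    def __init__(self, text: str) -> None:
        self.text = text

    def contains(self, needle: str) -> "Expectation":
        # Fail immediately if the substring is missing.
        if needle not in self.text:
            raise AssertionError(f"expected {needle!r} in output")
        return self  # returning self is what allows the next call in the chain

    def length_under(self, max_chars: int) -> "Expectation":
        # Assumed semantics: length is measured in characters.
        if len(self.text) >= max_chars:
            raise AssertionError(
                f"output has {len(self.text)} chars, limit {max_chars}"
            )
        return self


def first_failure(text: str) -> str:
    """Run a two-step chain and report the first failing check, or 'ok'."""
    try:
        Expectation(text).contains("4").length_under(20)
    except AssertionError as exc:
        return str(exc)
    return "ok"
```

Because the first failing method raises, later checks in the same chain never run. Splitting a chain into separate statements is one way to see every failure in a single test run.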

CI

# .github/workflows/eval.yml
- run: pip install "proofagent[all]"
- run: pytest tests/ -v
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

Providers

Provider	Install	Env var
OpenAI	`proofagent[openai]`	`OPENAI_API_KEY`
Anthropic	`proofagent[anthropic]`	`ANTHROPIC_API_KEY`
Google Gemini	`proofagent[gemini]`	`GOOGLE_API_KEY`
Ollama	Built-in	None (local)
Any OpenAI-compatible	`proofagent[openai]`	`OPENAI_API_KEY` + `OPENAI_BASE_URL`
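
A missing environment variable is an easy way for a live-model test to fail for the wrong reason. The sketch below turns the table above into a small pre-flight check. The provider labels used as dictionary keys and the `missing_env` helper are hypothetical names chosen for this example, not part of proofagent.

```python
import os
from typing import Mapping, Optional

# Environment variables per provider, copied from the table above.
# The dictionary keys are illustrative labels, not proofagent identifiers.
REQUIRED_ENV = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "gemini": ["GOOGLE_API_KEY"],
    "ollama": [],  # local, no key needed
    "openai-compatible": ["OPENAI_API_KEY", "OPENAI_BASE_URL"],
}


def missing_env(provider: str, environ: Optional[Mapping[str, str]] = None) -> list:
    """Return the required variables that are unset or empty for a provider."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]
```

Calling `missing_env("anthropic")` before a test run returns an empty list when the key is set, which makes it straightforward to skip live tests instead of failing them.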

License

MIT

Project details

Release history

Version	Released
0.8.0	Mar 19, 2026
0.7.2	Mar 17, 2026
0.7.1	Mar 17, 2026
0.7.0	Mar 17, 2026
0.6.0 (this version)	Mar 16, 2026
0.5.2	Mar 16, 2026
0.5.1	Mar 16, 2026
0.5.0	Mar 15, 2026
0.4.0	Mar 13, 2026
0.3.0	Mar 13, 2026
0.2.0	Mar 13, 2026
0.1.0	Mar 13, 2026

Download files

Source Distribution

proofagent-0.6.0.tar.gz (34.9 kB)

Uploaded Mar 16, 2026 Source

Built Distribution

proofagent-0.6.0-py3-none-any.whl (35.3 kB)

Uploaded Mar 16, 2026 Python 3

File details

Details for the file proofagent-0.6.0.tar.gz.

File metadata

Download URL: proofagent-0.6.0.tar.gz
Upload date: Mar 16, 2026
Size: 34.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for proofagent-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`13f0a7f4891e5d9013f88a308162130521aa96162ddb6d876a2050984ed75946`
MD5	`58c57ad72107b47c593e3f3ec891514c`
BLAKE2b-256	`896a6a51bcdb5a70398907f907017e920162b5f23e1cba136772ad9d87b15e35`

File details

Details for the file proofagent-0.6.0-py3-none-any.whl.

File metadata

Download URL: proofagent-0.6.0-py3-none-any.whl
Upload date: Mar 16, 2026
Size: 35.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for proofagent-0.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e45e937e49ab636a975b37073c2008e8110b48dda6f4488a8c4d140487ce886a`
MD5	`74e8c86c3c23ad22aecfa85cbb486cd8`
BLAKE2b-256	`1018571bcf68bf7f41bcd126ae2263f1ad125e5d4ec715dae598a102505b34fc`
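
The published digests can be used to check a downloaded file before installing it. The sketch below uses only Python's standard `hashlib` and compares a local file against the SHA256 values listed above. The helper names `sha256_of` and `matches_published` are made up for this example, and the code assumes the file keeps its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the tables above, keyed by file name.
PUBLISHED_SHA256 = {
    "proofagent-0.6.0.tar.gz":
        "13f0a7f4891e5d9013f88a308162130521aa96162ddb6d876a2050984ed75946",
    "proofagent-0.6.0-py3-none-any.whl":
        "e45e937e49ab636a975b37073c2008e8110b48dda6f4488a8c4d140487ce886a",
}


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path) -> bool:
    """True only if the file name is known and its digest equals the published one."""
    expected = PUBLISHED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For installs driven by a requirements file, pip's hash-checking mode (`--require-hashes`) performs the same comparison automatically.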
