Pytest-style behavioral regression testing for AI agents.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

ashutosh_023

These details have not been verified by PyPI

Project description

AgentCheck

AgentCheck is pytest for AI agents. Test behavior, not exact text.

Install from PyPI:

pip install pygent-test

Install from source:

python -m pip install -e .

Optional framework extras:

pip install "pygent-test[langgraph]"
pip install "pygent-test[openai]"

What It Does

AgentCheck helps you verify agent behavior such as:

which tools were used
whether tools were used in the expected order
whether the agent stayed within a step budget
whether the agent claimed success without tool evidence
whether behavior regressed against a saved baseline

Current Status

This repo already supports:

repeated-run behavioral tests with @agent_test(...)
local baseline and regression comparison
CLI commands: test, bless, compare, report
pytest integration
a plain Python adapter
an OpenAI Agents SDK adapter
a LangGraph adapter
real live OpenAI agent tests in integration_examples/

Quick Start

python -m pip install -e .
python -m agentcheck.cli test examples

Minimal Example

from agentcheck import agent_test, expect
from examples.booking_agent import SimpleBookingAgent


@agent_test(runs=5, agent_factory=SimpleBookingAgent)
def test_booking_agent(agent: SimpleBookingAgent):
    result = agent.run("Book a table for 2 tonight")

    check = expect(result, collect=True)
    check.used_tool("restaurant_search")
    check.used_tool("booking_tool")
    check.steps_less_than(5)
    check.did_not_claim_confirmation_without_tool("booking_tool")
    check.verify()
    return result

Real Agent Testing

AgentCheck has been exercised against:

real OpenAI Agents SDK agents
real local LangGraph graphs built with StateGraph

Use the included repo live suite:

python -m agentcheck.cli test integration_examples

or:

python -m pytest integration_examples -q

The included live tests cover:

a single-tool weather assistant
a multi-tool research assistant

LangGraph support is tested through the regular unit suite and normalizes the common invoke({"messages": [...]}) flow into AgentResult.

Run the local LangGraph example with:

python -m agentcheck.cli test framework_examples

Documentation

Use these docs depending on what you need:

TECHNICAL_GUIDE.md Detailed developer guide covering architecture, assertions, adapters, and workflows
ADAPTER_GUIDE.md How adapters are structured and how to build a new one
REAL_WORLD_TESTING.md Real OpenAI Agents SDK testing setup and examples
ROADMAP.md What is done, what is next, and what is planned later

Included Demos

Passing local demo:

python -m agentcheck.cli test examples

Intentional failure demo:

python -m agentcheck.cli test regression_examples --fail-on-regression

Commands

python -m agentcheck.cli test <path>
python -m agentcheck.cli bless <path>
python -m agentcheck.cli compare
python -m agentcheck.cli report

Smoke Test

If you are working from a source checkout, run a quick end-to-end validation with:

python scripts/smoke_test.py

To include the live OpenAI integration tests:

python scripts/smoke_test.py --with-live

Every agentcheck test run also writes:

JSON report: .agentcheck/reports/latest.json
Markdown report: .agentcheck/reports/latest.md

Every agentcheck bless <path> stores a suite-specific baseline under .agentcheck/baselines/.

Baselines are guarded against unrelated suites. If the current suite and saved baseline suite do not match exactly, AgentCheck warns instead of comparing them. For older baseline files without suite metadata, it falls back to matching test names.

Pytest

AgentCheck tests can also run through pytest:

python -m pytest examples -q
python -m pytest tests -q
python -m pytest integration_examples -q

Decorated @agent_test(...) functions are collected as AgentCheck test items, and each item still runs its configured repeated-run behavior.

Assertions

Current built-in assertions:

used_tool(...)
used_tool_times(...)
used_tool_at_least(...)
used_tool_at_most(...)
did_not_use_tool(...)
used_tools_in_order([...])
steps_less_than(...)
finished_successfully()
did_not_error()
final_output_contains(...)
final_output_does_not_contain(...)
did_not_claim_confirmation_without_tool(...)

Use fail-fast assertions:

expect(result).used_tool("restaurant_search")

Use collected assertions when you want one run to report multiple failures:

check = expect(result, collect=True)
check.used_tool("restaurant_search")
check.used_tool("booking_tool")
check.did_not_claim_confirmation_without_tool("booking_tool")
check.verify()

Roadmap

This is the first step.

Near-term priorities:

cleaner regression summaries
better onboarding for testing a real agent in under 5 minutes
more adapters based on actual user demand

Longer-term directions:

stronger regression analysis
better flakiness reporting
richer CI workflows
optional hosted features only if the core library proves valuable

For a more detailed breakdown, see ROADMAP.md.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

ashutosh_023

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.3.1

Jun 7, 2026

0.2.2

May 26, 2026

0.1.3

Apr 29, 2026

This version

0.1.2

Apr 28, 2026

0.1.1

Apr 28, 2026

0.1.0

Apr 28, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pygent_test-0.1.2.tar.gz (21.4 kB view details)

Uploaded Apr 28, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

pygent_test-0.1.2-py3-none-any.whl (22.4 kB view details)

Uploaded Apr 28, 2026 Python 3

File details

Details for the file pygent_test-0.1.2.tar.gz.

File metadata

Download URL: pygent_test-0.1.2.tar.gz
Upload date: Apr 28, 2026
Size: 21.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pygent_test-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`4a25716f040439aad604277fd3d3c9d3067efd3d482130d12d3efe5cff0f79fa`
MD5	`fb63c9081f88beadc11a572078222d1e`
BLAKE2b-256	`f8d274735fda6cc1cc878c77ad33add5b62adff5d3d40314db8897abe5546c19`

See more details on using hashes here.

Provenance

The following attestation bundles were made for pygent_test-0.1.2.tar.gz:

Publisher: publish-pypi.yml on ashutosh-rath02/pygent-test

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pygent_test-0.1.2.tar.gz
- Subject digest: 4a25716f040439aad604277fd3d3c9d3067efd3d482130d12d3efe5cff0f79fa
- Sigstore transparency entry: 1397983237
- Sigstore integration time: Apr 28, 2026
Source repository:
- Permalink: ashutosh-rath02/pygent-test@cc64d118c73f1615ee64d1021e0584f6b398a686
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ashutosh-rath02
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@cc64d118c73f1615ee64d1021e0584f6b398a686
- Trigger Event: workflow_dispatch

File details

Details for the file pygent_test-0.1.2-py3-none-any.whl.

File metadata

Download URL: pygent_test-0.1.2-py3-none-any.whl
Upload date: Apr 28, 2026
Size: 22.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pygent_test-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`41641c1038041728747b8d2a5cd0240b46c450886b928e633ed7dd628a829305`
MD5	`dfc88add910a984cb98fb09421e8d505`
BLAKE2b-256	`08f181b42c573f526416bbb7abe0350778b4010a05c527ddd2a921ae86155c57`

See more details on using hashes here.

Provenance

The following attestation bundles were made for pygent_test-0.1.2-py3-none-any.whl:

Publisher: publish-pypi.yml on ashutosh-rath02/pygent-test

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pygent_test-0.1.2-py3-none-any.whl
- Subject digest: 41641c1038041728747b8d2a5cd0240b46c450886b928e633ed7dd628a829305
- Sigstore transparency entry: 1397983251
- Sigstore integration time: Apr 28, 2026
Source repository:
- Permalink: ashutosh-rath02/pygent-test@cc64d118c73f1615ee64d1021e0584f6b398a686
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ashutosh-rath02
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@cc64d118c73f1615ee64d1021e0584f6b398a686
- Trigger Event: workflow_dispatch

pygent-test 0.1.2

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

AgentCheck

What It Does

Current Status

Quick Start

Minimal Example

Real Agent Testing

Documentation

Included Demos

Commands

Smoke Test

Pytest

Assertions

Roadmap

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance