Snapshot tests for AI agents. Record an agent's tool-call trace, diff against a baseline, fail CI on regressions. Python port of @mukundakatta/agentsnap.

agentsnap-py

Snapshot tests for AI agents. Record an agent's tool-call trace, diff it against a baseline, fail CI on regressions. Zero runtime dependencies. Drops into pytest or any test runner.

Python port of @mukundakatta/agentsnap.

Install

pip install agentsnap-py

Usage

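from agentsnap import record, trace_tool, expect_snapshot

# fetch_results and llm_summarize stand in for your own tool implementations.
search = trace_tool("search", lambda q: fetch_results(q))
summarize = trace_tool("summarize", lambda docs: llm_summarize(docs))

def agent(question):
    # Both wrapped calls are captured when agent() runs inside record().
    docs = search(question)
    return summarize(docs)

def test_research_agent_stays_on_rails():
    trace = record(lambda: agent("What is RLHF?"))
    # Writes the baseline on the first run, diffs against it afterwards.
    expect_snapshot(trace, "tests/__snapshots__/research.snap.json")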

The first run writes the snapshot. Every run after that diffs the new trace against it. If the agent calls a different tool, calls tools in a different order, or starts erroring, the test fails with a readable diff. To regenerate the snapshot, rerun the tests with AGENTSNAP_UPDATE=1 set in the environment.
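The write-then-diff lifecycle can be illustrated without the library. The following is a minimal sketch, not agentsnap's implementation: `check_snapshot` and its return strings are hypothetical, and the only detail taken from this README is the AGENTSNAP_UPDATE=1 environment variable.

```python
import json
import os
import tempfile
from pathlib import Path


def check_snapshot(current: dict, path: str) -> str:
    """Hypothetical helper: returns "written", "updated", "match" or "mismatch"."""
    snap = Path(path)
    existed = snap.exists()
    force_update = os.environ.get("AGENTSNAP_UPDATE") == "1"
    if force_update or not existed:
        # No baseline yet, or the user asked for regeneration: (re)write it.
        snap.parent.mkdir(parents=True, exist_ok=True)
        snap.write_text(json.dumps(current, indent=2, sort_keys=True))
        return "updated" if existed else "written"
    # A baseline exists: compare the stored JSON with the current trace.
    baseline = json.loads(snap.read_text())
    return "match" if baseline == current else "mismatch"


# Usage: run twice against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "__snapshots__", "demo.snap.json")
    demo_trace = {"calls": [{"tool": "search", "args": ["What is RLHF?"]}]}
    print(check_snapshot(demo_trace, demo_path))
    print(check_snapshot(demo_trace, demo_path))
```

Keeping the update switch in an environment variable means the test code itself never changes when a baseline is refreshed; only the invocation does.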

Async agents

import asyncio
from agentsnap import arecord, trace_tool, expect_snapshot

# async_fetch stands in for your own async tool implementation.
asearch = trace_tool("search", async_fetch)

async def agent(q):
    return await asearch(q)

def test_async_agent():
    # Drive the async recording from a sync test with asyncio.run.
    trace = asyncio.run(arecord(lambda: agent("hello")))
    expect_snapshot(trace, "tests/__snapshots__/async.snap.json")

Diff statuses

Status	When	Default action
`PASSED`	Bytewise match	green
`OUTPUT_DRIFT`	Tools + args identical, only output text or external result hashes differ	warn (non-failing)
`TOOLS_REORDERED`	Same tool names, different order	fail
`TOOLS_CHANGED`	Different tool names called, or different args	fail
`REGRESSION`	New error in the trace, or a tool that used to work now throws	fail

Override which statuses fail, per snapshot, via expect_snapshot(trace, path, fail_on=[...]).
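The status table can be read as a decision procedure. The sketch below is not the library's diff engine. It assumes a hypothetical trace shape (a dict with a "calls" list whose entries carry "tool", "args", "output" and "error") and assumes that fail_on takes status names as strings. Only the status names and their default actions come from the table above.

```python
import json
from collections import Counter

# Defaults from the table: these statuses fail, OUTPUT_DRIFT only warns.
DEFAULT_FAIL_ON = frozenset({"TOOLS_REORDERED", "TOOLS_CHANGED", "REGRESSION"})


def _call_key(call):
    # Identity of a call: tool name plus canonicalised arguments.
    return (call["tool"], json.dumps(call.get("args"), sort_keys=True))


def classify(baseline, current):
    """Hypothetical classifier over the assumed trace shape."""
    base_calls, cur_calls = baseline["calls"], current["calls"]

    # A tool that errors now but did not error in the baseline is a regression.
    base_failing = {c["tool"] for c in base_calls if c.get("error")}
    cur_failing = {c["tool"] for c in cur_calls if c.get("error")}
    if cur_failing - base_failing:
        return "REGRESSION"

    base_names = [c["tool"] for c in base_calls]
    cur_names = [c["tool"] for c in cur_calls]
    if base_names != cur_names:
        # Same multiset of names means only the order moved.
        if Counter(base_names) == Counter(cur_names):
            return "TOOLS_REORDERED"
        return "TOOLS_CHANGED"

    if [_call_key(c) for c in base_calls] != [_call_key(c) for c in cur_calls]:
        return "TOOLS_CHANGED"  # same tools, different args

    return "PASSED" if baseline == current else "OUTPUT_DRIFT"


def should_fail(status, fail_on=None):
    # fail_on=None falls back to the defaults from the table.
    active = DEFAULT_FAIL_ON if fail_on is None else set(fail_on)
    return status in active
```

Under these assumptions, passing fail_on=["OUTPUT_DRIFT", "REGRESSION"] would make output drift fail while letting reordering through, which is the kind of per-snapshot override the sentence above describes.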

API

`record(fn, *, input=None, model=None, capture_results=False) -> Trace`

Run fn (sync) and capture every trace_tool-wrapped call inside it. Returns a JSON-serializable dict.

`arecord(fn, ...) -> Trace`

Async variant for async def agents. Use with asyncio.run(arecord(lambda: agent())) or inside an async test.

`trace_tool(name, fn) -> wrapped_fn`

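Wraps a tool. Inside record, calls go into the trace; outside, the wrapper is a transparent pass-through. Works with sync and async tools, and the wrapper keeps the shape of the original: a wrapped sync tool stays a plain callable, a wrapped async tool stays awaitable.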

`expect_snapshot(trace, path, *, update=False, fail_on=None) -> dict`

Compare the trace against an on-disk JSON baseline. Writes the baseline if it is missing, regenerates it if AGENTSNAP_UPDATE=1 is set (or update=True is passed), and otherwise diffs and raises AgentSnapshotMismatch on a failing status.

`diff(baseline, current) -> DiffResult`

Low-level diff engine. Returns a DiffResult(status=..., changes=[Change(...)]).

`format_diff(result, path=None) -> str`

Render a colored terminal block for the diff (used in the failure message).
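For readers unfamiliar with colored terminal output, the sketch below shows the general technique with ANSI escape codes. The layout, the `render_changes` name and the dict-shaped changes are assumptions made for illustration; the actual output of format_diff is not documented here and may look different.

```python
# ANSI escape sequences: red for the old value, green for the new one.
RED, GREEN, RESET = "\033[31m", "\033[32m", "\033[0m"


def render_changes(status, changes, path=None):
    """Hypothetical renderer: one header line, then three lines per change."""
    lines = [f"{status}: {path}" if path else status]
    for change in changes:
        lines.append(f"  {change['path']}")
        lines.append(f"{RED}  - {change['from']!r}{RESET}")
        lines.append(f"{GREEN}  + {change['to']!r}{RESET}")
    return "\n".join(lines)


print(render_changes(
    "TOOLS_CHANGED",
    [{"path": "calls[0].tool", "from": "search", "to": "lookup"}],
    path="tests/__snapshots__/research.snap.json",
))
```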

pytest plugin

Installing this package registers a pytest plugin that exposes the same API as fixtures:

def test_my_agent(agentsnap_record, trace_tool, expect_snapshot):
    fn = trace_tool("hello", lambda: "world")
    trace = agentsnap_record(lambda: fn())
    expect_snapshot(trace, "tests/__snapshots__/hello.snap.json")

API differences from the JS sibling

Tracing uses contextvars (Python's AsyncLocalStorage equivalent) instead of node:async_hooks.
Sync agents use record() and async agents use arecord(), because Python doesn't share JS's "async by default" assumption.
Trace is a dict (not a class), so it serializes and inspects naturally.
Change.to_dict() produces the JS-style {"path": ..., "from": ..., "to": ...}. The dataclass field is named from_ because from is a reserved keyword in Python.
Adds a pytest plugin (registered through a pytest11 entry point in pyproject.toml).
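The from_ / "from" mapping mentioned above can be sketched with a small dataclass. The class name `ChangeSketch` and the field types are assumptions; only the field names and the to_dict() key names follow the description in this README.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ChangeSketch:
    """Hypothetical re-creation of the Change record described above."""
    path: str
    from_: Any  # trailing underscore: `from` cannot be used as an identifier
    to: Any

    def to_dict(self) -> dict:
        # Dict keys are plain strings, so the JS-style "from" key is fine here.
        return {"path": self.path, "from": self.from_, "to": self.to}


change = ChangeSketch(path="calls[0].tool", from_="search", to="lookup")
print(change.to_dict())
```

Converting at the serialization boundary keeps the on-disk JSON compatible with the JS key names while the Python attribute stays a legal identifier.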

See the JS sibling's README for the full design notes.

This version: 0.1.0 (Apr 27, 2026)

Download files

Source Distribution

agentsnap_py-0.1.0.tar.gz (11.4 kB)

Uploaded Apr 27, 2026 Source

Built Distribution

agentsnap_py-0.1.0-py3-none-any.whl (12.1 kB)

Uploaded Apr 27, 2026 Python 3

File details

Details for the file agentsnap_py-0.1.0.tar.gz.

File metadata

Download URL: agentsnap_py-0.1.0.tar.gz
Upload date: Apr 27, 2026
Size: 11.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for agentsnap_py-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`b5addd94b47aa81e5f2d6b500d3b15eb429a083597b1f9d101902370d2d0f5f8`
MD5	`5bed36528a77200d02cdc5989bfd401c`
BLAKE2b-256	`1a1508f789727648accce377ef2da663871f33865235944f0e511edddac91599`

File details

Details for the file agentsnap_py-0.1.0-py3-none-any.whl.

File metadata

Download URL: agentsnap_py-0.1.0-py3-none-any.whl
Upload date: Apr 27, 2026
Size: 12.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for agentsnap_py-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ba3620c75240485f13b25288058a992f73f3edfea7d8b304b515db0724f35735`
MD5	`6051357484bb31b806ddcd7459fb2593`
BLAKE2b-256	`f57d18a4ac514f5293473a8e2267ca705c0ad8be94f7b83180ab2dcd13bd8918`
