Deterministic record-and-replay debugger for AI agent runs
agentrr
Deterministic record-and-replay debugger for AI agent runs.
When an AI agent does something wrong in production, you usually can't reproduce it. Run it again and it takes a different path — the model samples differently, the tool returns different data, and the bug you saw is gone.
agentrr records every nondeterministic boundary an agent crosses — every LLM call, tool call, clock read, and random draw — then replays the run deterministically and offline. The agent's real logic runs again; every external answer is served from the recording. No API calls, no side effects, no cost. You step through the exact failing run as many times as you need.
It's rr / time-travel debugging, for AI agents.
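The record-then-replay idea described above can be sketched in a few lines. This is a toy illustration, not agentrr's actual API: the `Boundary` class, its `call` method, and the event fields are hypothetical names chosen for the example. The point it shows is that the agent's own logic runs in both modes, while clock reads and random draws are only taken from the real world during recording.

```python
import json
import random
import time
from typing import Any, Callable, Optional


class Boundary:
    """Hypothetical recorder/replayer; not agentrr's actual API."""

    def __init__(self, mode: str, events: Optional[list] = None) -> None:
        self.mode = mode                      # "record" or "replay"
        self.events = events if events is not None else []
        self.cursor = 0                       # next event to serve on replay

    def call(self, kind: str, fn: Callable[[], Any]) -> Any:
        if self.mode == "record":
            value = fn()                      # the only place the real world is touched
            self.events.append({"seq": len(self.events), "kind": kind, "value": value})
            return value
        event = self.events[self.cursor]
        if event["kind"] != kind:
            raise RuntimeError(
                f"divergence at seq {self.cursor}: expected {event['kind']}, got {kind}"
            )
        self.cursor += 1
        return event["value"]                 # fn is never invoked on replay


def agent(b: Boundary) -> str:
    """Toy agent logic; the same code runs when recording and when replaying."""
    started = b.call("clock", time.time)
    roll = b.call("rng", lambda: random.randint(1, 6))
    return f"t={started:.3f} roll={roll}"


recorder = Boundary("record")
first = agent(recorder)

# Round-trip through JSON to mimic loading a log from disk.
log = json.loads(json.dumps(recorder.events))

replayer = Boundary("replay", log)
second = agent(replayer)
```

Here `second` reproduces `first` even though the clock has moved on and the random generator would normally produce a different draw.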
The 60-second demo
From a clone of this repo:
uv sync --group dev
export PYTHONPATH=examples
Record a run:
uv run python -m agents.deterministic_support
# run_id: deterministic_support-<id>
# log: .agentrr/runs/deterministic_support-<id>.jsonl
Replay it — deterministically, with no network and no live LLM calls (the demo agent uses a mock client; replay never calls it):
uv run agentrr replay deterministic_support-<id> agents.deterministic_support:main
Now edit the agent's prompt and replay again. agentrr halts at the exact boundary where behavior first diverges, with a signature mismatch and a structural diff in the divergence report:
DivergenceError: divergence at seq 5: signature mismatch
Strict mode halts on the first mismatch; use mode="observe" to continue and collect every divergence. The report includes structural diff previews (expected_preview / observed_preview), not the error string alone.
That's the core loop: turn a one-time, irreproducible failure into a fixed artifact you can re-enter and dissect.
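The difference between strict and observe modes can be illustrated with a small sketch. It assumes each boundary has already been reduced to a signature string. The `compare` function and `Divergence` dataclass are hypothetical, and the `DivergenceError` class here is a local stand-in for the error shown above, so agentrr's real types and report fields may differ.

```python
from dataclasses import dataclass
from typing import List


class DivergenceError(Exception):
    """Local stand-in for the error shown above; the real class may differ."""


@dataclass
class Divergence:
    seq: int
    expected_preview: str
    observed_preview: str


def compare(recorded: List[str], observed: List[str], mode: str = "strict") -> List[Divergence]:
    """Walk both signature sequences position by position."""
    found: List[Divergence] = []
    for seq, (expected, actual) in enumerate(zip(recorded, observed)):
        if expected == actual:
            continue
        if mode == "strict":
            # Strict mode stops at the first mismatch.
            raise DivergenceError(f"divergence at seq {seq}: signature mismatch")
        # Observe mode keeps going and collects every mismatch.
        found.append(Divergence(seq, expected[:40], actual[:40]))
    return found


recorded_sigs = ["sig-a", "sig-b", "sig-c"]
observed_sigs = ["sig-a", "sig-X", "sig-Y"]

report = compare(recorded_sigs, observed_sigs, mode="observe")

try:
    compare(recorded_sigs, observed_sigs, mode="strict")
    strict_message = ""
except DivergenceError as exc:
    strict_message = str(exc)
```

In this sketch observe mode yields one report entry per mismatched position, while strict mode raises as soon as the first one is seen.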
What it guarantees
- Faithful replay for every captured boundary — the replayed boundary sequence exactly matches the recording (verified in CI).
- Offline and safe — replay makes zero live LLM calls and never re-executes tools. Replaying an agent that issued a refund does not issue another.
- Crash-safe recording — an event is durably on disk (fsync) before the agent acts on it. Verified with real SIGKILL in CI. A killed run produces a truncated log, never a holed one.
- Honest divergence — when replay can't reproduce faithfully, it halts at the exact point and tells you, with a diff. It never silently guesses or serves a mismatched response.
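The crash-safety guarantee in the list above relies on a write-before-return discipline. A minimal sketch of that pattern, assuming a JSONL log with one event per line, might look like the following. The `append_event` and `load_events` helpers are hypothetical and are not taken from agentrr's source. The sketch also does not fsync the containing directory, which a production implementation may need when the log file is newly created.

```python
import json
import os
import tempfile
from typing import List


def append_event(path: str, event: dict) -> None:
    """Append one event and return only after it has been flushed and fsynced."""
    line = json.dumps(event, sort_keys=True) + "\n"
    with open(path, "a", encoding="utf-8") as f:
        f.write(line)
        f.flush()                # push Python's buffer to the OS
        os.fsync(f.fileno())     # ask the OS to push it to stable storage


def load_events(path: str) -> List[dict]:
    """Read complete lines only; a torn final line means the run was killed."""
    events: List[dict] = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.endswith("\n"):
                break            # partial write at the tail: truncate here
            events.append(json.loads(line))
    return events


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "run.jsonl")
    append_event(log_path, {"seq": 0, "kind": "llm_call"})
    append_event(log_path, {"seq": 1, "kind": "tool_call"})
    # Simulate a process killed mid-write: a partial line with no newline.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write('{"seq": 2, "ki')
    recovered = load_events(log_path)
```

Because events are only ever appended, a crash can damage the tail of the file but cannot leave a gap in the middle, which is the "truncated, never holed" property.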
What it does NOT do (by design)
v0.1 supports single-process, synchronous agents only. There is no marketplace, no backend, and no GUI. Concurrency, streaming-chunk replay, and multi-agent pipelines are out of scope for v0.1. See docs/contract.md for the full contract and exclusions.
How it works
agentrr intercepts at the boundaries where an agent touches nondeterminism, and freezes them on replay:
| Layer | Recorded | Served on replay |
|---|---|---|
| LLM calls (OpenAI, Anthropic) | full request + response + metadata | recorded response |
| Tool calls | name, args, return/error | recorded result (tool never runs) |
| Clock / RNG / IDs | every read and draw | recorded values, in order |
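To make the tool-call row of the table concrete, here is a hedged sketch of a wrapper that records a tool's name, arguments, and return value or error, then serves the recorded outcome on replay without running the tool. `ToolTape` and its `wrap` method are hypothetical names for illustration and do not describe how agentrr registers tools.

```python
from typing import Any, Callable, Optional


class ToolTape:
    """Hypothetical tool recorder/replayer; not agentrr's actual API."""

    def __init__(self, mode: str, events: Optional[list] = None) -> None:
        self.mode = mode                      # "record" or "replay"
        self.events = events if events is not None else []
        self.cursor = 0

    def wrap(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(**kwargs: Any) -> Any:
            if self.mode == "replay":
                event = self.events[self.cursor]
                if event["name"] != fn.__name__ or event["args"] != kwargs:
                    raise RuntimeError(f"divergence at seq {self.cursor}: tool call mismatch")
                self.cursor += 1
                if event["error"] is not None:
                    raise RuntimeError(event["error"])   # replay the recorded failure
                return event["result"]                   # fn is never called
            entry = {"name": fn.__name__, "args": kwargs, "result": None, "error": None}
            try:
                entry["result"] = fn(**kwargs)
                return entry["result"]
            except Exception as exc:
                entry["error"] = str(exc)
                raise
            finally:
                self.events.append(entry)                # record success or failure
        return wrapper


refunds_issued = []


def issue_refund(order_id: str, amount: float) -> dict:
    refunds_issued.append(order_id)           # the side effect we must not repeat
    return {"order_id": order_id, "refunded": amount}


record_tape = ToolTape("record")
live_result = record_tape.wrap(issue_refund)(order_id="A-17", amount=25.0)

replay_tape = ToolTape("replay", record_tape.events)
replayed_result = replay_tape.wrap(issue_refund)(order_id="A-17", amount=25.0)
```

After both runs, `refunds_issued` still holds a single entry: the replayed call returned the recorded result without touching the real tool.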
Matching is sequence-primary and signature-validated, with no fuzzy search, ever. A request that doesn't match the next expected event is treated as a divergence, not something to paper over.
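One way to picture sequence-primary, signature-validated matching is a cursor over the recorded events plus a hash of each request in canonical form. The sketch below is an assumption about how such a scheme could work: the `signature` function, the `Matcher` class, and the use of SHA-256 over sorted-key JSON are illustrative choices, not agentrr's documented format.

```python
import hashlib
import json
from typing import Any, List


def signature(kind: str, request: dict) -> str:
    """Hash a canonical JSON form so that key order does not affect the result."""
    canonical = json.dumps(
        {"kind": kind, "request": request}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Matcher:
    """Serve recorded responses strictly in order, validating each signature."""

    def __init__(self, events: List[dict]) -> None:
        self.events = events
        self.cursor = 0

    def next_response(self, kind: str, request: dict) -> Any:
        if self.cursor >= len(self.events):
            raise LookupError(f"divergence at seq {self.cursor}: recording exhausted")
        expected = self.events[self.cursor]
        if expected["signature"] != signature(kind, request):
            # No searching ahead or behind: a mismatch at the cursor is final.
            raise ValueError(f"divergence at seq {self.cursor}: signature mismatch")
        self.cursor += 1
        return expected["response"]


original_request = {"model": "mock-model", "prompt": "Summarize the ticket."}
recording = [
    {"signature": signature("llm_call", original_request), "response": "Summary: ..."}
]

served = Matcher(recording).next_response("llm_call", dict(original_request))

edited_request = {"model": "mock-model", "prompt": "Summarize the ticket briefly."}
try:
    Matcher(recording).next_response("llm_call", edited_request)
    mismatch_message = ""
except ValueError as exc:
    mismatch_message = str(exc)
```

The lookup never scans for a "close enough" event. Either the request at the cursor hashes to the recorded signature, or replay stops.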
Install
From source (PyPI publish pending):
git clone https://github.com/<OWNER>/agentrr.git
cd agentrr
uv sync --group dev
export PYTHONPATH=examples # for example agents
Development
uv sync --group dev
make test # full suite incl. credibility gates
make durability # SIGKILL write-before-return gate
gitleaks detect # secret scan before you push
Reference agents
| Agent | Purpose |
|---|---|
| examples/agents/deterministic_support.py | Golden path (mock LLM, registered tools, shims) |
| examples/agents/unstable_loop.py | Unwrapped random — diverges on replay (by design) |
| examples/agents/tool_caller.py | LLM → tool → LLM loop |
| examples/agents/broken_replay_cases.py | Negative scenarios (edited prompt, missing tool, truncated/corrupt log) |
License
Apache-2.0 — see LICENSE.
File details
Details for the file agentrr-0.1.0a1.tar.gz.
File metadata
- Download URL: agentrr-0.1.0a1.tar.gz
- Upload date:
- Size: 20.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 59fd803254eb8cba8a66410777d6646f84a6a84ae182bc7e978931f92c3108a8 |
| MD5 | b5a2a513fcedcbd46699132b4306925a |
| BLAKE2b-256 | 6ed55a9db45934968ebe224dbca8b3b5f14cc66c28470121a83b80199ebf4ad4 |
Provenance
The following attestation bundles were made for agentrr-0.1.0a1.tar.gz:
Publisher: release.yml on ip174/agentrr
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: agentrr-0.1.0a1.tar.gz
  - Subject digest: 59fd803254eb8cba8a66410777d6646f84a6a84ae182bc7e978931f92c3108a8
- Sigstore transparency entry: 1661749435
- Sigstore integration time:
- Permalink: ip174/agentrr@f65abc1dbbe73c65274951d2adf53c7f48850ba5
- Branch / Tag: refs/tags/v0.1.0a1
- Owner: https://github.com/ip174
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f65abc1dbbe73c65274951d2adf53c7f48850ba5
- Trigger Event: push
File details
Details for the file agentrr-0.1.0a1-py3-none-any.whl.
File metadata
- Download URL: agentrr-0.1.0a1-py3-none-any.whl
- Upload date:
- Size: 37.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7038cad9c6025f3f11e902f13b96bb737766182966dda21a1baca1cf06fca289 |
| MD5 | f53e021a5f89efdf0045adb6df67c8ba |
| BLAKE2b-256 | 3c2e378b0278ebf647c6358ca574e2ef4de7543366fc571e9305dad82f28138b |
Provenance
The following attestation bundles were made for agentrr-0.1.0a1-py3-none-any.whl:
Publisher: release.yml on ip174/agentrr
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: agentrr-0.1.0a1-py3-none-any.whl
  - Subject digest: 7038cad9c6025f3f11e902f13b96bb737766182966dda21a1baca1cf06fca289
- Sigstore transparency entry: 1661749623
- Sigstore integration time:
- Permalink: ip174/agentrr@f65abc1dbbe73c65274951d2adf53c7f48850ba5
- Branch / Tag: refs/tags/v0.1.0a1
- Owner: https://github.com/ip174
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f65abc1dbbe73c65274951d2adf53c7f48850ba5
- Trigger Event: push