agentrr

Deterministic record-and-replay debugger for AI agent runs.

When an AI agent does something wrong in production, you usually can't reproduce it. Run it again and it takes a different path — the model samples differently, the tool returns different data, and the bug you saw is gone.

agentrr records every nondeterministic boundary an agent crosses — every LLM call, tool call, clock read, and random draw — then replays the run deterministically and offline. The agent's real logic runs again; every external answer is served from the recording. No API calls, no side effects, no cost. You step through the exact failing run as many times as you need.
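The record-then-replay idea can be sketched in a few lines. This is an illustration only, not agentrr's API: the `Boundary` class, its `call` method, and the event fields are hypothetical names chosen for the example.

```python
import random
import time


class Boundary:
    """Hypothetical recorder that wraps nondeterministic calls (not agentrr's API)."""

    def __init__(self, mode, events=None):
        self.mode = mode                  # "record" or "replay"
        self.events = events if events is not None else []
        self.cursor = 0                   # index of the next event to serve on replay

    def call(self, kind, fn, *args):
        if self.mode == "record":
            result = fn(*args)            # real call; may have side effects
            self.events.append({"kind": kind, "args": list(args), "result": result})
            return result
        event = self.events[self.cursor]  # replay: fn is never invoked
        self.cursor += 1
        return event["result"]


def agent(boundary, refund_tool):
    """Toy agent: its logic runs for real, every external answer goes through the boundary."""
    started = boundary.call("clock", time.time)
    roll = boundary.call("rng", random.random)
    receipt = boundary.call("tool", refund_tool, "order-17")
    return started, roll, receipt


refunds = []


def issue_refund(order_id):
    refunds.append(order_id)              # the side effect that must not repeat
    return f"refunded:{order_id}"


recorder = Boundary("record")
first = agent(recorder, issue_refund)

replayer = Boundary("replay", recorder.events)
second = agent(replayer, issue_refund)

print(first == second)   # True: same clock value, same draw, same receipt
print(refunds)           # ['order-17']: the refund ran once, during recording only
```

The agent function is identical in both runs; only the boundary's mode changes, which is what lets the same code path be stepped through repeatedly without touching the outside world.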

It's rr / time-travel debugging, for AI agents.

Install (PyPI)

Alpha releases on PyPI:

pip install agentrr
agentrr version

Optional local web UI:

pip install agentrr-ui
agentrr-ui   # http://127.0.0.1:8765 — see docs/ui.md

Quick start (from source)

git clone https://github.com/ip174/agentrr.git
cd agentrr
uv sync --group dev
export PYTHONPATH=examples

1. Record a run

Use python -m … so the log stores a stable entrypoint for replay:

uv run python -m agents.deterministic_support
# run_id: deterministic_support-<id>
# log: .agentrr/runs/deterministic_support-<id>.jsonl

2. Replay in the CLI

uv run agentrr replay deterministic_support-<id>

The entrypoint is read from the log header (0.1.0a2 and later); override it only when needed.

Edit the agent and replay again — strict mode stops at the first divergence:

DivergenceError: divergence at seq 5: signature mismatch

3. Inspect in the web UI (optional)

# dev checkout: install UI + built frontend
cd packages/agentrr-ui/frontend && npm ci && npm run build
cd ../../..
uv pip install -e . -e packages/agentrr-ui

export PYTHONPATH=examples
export AGENTRR_LOG_DIR=.agentrr/runs   # optional; this is the default
agentrr-ui

Open http://127.0.0.1:8765 and pick a session. Read "What happened", then use "Check replay" and "Next" to step through the AI and tool steps. "Replay matched" means today's run followed the same path as the recording.

See docs/ui.md for security, nginx, and troubleshooting.

What it guarantees

  • Faithful replay for every captured boundary — the replayed boundary sequence exactly matches the recording (verified in CI).
  • Offline and safe — replay makes zero live LLM calls and never re-executes tools. Replaying an agent that issued a refund does not issue another.
  • Crash-safe recording — an event is durably on disk (fsync) before the agent acts on it. Verified with real SIGKILL in CI. A killed run produces a truncated log, never a holed one.
  • Honest divergence — when replay can't reproduce faithfully, it halts at the exact point and tells you, with a diff. It never silently guesses or serves a mismatched response.
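The crash-safety guarantee above rests on a write-before-return discipline. Here is a minimal sketch of that pattern, assuming a JSON Lines log with one event per line. The helper names and the `kind` field are hypothetical, and agentrr's real writer may differ.

```python
import json
import os
import tempfile


def append_durably(path, event):
    """Append one JSON line and fsync before returning (write-before-return)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
        f.flush()                 # Python buffer -> OS page cache
        os.fsync(f.fileno())      # OS page cache -> disk


def load_events(path):
    """Read complete events; stop at a torn final line left by a killed writer."""
    events = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.endswith("\n"):
                break             # partial last record: the log is truncated here
            events.append(json.loads(line))
    return events


log_path = os.path.join(tempfile.mkdtemp(), "run.jsonl")
append_durably(log_path, {"seq": 0, "kind": "llm"})
append_durably(log_path, {"seq": 1, "kind": "tool"})

# Simulate a SIGKILL mid-write: half a record with no trailing newline.
with open(log_path, "a", encoding="utf-8") as f:
    f.write('{"seq": 2, "ki')

recovered = load_events(log_path)
print([e["seq"] for e in recovered])   # [0, 1]
```

Because every event is appended and synced in order, a kill can only damage the tail of the file. Everything before the torn line is a complete prefix of the run, which is why the log is truncated rather than holed.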

What it does NOT do (by design)

agentrr v0.1 supports single-process, synchronous agents only. It has no marketplace, no backend, and no hosted service. Concurrency, streaming-chunk replay, and multi-agent pipelines are out of scope for v0.1. See docs/contract.md.

How it works

| Layer | Recorded | Served on replay |
| --- | --- | --- |
| LLM calls (OpenAI, Anthropic) | full request + response + metadata | recorded response |
| Tool calls | name, args, return/error | recorded result (tool never runs) |
| Clock / RNG / IDs | every read and draw | recorded values, in order |

Matching is sequence-primary and signature-validated, with no fuzzy search. A request that doesn't match the next expected event is a divergence.
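One way to picture this matching rule is a cursor over the recording plus a hash check on each request. The sketch below makes several assumptions: signatures are SHA-256 over canonical JSON, the `Replayer` class and event fields are hypothetical, and `DivergenceError` is a local stand-in that borrows its name from the CLI output shown earlier.

```python
import hashlib
import json


class DivergenceError(Exception):
    """Local stand-in; the name mirrors the CLI output, the class is not agentrr's."""


def signature(kind, payload):
    """Stable hash of a request: canonical JSON, then SHA-256 (an assumed scheme)."""
    canonical = json.dumps({"kind": kind, "payload": payload}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Replayer:
    def __init__(self, events):
        self.events = events
        self.seq = 0

    def serve(self, kind, payload):
        if self.seq >= len(self.events):
            raise DivergenceError(f"divergence at seq {self.seq}: recording exhausted")
        expected = self.events[self.seq]   # sequence-primary: only the next event is considered
        if expected["signature"] != signature(kind, payload):
            raise DivergenceError(f"divergence at seq {self.seq}: signature mismatch")
        self.seq += 1
        return expected["response"]


recording = [
    {"signature": signature("llm", {"prompt": "classify ticket"}), "response": "billing"},
    {"signature": signature("tool", {"name": "lookup", "args": [17]}), "response": {"paid": True}},
]

replayer = Replayer(recording)
print(replayer.serve("llm", {"prompt": "classify ticket"}))   # billing

try:
    # An edited agent now looks up a different order, so replay must stop here.
    replayer.serve("tool", {"name": "lookup", "args": [18]})
except DivergenceError as err:
    print(err)   # divergence at seq 1: signature mismatch
```

The replayer never searches the recording for a "close enough" event. A mismatch at the cursor raises immediately, so a mismatched response cannot be served by accident.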

Development

uv sync --group dev
export PYTHONPATH=examples
make test          # full suite (excludes durability subdir by default in Makefile)
make durability    # SIGKILL write-before-return gate
make ui-build      # compile React → agentrr_ui/static/
make lint
gitleaks detect    # before you push

Reference agents

| Agent | Purpose |
| --- | --- |
| examples/agents/deterministic_support.py | Golden path (mock LLM, registered tools, shims) |
| examples/agents/unstable_loop.py | Unwrapped random; diverges on replay (by design) |
| examples/agents/tool_caller.py | LLM → tool → LLM loop |
| examples/agents/broken_replay_cases.py | Negative scenarios |
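To see why an unwrapped random draw diverges on replay, consider a toy agent whose next tool request depends on one draw. This sketch is not taken from examples/agents/unstable_loop.py. The `shim` helper is hypothetical, and the fixed value 0.9 stands in for whatever a live `random.random()` would return at replay time.

```python
def agent(draw):
    """Toy agent: a single random draw decides which tool request comes next."""
    return "retry" if draw() < 0.5 else "escalate"


def shim(recorded_values):
    """Serve recorded draws in order instead of sampling fresh ones."""
    values = iter(recorded_values)
    return lambda: next(values)


recorded_draws = [0.25]                        # captured while recording

recorded_path = agent(shim(recorded_draws))    # path taken in the original run
wrapped_replay = agent(shim(recorded_draws))   # draw goes through the shim
unwrapped_replay = agent(lambda: 0.9)          # stand-in for a fresh, unrecorded draw

print(recorded_path, wrapped_replay, unwrapped_replay)   # retry retry escalate
```

With the draw wrapped, replay takes the recorded branch every time. Without it, the agent can reach a different tool request than the recording expects, which strict replay reports as a divergence.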

Docs

| Doc | Topic |
| --- | --- |
| docs/ui.md | Web UI install and run |
| docs/RELEASING.md | PyPI release checklist |
| docs/contract.md | Guarantees and exclusions |
| docs/replay-worker-protocol.md | UI worker IPC |
| CONTRIBUTING.md | Contributor workflow |

License

Apache-2.0 — see LICENSE.
