Time-travel debugger for AI agents. Record any production run, replay any failure.

These details have not been verified by PyPI

Project links

Project description

Rewind — Time-Travel Debugger for AI Agents

Record any production run. Replay any failure. Find the exact step that broke.

$ rewind bisect run-good-7f3a run-bad-9b2c

  ✗ First divergence at step 4 (llm_call)
  ─────────────────────────────────────────────────────────────
  Model:    claude-sonnet-4-6 (both runs)
  Provider: anthropic

  Step 1   match  ✓  "Understood, I'll look into the account."
  Step 2   match  ✓  [tool_call: lookup_account → {status: active}]
  Step 3   match  ✓  "The account shows a balance of $1,240."
  Step 4  DIVERGED ✗
    Good:  "I'll proceed with the transfer."
    Bad:   "I cannot complete this action without explicit confirmation."

  Likely cause: prompt drift between runs (system prompt changed 09:14 UTC)
  Steps matched: 3 / 12

The Problem

AI agents fail in production. You can't reproduce it.

The model call that worked in staging silently uses a different system prompt. A tool returns slightly different output under load. A model version rolls out at midnight and three days later your agent starts refusing requests it used to handle. You have logs, but logs show what happened — not why it changed.

Rewind gives you the recording.

The Solution — 4 Commands

# 1. Record any agent run (works with any language or framework)
rewind record python my_agent.py --input '{"query": "process refund #8821"}'

# 2. Replay it locally — zero LLM API cost, deterministic output
rewind replay 7f3a2b

# 3. Find exactly where a bad run diverged from a good one
rewind bisect run-good-7f3a run-bad-9b2c

# 4. Export a cassette to share with your team
rewind export 7f3a2b --output incident-8821.rw

Install

pip install llm-rewind
rewind init      # generates local CA cert for HTTPS interception

That's it. No account, no cloud, no API key for Rewind itself.

Quick Start — First Cassette in 3 Steps

Step 1: Record

export ANTHROPIC_API_KEY=sk-...   # your existing key
rewind record python my_agent.py
# → Session recorded: 7f3a2b9c  (12 LLM calls, 3,241 tokens)

Step 2: Replay (no API key needed)

rewind replay 7f3a2b
# → Replaying 12 steps from cassette... done. Output identical.

Step 3: Inspect

rewind inspect 7f3a2b
# → Rich table: step-by-step model calls, token counts, latency

How It Works

Rewind runs as a local HTTPS proxy (via mitmproxy) that intercepts every LLM API call your agent makes — to OpenAI, Anthropic, or Gemini. Each request and response is stored in a content-addressed blob store (SHA-256, zstd-compressed) with DuckDB metadata. Because the proxy operates at the HTTP layer, Rewind works with any language and any framework — Python, Node.js, Go, LangChain, LlamaIndex, raw SDK calls, all of it.

On replay, Rewind starts the same proxy in replay mode. Incoming requests are matched by a canonical fingerprint (match_key) that strips volatile fields like tool_call_id while preserving semantic content. Matched requests get the exact recorded response bytes — SSE streaming preserved, token counts preserved, latency simulated. No LLM calls go out. The bisect engine walks two session recordings in parallel and identifies the first step where responses diverge, giving you a precise root cause instead of a log diff.

See docs/ARCHITECTURE.md for the full design.

SDK Decorator (Python shortcut)

Don't want to configure a proxy? Use the decorator mode for pure-Python agents:

import rewind

@rewind.session(name="customer_support", mode="record")
async def run_agent(query: str) -> str:
    ...

@rewind.tool  # records non-HTTP tool calls too
def search_database(query: str) -> list[dict]:
    ...

Comparison

Feature	Rewind	LangSmith	Braintrust	Laminar
True deterministic replay	✅	❌	❌	❌
Works with any language	✅	❌ Python/JS	❌ Python/JS	❌ Python
Local / private (no cloud)	✅	❌ cloud	❌ cloud	❌ cloud
Zero-cost replay	✅	❌	❌	❌
Bisect to find divergence	✅	❌	❌	❌
Shareable cassette files	✅ `.rw`	✅ datasets	✅ datasets	✅ datasets
Cost analytics	✅	✅	✅	✅
Open source	✅ MIT	✅ MIT	✅ MIT	✅ Apache

The key difference: LangSmith, Braintrust, and Laminar are observability tools — they show you what happened. Rewind is a debugger — it lets you reproduce and isolate failures deterministically.

Supported Providers

Provider	Recording	Streaming SSE
Anthropic (`api.anthropic.com`)	✅	✅
OpenAI (`api.openai.com`)	✅	✅
Google Gemini	✅	✅

pytest Integration

# Install: pip install llm-rewind

@pytest.mark.rewind(cassette="tests/cassettes/customer_support.rw")
async def test_agent_handles_refund_request():
    result = await run_customer_support_agent("I want a refund")
    assert "refund" in result.lower()

Cassettes are committed to git. Tests run with zero API cost in CI. See docs/testing/STRATEGY.md.

CLI Reference

rewind init                               # generate local CA cert
rewind record <command>                   # record an agent run
rewind replay <session-id>               # replay from cassette
rewind list                              # list recorded sessions
rewind inspect <session-id>              # inspect step details
rewind diff <session-a> <session-b>      # compare two sessions
rewind bisect <good-run> <bad-run>       # find first divergence
rewind export <session-id> [--output f.rw]   # export cassette file
rewind import <cassette.rw>              # import cassette to local DB
rewind stats [--days 30]                 # cost analytics

Contributing

git clone https://github.com/llm-rewind/rewind
cd rewind
pip install -e ".[dev]"
pytest                  # all tests use cassettes — no API key needed
ruff check src/ tests/
mypy src/ --strict

Issues and PRs welcome. See docs/ARCHITECTURE.md for design decisions and ADRs.

License

MIT — see LICENSE.

Built because production AI agent debugging was broken and no one had fixed it yet.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.2.1

May 25, 2026

0.2.0 yanked

May 24, 2026

This version

0.1.0

May 23, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_rewind-0.1.0.tar.gz (57.9 kB view details)

Uploaded May 23, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

llm_rewind-0.1.0-py3-none-any.whl (31.9 kB view details)

Uploaded May 23, 2026 Python 3

File details

Details for the file llm_rewind-0.1.0.tar.gz.

File metadata

Download URL: llm_rewind-0.1.0.tar.gz
Upload date: May 23, 2026
Size: 57.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for llm_rewind-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2cceeae695242f590fd0131ae06f1cb40e8fd6b59b6fb9a7aebd0771bc5fa7e6`
MD5	`70790fc2d9b5dd846173f79bdec4fab8`
BLAKE2b-256	`ac42c48709d2de2c5f03d59ea88872d94c324f207a6965d0adb44a4b5f801ae1`

See more details on using hashes here.

File details

Details for the file llm_rewind-0.1.0-py3-none-any.whl.

File metadata

Download URL: llm_rewind-0.1.0-py3-none-any.whl
Upload date: May 23, 2026
Size: 31.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for llm_rewind-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ef538d6d7dc5d676faa23de5bb30b0004317ce4496887524ce1ba7c7b792ac65`
MD5	`a91fb20291087836c821b3f544ec2bf7`
BLAKE2b-256	`22e00497905b7c26130d1b4094f93da039b30dd26d120703b11a2a18a7b81a50`

See more details on using hashes here.

llm-rewind 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Rewind — Time-Travel Debugger for AI Agents

The Problem

The Solution — 4 Commands

Install

Quick Start — First Cassette in 3 Steps

How It Works

SDK Decorator (Python shortcut)

Comparison

Supported Providers

pytest Integration

CLI Reference

Contributing

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes