llmreplay
Deterministic replay debugger for LLM agents. Records LLM calls and tool executions to SQLite, replays from the log with no network calls.
```python
from llmreplay import record, replay

# Record
with record("my_run", seed=42):
    response = client.chat.completions.create(...)

# Replay — zero network calls
session = replay("my_run")
for event in session.events():
    print(event.step, event.kind, event.payload)
```
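The core idea behind replay without network calls can be sketched with a toy cache (hypothetical names, not llmreplay's internals): responses are stored keyed by a stable hash of the request payload, so replay serves them from the log instead of hitting an API.

```python
import hashlib
import json

class ReplayCache:
    """Toy record/replay store: keys requests by a stable hash of their payload."""

    def __init__(self):
        self._log = {}

    def _key(self, request: dict) -> str:
        # Canonical JSON so semantically equal requests hash identically.
        return hashlib.sha256(
            json.dumps(request, sort_keys=True).encode()
        ).hexdigest()

    def record(self, request: dict, response: str) -> None:
        self._log[self._key(request)] = response

    def replay(self, request: dict) -> str:
        # Raises KeyError if the request was never recorded.
        return self._log[self._key(request)]

cache = ReplayCache()
cache.record({"model": "gpt-4o", "prompt": "hi"}, "hello!")
print(cache.replay({"prompt": "hi", "model": "gpt-4o"}))  # key order doesn't matter
```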
Install
```
pip install llmreplay
```
Requirements: Python >= 3.10
What gets recorded
- LLM requests/responses (OpenAI, Anthropic, Grok/xAI, Gemini)
- Tool calls/results (via the `@record_tool` decorator)
- Random seeds (Python `random`, numpy)
- Exceptions

Events are stored in `~/.llmreplay/<run_id>.db`.
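Since runs are plain SQLite files, they can also be poked at directly with the stdlib `sqlite3` module. The table layout below is a hypothetical stand-in (the real schema isn't documented here), built in memory for illustration:

```python
import json
import sqlite3

# In-memory stand-in for ~/.llmreplay/<run_id>.db with an assumed
# events(step, kind, payload) schema -- the real layout may differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (step INTEGER, kind TEXT, payload TEXT)")
db.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "llm_request", json.dumps({"model": "gpt-4o"})),
        (2, "tool_call", json.dumps({"name": "fetch_price"})),
    ],
)

for step, kind, payload in db.execute("SELECT * FROM events ORDER BY step"):
    print(step, kind, json.loads(payload))
```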
CLI
```
llmreplay list                   # List recorded runs
llmreplay view my_run            # Show all events
llmreplay view my_run --step 42  # Jump to step
llmreplay cost my_run            # Cost breakdown
llmreplay export my_run --json   # Export bug report
llmreplay web my_run             # Launch timeline UI
```
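A cost breakdown like `llmreplay cost` presumably aggregates recorded token usage per model. A back-of-the-envelope version of that calculation (the usage numbers and per-token prices below are made up):

```python
# Hypothetical recorded usage and per-1M-token prices (illustrative only).
events = [
    {"model": "gpt-4o", "prompt_tokens": 1200, "completion_tokens": 300},
    {"model": "gpt-4o", "prompt_tokens": 800, "completion_tokens": 200},
    {"model": "claude-3-5-sonnet", "prompt_tokens": 500, "completion_tokens": 100},
]
prices = {  # (input, output) USD per 1M tokens -- not real rates
    "gpt-4o": (2.50, 10.00),
    "claude-3-5-sonnet": (3.00, 15.00),
}

costs: dict[str, float] = {}
for e in events:
    inp, out = prices[e["model"]]
    cost = e["prompt_tokens"] / 1e6 * inp + e["completion_tokens"] / 1e6 * out
    costs[e["model"]] = costs.get(e["model"], 0.0) + cost

for model, usd in sorted(costs.items()):
    print(f"{model}: ${usd:.6f}")
```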
Features
Auto-instrumentation — OpenAI, Anthropic, Grok, Gemini, LangChain hooks install automatically within record() context.
Tool mocking — Record tool I/O with @record_tool, replay with ToolMocker:
```python
from llmreplay import ToolMocker, EventStore

mocker = ToolMocker()
mocker.load(EventStore("my_run"))

@mocker.mock(name="fetch_price")
def fetch_price(ticker: str) -> dict: ...  # returns recorded result
```
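Conceptually, a tool mocker looks up recorded results by tool name and arguments instead of executing the tool. A minimal self-contained stand-in (not llmreplay's actual implementation):

```python
import functools
import json

class MiniToolMocker:
    """Toy mocker: serves recorded results keyed by (tool name, canonical args)."""

    def __init__(self):
        self._recorded = {}

    def load_result(self, name: str, kwargs: dict, result):
        self._recorded[(name, json.dumps(kwargs, sort_keys=True))] = result

    def mock(self, name: str):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(**kwargs):
                # Serve the recorded result instead of running the real tool.
                return self._recorded[(name, json.dumps(kwargs, sort_keys=True))]
            return wrapper
        return decorator

mocker = MiniToolMocker()
mocker.load_result("fetch_price", {"ticker": "AAPL"}, {"price": 189.50})

@mocker.mock(name="fetch_price")
def fetch_price(ticker: str) -> dict:
    raise RuntimeError("never called during replay")

print(fetch_price(ticker="AAPL"))
```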
Regression testing — Run recorded traces against updated code:
```python
from llmreplay import RegressionSuite

suite = RegressionSuite()

@suite.case("run_001")
def check(original, session):
    return session.total_cost() <= original["total_cost_usd"] * 1.1

suite.run()
```
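The register-and-run pattern behind such a suite can be sketched in a few lines (a toy version with assumed shapes, not the library's API):

```python
class MiniSuite:
    """Toy regression suite: registers checks against stored run metadata."""

    def __init__(self, runs: dict):
        self._runs = runs   # run_id -> recorded metadata
        self._cases = []    # (run_id, check function)

    def case(self, run_id: str):
        def decorator(func):
            self._cases.append((run_id, func))
            return func
        return decorator

    def run(self) -> dict:
        # Map each registered check to pass/fail against its recorded run.
        return {f.__name__: bool(f(self._runs[rid])) for rid, f in self._cases}

suite = MiniSuite({"run_001": {"total_cost_usd": 0.42}})

@suite.case("run_001")
def cost_within_budget(original):
    # New session cost (hypothetical) must stay within 10% of the recording.
    new_cost = 0.45
    return new_cost <= original["total_cost_usd"] * 1.1

print(suite.run())
```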
Fork/branch — Copy a trace up to a step for counterfactual debugging:
```python
from llmreplay import fork

new_store = fork("broken_run", "fixed_run", at_step=50)
```
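Forking amounts to copying events up to a cut-off step into a fresh store; sketched here with plain lists (whether the cut-off is inclusive is an assumption):

```python
def fork_events(events: list[dict], at_step: int) -> list[dict]:
    """Copy events up to and including `at_step` into a new trace."""
    return [dict(e) for e in events if e["step"] <= at_step]

trace = [{"step": s, "kind": "llm_request"} for s in range(1, 100)]
branch = fork_events(trace, at_step=50)
print(len(branch))  # events 1..50
```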
Fine-tuning export — Export prompt/response pairs:
```python
from llmreplay import export_finetune_dataset

export_finetune_dataset(["run_001", "run_002"], "data.jsonl")
```
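The export presumably flattens recorded request/response pairs into JSONL. A stdlib sketch of that shape, with made-up pairs and the common chat-message layout (the real output format may differ):

```python
import json

# Hypothetical recorded pairs pulled from two runs.
pairs = [
    {"prompt": "What is 2+2?", "response": "4"},
    {"prompt": "Capital of France?", "response": "Paris"},
]

# One JSON object per line, the usual fine-tuning dataset shape.
lines = [
    json.dumps({"messages": [
        {"role": "user", "content": p["prompt"]},
        {"role": "assistant", "content": p["response"]},
    ]})
    for p in pairs
]
jsonl = "\n".join(lines)
print(jsonl.splitlines()[0])
```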
License
MIT
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file llmreplay-0.1.2.tar.gz.
File metadata
- Download URL: llmreplay-0.1.2.tar.gz
- Upload date:
- Size: 41.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `d4a8c6bafbaeb2aa431d911c3787e423b42e8cf9911cdac78a089b654ee3d366` |
| MD5 | `34ee0ae85e19097d51a7089fa280242b` |
| BLAKE2b-256 | `7a6fbd79286e2a5e479c36920519f1ec61b2dede5aa9258c568bf94748760c21` |
File details
Details for the file llmreplay-0.1.2-py3-none-any.whl.
File metadata
- Download URL: llmreplay-0.1.2-py3-none-any.whl
- Upload date:
- Size: 26.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `bf16eea1ca094b49e61422cb35a36494fb6f124e9464e0eff2d4061a95a41005` |
| MD5 | `af1d93c1ba7e95dc2db8dcefcb21a889` |
| BLAKE2b-256 | `edbb46c23d948fdd46182d838d7f651521a8245c06adb10dc98d80d8ddba3eec` |