The DVR for AI Agents - Record, visualize, and time-travel through agent execution

📼 Agent VCR

Time-travel debugging for AI agents.

📖 Documentation • 🚀 Examples


🛑 The Problem

Building multi-step AI agents with frameworks like LangGraph or CrewAI is painfully slow.

When your agent fails on step 8 out of 10, traditional observability tools only tell you what went wrong. To fix it, you have to patch the prompt or code and re-run all 10 steps from the beginning.

Every typo or logic error costs you minutes of waiting and dollars in wasted LLM tokens.

💡 The Solution

Agent VCR makes debugging instant.

We record your agent's state at every step. When a failure happens, you simply rewind to the failing step, edit the state to fix the bug, and resume execution from that exact point.
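The record, rewind, edit, and resume loop described above can be illustrated without the library. The sketch below is a toy stand-in, not Agent VCR's implementation: the three step functions, the `run` helper, and the `frames` and `calls` lists are all hypothetical names chosen for illustration.

```python
import copy

calls = []  # names of the steps that actually executed, in order

# Hypothetical three-step pipeline: each step takes a state dict and returns a new one.
def plan(state):
    return {**state, "plan": f"plan for {state['query']}"}

def draft(state):
    return {**state, "draft": state["plan"].upper()}

def review(state):
    if "TYPO" in state["draft"]:
        raise ValueError("draft still contains a typo")
    return {**state, "approved": True}

STEPS = [plan, draft, review]

def run(state, frames, start=0):
    """Run STEPS[start:], snapshotting the input state before each step."""
    for index in range(start, len(STEPS)):
        del frames[index:]                   # drop stale frames from an earlier run
        frames.append(copy.deepcopy(state))  # record this step's input state
        calls.append(STEPS[index].__name__)
        state = STEPS[index](state)
    return state

# First run fails at the last step (index 2).
frames = []
try:
    run({"query": "typo in the brief"}, frames)
except ValueError as exc:
    print(f"step {len(frames) - 1} failed: {exc}")

# Rewind to the failing step, edit its recorded input, and resume from there.
fixed = copy.deepcopy(frames[2])
fixed["draft"] = "PLAN FOR THE BRIEF"
result = run(fixed, frames, start=2)
print(calls)
```

Only `review` runs a second time; `plan` and `draft` are not re-executed, which is where the time and token savings would come from.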

LangSmith and LangFuse show you what happened. Agent VCR lets you change it.

  • 🔌 Plug & Play: 1-line integration with LangGraph and others.
  • 🚀 Low Overhead: under 5 ms of added latency per step.
  • 📁 No Vendor Lock-in: Stores runs locally as git-friendly JSONL.
  • 🔄 Async Native: Built from the ground up for modern asyncio agents.

🔥 Quick Start

pip install ai-agent-vcr

from agent_vcr import VCRRecorder, VCRPlayer

# 1. Record your agent (One-time setup)
recorder = VCRRecorder()
recorder.start_session("bug_hunt")
# ... your agent code runs here ...
recorder.save()

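# 2. Time-Travel & Fix (The magic part)
# resume() takes an agent callable and a ResumeConfig (see API Reference).
# Importing ResumeConfig from the package root is an assumption.
from agent_vcr import ResumeConfig

player = VCRPlayer.load(".vcr/bug_hunt.vcr")

state = player.goto_frame(2)    # Jump back to step 2 and inspect its state
config = ResumeConfig(
    from_frame=2,                          # Resume execution from step 2
    state_overrides={"prompt": "Fixed!"},  # Fix the bad state
)
player.resume(my_agent, config)  # my_agent: placeholder for your agent callable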

Features

  • 🔴 Live Recording: Watch your agent execute in real-time via WebSocket
  • ⏮️ Time Travel: Jump to any step, inspect full state
  • ✏️ State Injection: Edit state and resume execution
  • 🌳 DAG Visualization: See parallel execution branches
  • 🔌 Framework Agnostic: Works with LangGraph, CrewAI, or raw Python
  • 📁 Git-Friendly Format: JSONL files, version controllable
  • 🚀 Production Performance: <5ms overhead per frame
  • 🔄 Async-First: Full async recorder and player support

Who Is This For?

| If you are... | Agent VCR helps you... |
|---|---|
| An AI engineer debugging LangGraph agents | Rewind to the exact failing step, fix state, and resume without re-running the whole chain |
| A team lead reviewing agent behavior | Compare two execution paths side-by-side with full state diffs |
| A researcher iterating on prompts | Fork from any step, change the prompt, and see how downstream behavior changes |
| Building production agents | Record every execution in JSONL for audit trails and regression testing |

How Does It Compare?

| Feature | Agent VCR | LangSmith | LangFuse | Arize Phoenix |
|---|---|---|---|---|
| Record execution traces | ✅ | ✅ | ✅ | ✅ |
| Time-travel to any step | ✅ | ❌ | ❌ | ❌ |
| Edit state & resume | ✅ | ❌ | ❌ | ❌ |
| Fork from any frame | ✅ | ❌ | ❌ | ❌ |
| Compare execution runs | ✅ | ✅ | ⚠️ | ⚠️ |
| Self-hosted / local-first | ✅ | ❌ | ✅ | ✅ |
| Git-friendly format (JSONL) | ✅ | ❌ | ❌ | ❌ |
| Framework agnostic | ✅ | ⚠️ LangChain | ✅ | ✅ |
| Zero external dependencies | ✅ | ❌ Cloud | ❌ Cloud | ✅ |
| Setup lines | 3 | ~15 | ~10 | ~10 |

Framework Integrations

LangGraph

from typing import TypedDict

from langgraph.graph import StateGraph
from agent_vcr import VCRRecorder
from agent_vcr.integrations.langgraph import VCRLangGraph

# StateGraph requires a state schema; this minimal one is illustrative
class AgentState(TypedDict):
    query: str

# Your existing LangGraph code (planner_node and coder_node are your own node functions)
graph = StateGraph(AgentState)
graph.add_node("planner", planner_node)
graph.add_node("coder", coder_node)
graph.add_edge("planner", "coder")

# Add VCR recording with one line
recorder = VCRRecorder()
graph = VCRLangGraph(recorder).wrap_graph(graph)

# Run normally; recording happens automatically
result = graph.invoke({"query": "Build a todo app"})

Raw Python

from agent_vcr import VCRRecorder
from agent_vcr.integrations.langgraph import vcr_record

recorder = VCRRecorder()

@vcr_record(recorder, node_name="my_function")
def my_function(data):
    return process(data)  # process() stands in for your own logic

# Each call is automatically recorded
result = my_function({"key": "value"})
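To make the decorator pattern above concrete, here is a minimal sketch of how a recording decorator could be built with the standard library. It is not the implementation of `vcr_record`: `record_calls`, the plain-list `log`, and the `error` key are hypothetical, and only the `node_name`, `input_state`, and `output_state` field names are borrowed from the storage format documented below.

```python
import copy
import functools

def record_calls(log, node_name):
    """Decorator factory: append one frame per call to `log` (a plain list)."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(state):
            # Snapshot the input so later mutation cannot alter the recording
            frame = {"node_name": node_name, "input_state": copy.deepcopy(state)}
            try:
                result = func(state)
            except Exception as exc:
                frame["error"] = repr(exc)  # record the failure, then re-raise
                log.append(frame)
                raise
            frame["output_state"] = copy.deepcopy(result)
            log.append(frame)
            return result
        return wrapper
    return decorator

log = []

@record_calls(log, node_name="double")
def double(state):
    return {"value": state["value"] * 2}

result = double({"value": 21})
print(log)
```

Deep-copying on both sides costs some time per call, but it keeps each frame independent of whatever the agent does to the state afterwards.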

CrewAI

from crewai import Crew, Agent, Task
from agent_vcr import VCRRecorder
from agent_vcr.integrations.crewai import VCRCrewAI, vcr_task

recorder = VCRRecorder()
recorder.start_session("crew_debug_run")

# Option 1: Wrap the whole crew (auto-records every task)
# researcher, writer, research_task, and write_task are your own Agent and Task objects
crew = Crew(agents=[researcher, writer], tasks=[research_task, write_task])
vcr_crew = VCRCrewAI(recorder)
result = vcr_crew.kickoff(crew)

recorder.save()

# Option 2: Decorate individual task functions
@vcr_task(recorder, task_name="research_step")
def research(context: dict) -> str:
    return "findings..."

Install with:

pip install "ai-agent-vcr[crewai]"

See examples/crewai_integration.py for a full runnable demo.


Storage Format

Agent VCR uses JSONL (JSON Lines) for storage:

{"type": "session", "data": {"session_id": "abc123", "created_at": "2024-01-01T00:00:00Z", ...}}
{"type": "frame", "data": {"frame_id": "...", "node_name": "planner", "input_state": {...}, "output_state": {...}, ...}}
{"type": "frame", "data": {...}}

Benefits:

  • ✅ Human-readable
  • ✅ Git-diffable
  • ✅ Append-only (efficient for streaming)
  • ✅ Line-by-line parsing (no need to load entire file)
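The append-only and line-by-line properties can be shown with the standard library alone. This sketch uses an in-memory buffer in place of a real recording file; the `type`, `data`, `session_id`, and `node_name` keys follow the sample above, while `iter_records`, `node_names`, and the `tester` frame are hypothetical.

```python
import io
import json

# In-memory stand-in for a recording: one JSON object per line
SAMPLE = "\n".join([
    json.dumps({"type": "session", "data": {"session_id": "abc123"}}),
    json.dumps({"type": "frame", "data": {"node_name": "planner"}}),
    json.dumps({"type": "frame", "data": {"node_name": "coder"}}),
]) + "\n"

def iter_records(stream):
    """Yield one parsed record per non-empty line, without loading the whole file."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

def node_names(stream):
    """Collect node names from frame records only."""
    return [r["data"]["node_name"] for r in iter_records(stream) if r["type"] == "frame"]

buffer = io.StringIO(SAMPLE)

# Appending a frame is a single write at the end; earlier lines are never rewritten
buffer.seek(0, io.SEEK_END)
buffer.write(json.dumps({"type": "frame", "data": {"node_name": "tester"}}) + "\n")

buffer.seek(0)
names = node_names(buffer)
print(names)
```

With a real file, the same `iter_records` generator would accept the object returned by `open(path)`.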

Performance

Performance is continuously benchmarked in CI to ensure <5ms recording overhead.

To run the reproducible benchmarks on your own hardware:

pytest tests/benchmarks/ -v
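For a rough sense of what "recording overhead" means, the sketch below times a step with and without a snapshot of its input and output. It does not use Agent VCR or reproduce the project's benchmarks: `step`, `recorded_step`, and `median_ms` are hypothetical helpers, and the numbers depend entirely on your hardware and state size.

```python
import copy
import statistics
import time

def step(state):
    """Stand-in for one agent step."""
    return {**state, "count": state["count"] + 1}

def recorded_step(state, frames):
    """The same step, plus deep-copied snapshots of its input and output."""
    frames.append({"input_state": copy.deepcopy(state)})
    out = step(state)
    frames[-1]["output_state"] = copy.deepcopy(out)
    return out

def median_ms(func, repeats=200):
    """Median wall-clock time of func() in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

state = {"count": 0, "messages": ["hello"] * 100}
frames = []
baseline_ms = median_ms(lambda: step(state))
recorded_ms = median_ms(lambda: recorded_step(state, frames))
overhead_ms = recorded_ms - baseline_ms
print(f"recording overhead: {overhead_ms:.4f} ms per step")
```

The median is used rather than the mean so that a single slow iteration does not dominate the result.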

API Reference

VCRRecorder

class VCRRecorder:
    def start_session(
        self,
        session_id: str = None,
        parent_session_id: str = None,
        forked_from_frame: int = None,
        metadata: dict = None,
        tags: list[str] = None,
    ) -> Session

    def record_step(
        self,
        node_name: str,
        input_state: dict,
        output_state: dict,
        metadata: FrameMetadata = None,
        frame_type: FrameType = FrameType.NODE_EXECUTION,
    ) -> Frame

    def record_llm_call(...)
    def record_tool_call(...)
    def record_error(...)
    def save(self) -> Path
    def fork(self, from_frame: int, ...) -> VCRRecorder

VCRPlayer

class VCRPlayer:
    @classmethod
    def load(cls, filepath: str) -> VCRPlayer

    def goto_frame(self, index: int) -> dict
    def get_frame(self, index: int) -> Frame
    def list_nodes(self) -> list[str]
    def get_errors(self) -> list[Frame]
    def compare_frames(self, a: int, b: int) -> dict
    def resume(self, agent_callable: Callable, config: ResumeConfig) -> str
    def export_state(self, frame_index: int) -> dict
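The listing above does not document the shape of the dict that `compare_frames` returns. As a rough illustration of what a frame comparison could compute, here is a hypothetical `diff_states` helper that reports added, removed, and changed top-level keys between two state dicts; both the helper and its return format are assumptions, not the library's behavior.

```python
def diff_states(before, after):
    """Return added, removed, and changed top-level keys between two state dicts."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: {"before": before[k], "after": after[k]}
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}

# Two illustrative states, standing in for the states at two recorded frames
frame_a = {"query": "Build a todo app", "plan": "draft", "retries": 0}
frame_b = {"query": "Build a todo app", "plan": "final", "code": "print('hi')"}

delta = diff_states(frame_a, frame_b)
print(delta)
```

A shallow diff like this is enough to spot which keys a step touched; nested state would need a recursive comparison.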

ResumeConfig

class ResumeConfig:
    from_frame: int              # Frame to resume from
    new_session_id: str = None   # Optional ID for forked session
    state_overrides: dict = {}   # State changes to apply
    mode: ResumeMode = FORK      # FORK, REPLAY, or MOCK
    skip_nodes: list[str] = []   # Nodes to skip during replay
    inject_mocks: dict = {}      # Mock values for dependencies
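To show how the fields above might interact, the sketch below models them as a standard-library dataclass and computes a starting state and the list of nodes to re-run. `ResumeConfigSketch`, `plan_resume`, and the enum values are hypothetical and are not the library's classes. Starting from the recorded `input_state` of `from_frame` is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class ResumeMode(Enum):
    # Member names come from the listing above; the values are placeholders
    FORK = "fork"
    REPLAY = "replay"
    MOCK = "mock"

@dataclass
class ResumeConfigSketch:
    from_frame: int
    new_session_id: Optional[str] = None
    # default_factory gives every instance its own dict or list
    state_overrides: dict = field(default_factory=dict)
    mode: ResumeMode = ResumeMode.FORK
    skip_nodes: list = field(default_factory=list)
    inject_mocks: dict = field(default_factory=dict)

def plan_resume(frames, config):
    """Return the starting state and the node names that would run again."""
    recorded = frames[config.from_frame]["input_state"]
    start_state = {**recorded, **config.state_overrides}  # overrides win
    to_run = [
        f["node_name"]
        for f in frames[config.from_frame:]
        if f["node_name"] not in config.skip_nodes
    ]
    return start_state, to_run

frames = [
    {"node_name": "planner", "input_state": {"prompt": "Biuld a todo app"}},
    {"node_name": "coder", "input_state": {"prompt": "Biuld a todo app", "plan": "v1"}},
    {"node_name": "tester", "input_state": {"prompt": "Biuld a todo app", "plan": "v1"}},
]
config = ResumeConfigSketch(
    from_frame=1,
    state_overrides={"prompt": "Build a todo app"},
    skip_nodes=["tester"],
)
start_state, to_run = plan_resume(frames, config)
print(start_state, to_run)
```

The dataclass uses `default_factory` because a literal `{}` or `[]` default on a plain class would be shared across instances.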

Examples

See the examples/ directory for runnable demos.

Run an example:

python examples/time_travel_demo.py

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

git clone https://github.com/agent-vcr/agent-vcr.git
cd agent-vcr
pip install -e ".[dev]"

Running Tests

# Unit tests
pytest tests/unit/ -v

# Integration tests
pytest tests/integration/ -v

# E2E tests
pytest tests/e2e/ -v

# Benchmarks
pytest tests/benchmarks/ -v

# With coverage
pytest --cov=agent_vcr --cov-report=html

Roadmap

  • Core recording and playback
  • Time-travel resume
  • FastAPI server with WebSocket
  • LangGraph integration
  • Async recorder and player
  • Terminal TUI debugger (vcr-tui)
  • CI/CD integrations
  • React dashboard
  • CrewAI integration
  • AutoGen integration
  • Cloud storage backend
  • Collaborative debugging

License

MIT License. See LICENSE for details.


Acknowledgments

Inspired by:

  • LangSmith: For the observability paradigm
  • GDB: For the time-travel debugging concept
  • Chrome DevTools: For the UX patterns

Built with ❤️ by the Agent VCR community
