AgentDBG
Chrome DevTools for AI Agents - Real-time debugging, pause, inspect, and step through your AI agent execution.
The Problem
Building AI agents is hard. Debugging them is harder.
"Your AI agent worked perfectly in testing. Then it hit production and called the wrong tool 14 times in a loop, burned $40 of API credits, and returned gibberish to your user. This is not a rare scenario. It's the default scenario."
Existing observability tools show you what happened after the fact. AgentDBG lets you watch it happen and stop it when things go wrong.
Features
- Real-time Visualization - Watch your agent's execution unfold in real time
- Pause & Resume - Stop execution at any point to inspect state
- Step-through Debugging - Advance one LLM call at a time
- Breakpoints - Pause on cost thresholds, errors, or custom conditions
- Cost Tracking - Real-time token and cost tracking per span
- Auto-instrumentation - Zero-config support for OpenAI, Anthropic, and LangChain
- Local-first - All data stays on your machine, with sub-millisecond overhead
Quick Start
Installation
pip install agentdbg
Basic Usage
Run any Python script with AgentDBG instrumentation:
agentdbg run my_agent.py
This will:
- Auto-instrument OpenAI, Anthropic, and LangChain calls
- Start the debugging UI at http://localhost:8766
- Open your browser to the live trace viewer
Manual Instrumentation
For more control, use the @traced decorator or trace context manager:
from agentdbg import trace, traced, SpanKind

# Using the decorator
@traced(name="process_query", kind=SpanKind.AGENT_STEP)
def process_query(query: str) -> str:
    # Your agent logic here; run_agent is a placeholder for your own code
    result = run_agent(query)
    return result

# Using the context manager
# call_llm and messages are placeholders for your own LLM call and its input
with trace(name="llm_call", kind=SpanKind.LLM_CALL) as span:
    response = call_llm(messages)
    span.output_data = {"response": response}
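To make the span model more concrete, here is a minimal sketch of how a decorator and context manager pair like this could record nested spans. It is not AgentDBG's implementation: `SketchSpan`, `sketch_trace`, `sketch_traced`, and the `COLLECTED` list are hypothetical names used only for illustration, and the "LLM call" is a placeholder.

```python
import contextvars
import functools
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical stand-ins: none of these names come from AgentDBG itself.
@dataclass
class SketchSpan:
    name: str
    parent: Optional["SketchSpan"] = None
    output_data: Any = None
    error: Optional[str] = None
    duration_s: float = 0.0


COLLECTED: list = []  # finished spans, in completion order
_current: contextvars.ContextVar = contextvars.ContextVar("current_span", default=None)


@contextmanager
def sketch_trace(name: str):
    """Open a span, make it the current parent, and record it on exit."""
    span = SketchSpan(name=name, parent=_current.get())
    token = _current.set(span)
    start = time.perf_counter()
    try:
        yield span
    except Exception as exc:
        span.error = repr(exc)
        raise
    finally:
        span.duration_s = time.perf_counter() - start
        _current.reset(token)
        COLLECTED.append(span)


def sketch_traced(name: str):
    """Decorator form: wrap each call in a span and store its return value."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with sketch_trace(name) as span:
                span.output_data = func(*args, **kwargs)
                return span.output_data
        return wrapper
    return decorator


@sketch_traced("process_query")
def process_query(query: str) -> str:
    with sketch_trace("llm_call") as span:
        span.output_data = query.upper()  # placeholder for a real LLM call
    return span.output_data
```

Calling `process_query("hi")` in this sketch leaves two spans in `COLLECTED`, with the inner `llm_call` span pointing at `process_query` as its parent.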
CLI Commands
# Run a script with debugging
agentdbg run script.py
# Run with cost limit (pause when exceeded)
agentdbg run script.py --cost-limit 1.0
# Run paused at start
agentdbg run script.py --pause-on-start
# Start server only (for external connections)
agentdbg server
# View recent traces
agentdbg traces
# Show statistics
agentdbg stats
# Clean up old traces
agentdbg cleanup --days 7
Debugger Controls
In the UI
- Pause - Stop execution at the current point
- Resume - Continue execution
- Step - Execute one span and pause again
- Clear - Remove all traces
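One common way to build pause, resume, and step controls like these is a gate that the instrumented code checks before each span. The sketch below uses `threading.Event` from the standard library; `PauseGate` and its methods are hypothetical names and may not match how AgentDBG does it internally.

```python
import threading


class PauseGate:
    """Hypothetical pause/resume/step gate; not AgentDBG's actual class."""

    def __init__(self) -> None:
        self._open = threading.Event()
        self._open.set()  # start in the running state
        self._step_once = False
        self._lock = threading.Lock()

    def pause(self) -> None:
        self._open.clear()

    def resume(self) -> None:
        with self._lock:
            self._step_once = False
        self._open.set()

    def step(self) -> None:
        """Let exactly one span through, then pause again."""
        with self._lock:
            self._step_once = True
        self._open.set()

    def wait_if_paused(self, timeout=None) -> bool:
        """Call before each span; blocks while paused. Returns False on timeout."""
        if not self._open.wait(timeout):
            return False
        with self._lock:
            if self._step_once:
                self._step_once = False
                self._open.clear()  # re-arm the pause after one span
        return True
```

The agent thread would call `wait_if_paused()` at the start of every span, while the UI thread calls `pause()`, `resume()`, or `step()` in response to button clicks.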
Breakpoints
Set breakpoints programmatically:
from agentdbg import get_debugger
debugger = get_debugger()
# Pause when cost exceeds $0.50
debugger.state.add_breakpoint(
    lambda span: span.cost.total_cost > 0.50
)

# Pause on any error
debugger.state.add_breakpoint(
    lambda span: span.error is not None
)

# Pause on specific span name
debugger.state.add_breakpoint(
    lambda span: "dangerous_tool" in span.name
)
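Predicate breakpoints like the ones above can be checked with a simple loop over the registered callables. The following is a sketch under assumptions: `FakeSpan`, `FakeCost`, and `should_pause` are hypothetical stand-ins that only mirror the attributes used in the lambdas (`cost.total_cost`, `error`, `name`), and skipping a predicate that raises is a design choice of this sketch, not documented AgentDBG behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FakeCost:
    total_cost: float = 0.0


@dataclass
class FakeSpan:
    """Hypothetical span with only the fields the predicates read."""
    name: str
    cost: FakeCost = field(default_factory=FakeCost)
    error: Optional[str] = None


def should_pause(span: FakeSpan, breakpoints: List[Callable[[FakeSpan], bool]]) -> bool:
    """Return True as soon as any breakpoint predicate matches the span."""
    for predicate in breakpoints:
        try:
            if predicate(span):
                return True
        except Exception:
            # A broken predicate should not crash the agent; skip it.
            continue
    return False


breakpoints = [
    lambda span: span.cost.total_cost > 0.50,
    lambda span: span.error is not None,
    lambda span: "dangerous_tool" in span.name,
]
```

With these three predicates, a span named `call_dangerous_tool` or one costing $0.75 would trigger a pause, while a cheap, error-free `search` span would not.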
Cost Tracking
AgentDBG automatically tracks costs for popular models:
from agentdbg.config import MODEL_COSTS
# Supported models:
# - OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini
# - Anthropic: claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-3-5-sonnet
Costs are displayed in real time in the UI and can trigger breakpoints.
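Per-span cost is typically derived from token counts and a per-model price table. The sketch below shows that arithmetic. The table `EXAMPLE_PRICES`, the model names in it, the prices, and the `Usage` and `estimate_cost` names are all made up for illustration; they are not the contents or structure of `MODEL_COSTS`.

```python
from dataclasses import dataclass

# Placeholder prices in USD per 1M tokens. These numbers are made up for the
# example and are NOT the contents of agentdbg.config.MODEL_COSTS.
EXAMPLE_PRICES = {
    "example-small": {"input": 1.00, "output": 2.00},
    "example-large": {"input": 10.00, "output": 30.00},
}


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


def estimate_cost(model: str, usage: Usage) -> float:
    """Return the estimated USD cost, or 0.0 for a model with no known price."""
    price = EXAMPLE_PRICES.get(model)
    if price is None:
        return 0.0
    return (
        usage.input_tokens * price["input"]
        + usage.output_tokens * price["output"]
    ) / 1_000_000
```

Returning 0.0 for an unknown model keeps tracing from failing on a new model name, at the price of under-reporting cost until a price is added.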
Auto-Instrumentation
OpenAI
from openai import OpenAI
from agentdbg.instrumentors import auto_instrument
auto_instrument() # Done automatically by CLI
client = OpenAI()
# All calls are now traced automatically
response = client.chat.completions.create(...)
Anthropic
from anthropic import Anthropic
from agentdbg.instrumentors import auto_instrument
auto_instrument()
client = Anthropic()
# All calls are now traced automatically
response = client.messages.create(...)
LangChain
from langchain.agents import AgentExecutor
from langchain_openai import ChatOpenAI
from agentdbg.instrumentors.langchain_instrumentor import AgentDBGCallbackHandler

# Use the callback handler
# agent and tools are assumed to be defined elsewhere in your code
handler = AgentDBGCallbackHandler()
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    callbacks=[handler],
)
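For SDK clients without a callback mechanism, automatic tracing is commonly done by wrapping the client method in place. The sketch below illustrates that general technique only; it is not taken from AgentDBG's instrumentors. `instrument_method`, `CALL_LOG`, and `FakeClient` are hypothetical names, and `FakeClient` stands in for a real SDK client so the example runs offline.

```python
import functools
import time

CALL_LOG = []  # hypothetical in-memory sink for recorded calls


def instrument_method(cls, method_name: str) -> None:
    """Replace cls.method_name with a wrapper that records each call."""
    original = getattr(cls, method_name)
    if getattr(original, "_sketch_wrapped", False):
        return  # already instrumented; avoid double wrapping

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return original(self, *args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise
        finally:
            CALL_LOG.append({
                "name": f"{cls.__name__}.{method_name}",
                "kwargs": kwargs,
                "error": error,
                "duration_s": time.perf_counter() - start,
            })

    wrapper._sketch_wrapped = True
    setattr(cls, method_name, wrapper)


class FakeClient:
    """Stand-in for an SDK client so the example runs without network access."""

    def create(self, **kwargs):
        return {"echo": kwargs["prompt"]}


instrument_method(FakeClient, "create")
instrument_method(FakeClient, "create")  # second call is a no-op
```

After this, `FakeClient().create(prompt="hello")` behaves as before but also appends one record to `CALL_LOG`. The idempotency guard matters because an instrumentation entry point may be called more than once in a process.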
Storage
Traces are stored locally in SQLite:
from agentdbg.storage import SQLiteStorage
storage = SQLiteStorage(".agentdbg/traces.db")
# Get recent traces
traces = storage.get_traces(limit=10)
# Get statistics
stats = storage.get_stats()
print(f"Total cost: ${stats['total_cost']:.2f}")
print(f"Total tokens: {stats['total_tokens']:,}")
# Clean up old data
storage.delete_old_traces(days=7)
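The statistics and cleanup operations above map naturally onto plain SQL. Here is a sketch using the standard-library `sqlite3` module. The `spans` table, its columns, and the helper function names are assumptions made for this example; the real schema behind `SQLiteStorage` may look different.

```python
import sqlite3
import time


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the hypothetical spans table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS spans (
               id INTEGER PRIMARY KEY,
               name TEXT NOT NULL,
               started_at REAL NOT NULL,   -- Unix timestamp, seconds
               tokens INTEGER NOT NULL DEFAULT 0,
               cost REAL NOT NULL DEFAULT 0.0
           )"""
    )
    return conn


def add_span(conn, name: str, started_at: float, tokens: int, cost: float) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO spans (name, started_at, tokens, cost) VALUES (?, ?, ?, ?)",
            (name, started_at, tokens, cost),
        )


def get_stats(conn) -> dict:
    """Aggregate total cost and tokens; COALESCE handles an empty table."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost), 0.0), COALESCE(SUM(tokens), 0) FROM spans"
    ).fetchone()
    return {"total_cost": row[0], "total_tokens": row[1]}


def delete_older_than(conn, days: float, now: float = None) -> int:
    """Delete spans started more than `days` ago; return the number removed."""
    cutoff = (time.time() if now is None else now) - days * 86400
    with conn:
        cur = conn.execute("DELETE FROM spans WHERE started_at < ?", (cutoff,))
    return cur.rowcount
```

Passing `now` explicitly makes the cleanup deterministic in tests; in normal use it defaults to the current time.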
Configuration
from agentdbg import DebugConfig, AgentDebugger

config = DebugConfig(
    # Server
    host="127.0.0.1",
    port=8765,
    ui_port=8766,
    # Auto-pause
    auto_pause_on_error=True,
    auto_pause_on_cost=1.0,  # Pause at $1.00
    auto_pause_on_tokens=100000,  # Pause at 100k tokens
    # Data capture
    capture_inputs=True,
    capture_outputs=True,
    max_input_size=10000,
    max_output_size=10000,
)

debugger = AgentDebugger(config=config)
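As a rough illustration of what the auto-pause and size-limit settings could mean at runtime, the sketch below checks running totals against thresholds and truncates captured text. `SketchConfig`, `pause_reason`, and `truncate` are hypothetical. Treating the thresholds as inclusive (`>=`) and the size limits as character counts are assumptions of this sketch, not documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical subset of the settings shown above."""
    auto_pause_on_error: bool = True
    auto_pause_on_cost: Optional[float] = 1.0
    auto_pause_on_tokens: Optional[int] = 100_000
    max_output_size: int = 10_000


def pause_reason(cfg: SketchConfig, total_cost: float,
                 total_tokens: int, had_error: bool) -> Optional[str]:
    """Return why execution should pause, or None to keep running."""
    if cfg.auto_pause_on_error and had_error:
        return "error"
    if cfg.auto_pause_on_cost is not None and total_cost >= cfg.auto_pause_on_cost:
        return "cost"
    if cfg.auto_pause_on_tokens is not None and total_tokens >= cfg.auto_pause_on_tokens:
        return "tokens"
    return None


def truncate(text: str, limit: int) -> str:
    """Keep at most `limit` characters, marking where the cut happened."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [{len(text) - limit} chars truncated]"
```

Setting a threshold to `None` in this sketch disables that check entirely.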
Architecture
┌─────────────────────────────────────────────────────────────┐
│                       Your Agent Code                       │
│           (OpenAI, Anthropic, LangChain, Custom)            │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        AgentDBG SDK                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ Instrumentor│  │    Core     │  │       Storage       │  │
│  │  (auto-wrap)│  │(trace/span) │  │      (SQLite)       │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                      WebSocket Server                       │
│                    (Real-time streaming)                    │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                           Web UI                            │
│  ┌────────────┐  ┌────────────┐  ┌────────────────────────┐ │
│  │ Trace List │  │ Span Tree  │  │ Inspector (State/Cost) │ │
│  └────────────┘  └────────────┘  └────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
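The Span Tree panel implies that spans arrive as flat records and get nested for display. A minimal sketch of that step follows, assuming each record carries an `id` and a `parent_id` (with `None` for roots). Those field names and the `build_span_tree` helper are hypothetical and not taken from AgentDBG's wire format.

```python
from collections import defaultdict


def build_span_tree(spans):
    """Nest flat span dicts (each with 'id' and 'parent_id') into a tree."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    def attach(span):
        return {
            "name": span["name"],
            "children": [attach(child) for child in children[span["id"]]],
        }

    # Roots are the spans whose parent_id is None.
    return [attach(root) for root in children[None]]


flat_spans = [
    {"id": 1, "parent_id": None, "name": "agent_step"},
    {"id": 2, "parent_id": 1, "name": "llm_call"},
    {"id": 3, "parent_id": 1, "name": "tool_call"},
]
tree = build_span_tree(flat_spans)
```

Here `tree` holds a single `agent_step` root with `llm_call` and `tool_call` as its children, in arrival order.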
Comparison
| Feature | AgentDBG | Langfuse | LangSmith |
|---|---|---|---|
| Real-time pause/resume | ✅ | ❌ | ❌ |
| Step-through debugging | ✅ | ❌ | ❌ |
| Breakpoints | ✅ | ❌ | ❌ |
| Local-first | ✅ | ⚠️ Self-host | ❌ |
| Zero-config | ✅ | ⚠️ | ⚠️ |
| Open source | ✅ | ✅ | ❌ |
| Cost tracking | ✅ | ✅ | ✅ |
Development
# Clone the repo
git clone https://github.com/agentdbg/agentdbg.git
cd agentdbg
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
ruff check src tests
mypy src
Contributing
Contributions are welcome! Please read our Contributing Guide for details.
License
MIT License - see LICENSE for details.
Acknowledgments
Built with frustration and love by developers who've spent too many hours staring at logs wondering why their agent decided to search Google 47 times in a row.
Stop guessing why your agent failed. See every thought. Pause anywhere. Fix it live.