A flight recorder for AI agents – record every decision, tool call, and failure
🔲 AgentBlackBox
A flight recorder for AI agents.
Record every decision, tool call, and failure. Replay them later.
The problem
72% of AI agent projects never reach production.
Not because the agents are wrong — but because they're invisible.
You can't debug what you can't see. You can't trust what you can't audit.
AgentBlackBox is the flight recorder your AI agents need.
Install
pip install agentblackbox # core only (zero dependencies)
pip install "agentblackbox[dashboard]" # + web UI (quoted so shells like zsh do not expand the brackets)
Requires Python 3.10+.
Quickstart
from agentblackbox import BlackBox

# Drop-in decorator: existing code unchanged
@BlackBox.record(agent_name="researcher")
def run_agent(task: str):
    # your existing agent code here
    ...

run_agent("Summarize today's AI news")

# See what happened
sessions = BlackBox.list_sessions()
BlackBox.replay(sessions[0].session_id)
That's it. Every LLM call, tool use, cost, and error is now recorded locally.
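To make the idea behind the decorator concrete, here is a minimal, standard-library-only sketch of how a recording decorator could work. It is not AgentBlackBox's implementation: `record`, `EVENTS`, and the event fields are hypothetical names chosen for illustration, and the log lives in memory rather than in SQLite.

```python
import functools
import time
import traceback

# Hypothetical in-memory log; a real recorder would persist events instead.
EVENTS = []


def record(agent_name):
    """Illustrative decorator: log each call's status, duration, and any error."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            event = {"agent": agent_name, "function": func.__name__, "status": "ok"}
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Capture the error details, then re-raise so behavior is unchanged.
                event.update(
                    status="error",
                    error_type=type(exc).__name__,
                    message=str(exc),
                    stack=traceback.format_exc(),
                )
                raise
            finally:
                # Runs on both success and failure, so every call is logged once.
                event["duration_ms"] = (time.perf_counter() - start) * 1000
                EVENTS.append(event)

        return wrapper

    return decorator


@record(agent_name="researcher")
def run_agent(task):
    if not task:
        raise ValueError("empty task")
    return f"done: {task}"
```

Because the wrapper re-raises after logging, the decorated function behaves exactly as before from the caller's point of view, which is what makes this pattern a drop-in.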
Remote / hosted mode
You can mirror recordings to a hosted AgentBlackBox dashboard while still keeping the local SQLite log:
from agentblackbox import BlackBox
from agentblackbox.remote import RemoteStorage

remote_store = RemoteStorage(
    api_key="abx_...",
    endpoint="https://your-agentblackbox.example.com",
)

with BlackBox.session("researcher", storage=remote_store) as bb:
    # Positional arguments: tool name, arguments, return value, execution time (presumably in ms)
    bb.record_tool_call("search", {"q": "ai evals"}, {"hits": 12}, 83.4)
To run the dashboard in authenticated cloud mode:
agentblackbox dashboard --cloud
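The mirroring described in this section (remote copy plus local log) can be pictured with the sketch below. It is only one plausible structure under stated assumptions, not how `RemoteStorage` is actually built: `ListStorage`, `FailingStorage`, `MirroredStorage`, and the `write` method are all hypothetical names.

```python
class ListStorage:
    """Hypothetical backend that keeps events in a list (stands in for SQLite)."""

    def __init__(self):
        self.events = []

    def write(self, event):
        self.events.append(event)


class FailingStorage:
    """Hypothetical remote backend that is currently unreachable."""

    def write(self, event):
        raise ConnectionError("remote unreachable")


class MirroredStorage:
    """Write locally first, then mirror to the remote backend on a best-effort basis."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.pending = []  # events that still need to be sent to the remote

    def write(self, event):
        self.local.write(event)  # the local log is always updated
        try:
            self.remote.write(event)
        except Exception:
            # A remote outage must not lose data or break the agent run.
            self.pending.append(event)


healthy = MirroredStorage(ListStorage(), ListStorage())
healthy.write({"kind": "tool_call", "name": "search"})

degraded = MirroredStorage(ListStorage(), FailingStorage())
degraded.write({"kind": "tool_call", "name": "search"})
```

Writing to the local store before the remote one means a network failure never costs you the recording itself.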
What gets recorded
| Event | Details |
|---|---|
| 🤖 LLM call | model, prompt, output, input/output tokens, cost, latency |
| 🔧 Tool call | name, arguments, return value, execution time |
| ❌ Error | type, message, full stack trace, timestamp |
By default, all data is stored in a local SQLite file (~/.agentblackbox/recordings.db).
Nothing is sent to any external server unless you opt in to remote / hosted mode.
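As a rough picture of what an append-only event log in SQLite can look like, here is a small sketch using Python's built-in `sqlite3` module. The table layout, column names, and helper functions are assumptions made for illustration; the actual schema of `recordings.db` is not documented here and may differ.

```python
import json
import sqlite3
import time


def open_store(path=":memory:"):
    """Open a database and create a hypothetical events table if it is missing."""
    conn = sqlite3.connect(path)
    # kind is one of 'llm_call', 'tool_call', 'error'; payload holds JSON details.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               kind TEXT NOT NULL,
               payload TEXT NOT NULL,
               created_at REAL NOT NULL
           )"""
    )
    return conn


def append_event(conn, session_id, kind, payload):
    """Append one event; rows are never updated, only added."""
    conn.execute(
        "INSERT INTO events (session_id, kind, payload, created_at) VALUES (?, ?, ?, ?)",
        (session_id, kind, json.dumps(payload), time.time()),
    )
    conn.commit()


def replay(conn, session_id):
    """Return a session's events in the order they were recorded."""
    rows = conn.execute(
        "SELECT kind, payload FROM events WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [(kind, json.loads(payload)) for kind, payload in rows]


conn = open_store()  # in-memory here; pass a file path to persist
append_event(conn, "s1", "tool_call", {"name": "search", "arguments": {"q": "ai evals"}})
append_event(conn, "s1", "error", {"type": "TimeoutError", "message": "search timed out"})
timeline = replay(conn, "s1")
```

Ordering by an autoincrementing `id` rather than by timestamp keeps replay deterministic even when two events share the same clock reading.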
Usage patterns
Decorator
@BlackBox.record(agent_name="coder")
def coding_agent(task):
    ...
Context manager
with BlackBox.session("planner") as bb:
    plan = agent.run(task)  # `agent` and `task` come from your own code
    bb.record_tool_call("search", {"query": task}, result=plan)
Manual recording
with BlackBox.session("custom") as bb:
    bb.record_llm_call(
        model="gpt-4o",
        input_text="Summarize this",
        output_text="Here is a summary...",
        input_tokens=150,
        output_tokens=80,
        duration_ms=400.0,
    )
OpenAI Agents SDK (auto-instrument)
from agentblackbox.integrations import patch_openai_agents
patch_openai_agents() # All agents recorded automatically
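Auto-instrumentation of this kind is commonly done by monkey-patching: replacing a method on a class with a wrapper that records the call and then delegates to the original. The sketch below shows that general technique on a stand-in class. It does not reproduce what `patch_openai_agents` does internally, and `Agent`, `patch_agent`, and `CALLS` are hypothetical names.

```python
import functools

CALLS = []  # hypothetical in-memory log


class Agent:
    """Stand-in for a third-party agent class; not the OpenAI Agents SDK."""

    def __init__(self, name):
        self.name = name

    def run(self, task):
        return f"{self.name} handled {task!r}"


def patch_agent(cls=Agent):
    """Replace cls.run with a recording wrapper; safe to call more than once."""
    original = cls.run
    if getattr(original, "_is_recorded", False):
        return  # already patched, avoid double-wrapping

    @functools.wraps(original)
    def recorded_run(self, task):
        result = original(self, task)
        CALLS.append({"agent": self.name, "task": task, "result": result})
        return result

    recorded_run._is_recorded = True
    cls.run = recorded_run


patch_agent()
patch_agent()  # second call is a no-op
output = Agent("planner").run("plan trip")
```

The idempotency guard matters in practice: without it, calling the patch function twice would record every run twice.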
Web Dashboard
agentblackbox dashboard
# → http://localhost:8765
- Sessions — all runs with status, cost, duration, auto-refreshes every 30s
- Timeline — step-by-step replay with expandable LLM inputs/outputs
- Analytics — daily cost trends, per-agent breakdown, model distribution
CLI
agentblackbox sessions # list all sessions
agentblackbox replay <session_id> # console replay
agentblackbox export <session_id> # JSON export
agentblackbox dashboard --port 8765 # web UI
Cost tracking
Supports 20+ models with automatic cost calculation, including:
| Provider | Models |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, o1, o3-mini |
| Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-haiku, claude-3-5-haiku |
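Token-based cost calculation usually comes down to a price lookup and a multiplication. The sketch below illustrates that arithmetic only: the model names and prices are invented placeholders, not real provider pricing and not the table AgentBlackBox ships with, and `estimate_cost` is a hypothetical helper.

```python
# Placeholder USD prices per 1M tokens as (input, output). Invented for illustration.
PRICES_PER_MILLION = {
    "example-small": (1.00, 2.00),
    "example-large": (10.00, 30.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost, or None when the model has no price entry."""
    price = PRICES_PER_MILLION.get(model)
    if price is None:
        return None  # unknown model: report "no estimate" rather than guess
    input_price, output_price = price
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 150 input tokens and 80 output tokens on the placeholder "example-large" model:
# (150 * 10.00 + 80 * 30.00) / 1_000_000 = 0.0039 USD
cost = estimate_cost("example-large", input_tokens=150, output_tokens=80)
unknown = estimate_cost("not-in-table", input_tokens=150, output_tokens=80)
```

Returning `None` for an unlisted model keeps an unpriced call distinguishable from a call that genuinely cost nothing.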
License
MIT © 2026 Takumu Hata
File details
Details for the file agentblackbox-0.2.0.tar.gz.
File metadata
- Download URL: agentblackbox-0.2.0.tar.gz
- Upload date:
- Size: 37.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c2518a8c3dbf77fdaa0833e86f31fb628f4f9978f5bbc86c578274c35f9cfac3 |
| MD5 | 41eec3c9608ffc14e44ab731e03d377d |
| BLAKE2b-256 | 2c63a6994a78ad1bd7eb323f9ce36682d8ac5210af2d34049f06fe93c00652bd |
File details
Details for the file agentblackbox-0.2.0-py3-none-any.whl.
File metadata
- Download URL: agentblackbox-0.2.0-py3-none-any.whl
- Upload date:
- Size: 34.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7c381484a3f1e745f1faed8cb52cb45c25e29549bd960633cc05635633dcc984 |
| MD5 | f2812519657e6c6a38e273d826b6c314 |
| BLAKE2b-256 | eaae6132e73acf33edcfb9e5d1f28725672f1764be1cfac1ff1f0dc26840c7be |