A flight recorder for AI agents – record every decision, tool call, and failure

🔲 AgentBlackBox

A flight recorder for AI agents.
Record every decision, tool call, and failure. Replay them later.


The problem

72% of AI agent projects never reach production.

Not because the agents are wrong — but because they're invisible.
You can't debug what you can't see. You can't trust what you can't audit.

AgentBlackBox is the flight recorder your AI agents need.


Install

pip install agentblackbox                 # core only (zero dependencies)
pip install "agentblackbox[dashboard]"    # + web UI (quoted so shells such as zsh do not expand the brackets)

Requires Python 3.10+.


Quickstart

from agentblackbox import BlackBox

# Drop-in decorator — existing code unchanged
@BlackBox.record(agent_name="researcher")
def run_agent(task: str):
    # your existing agent code here
    ...

run_agent("Summarize today's AI news")

# See what happened
sessions = BlackBox.list_sessions()
BlackBox.replay(sessions[0].session_id)

That's it. Every LLM call, tool use, cost, and error is now recorded locally.

Remote / hosted mode

You can mirror recordings to a hosted AgentBlackBox dashboard while still keeping the local SQLite log:

from agentblackbox import BlackBox
from agentblackbox.remote import RemoteStorage

remote_store = RemoteStorage(
    api_key="abx_...",
    endpoint="https://your-agentblackbox.example.com",
)

with BlackBox.session("researcher", storage=remote_store) as bb:
    # Positional arguments: tool name, arguments, return value, execution time
    bb.record_tool_call("search", {"q": "ai evals"}, {"hits": 12}, 83.4)
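
Hardcoding an API key in source is easy to leak, so one option is to read the remote settings from the environment. The sketch below is a minimal helper for that; the variable names ABX_API_KEY and ABX_ENDPOINT are hypothetical, chosen for illustration, and are not known to be read by AgentBlackBox itself.

```python
import os


def remote_settings(env=os.environ):
    """Read remote storage settings from environment variables.

    ABX_API_KEY and ABX_ENDPOINT are hypothetical names used only in
    this sketch; pick whatever names fit your deployment.
    """
    api_key = env.get("ABX_API_KEY")
    if not api_key:
        raise RuntimeError("ABX_API_KEY is not set")
    # Fall back to the placeholder endpoint from the example above.
    endpoint = env.get("ABX_ENDPOINT", "https://your-agentblackbox.example.com")
    return {"api_key": api_key, "endpoint": endpoint}


# Usage, with RemoteStorage imported as in the example above:
# remote_store = RemoteStorage(**remote_settings())
```

The returned keys match the api_key and endpoint keyword arguments shown above, so the dictionary can be unpacked straight into RemoteStorage.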

To run the dashboard in authenticated cloud mode:

agentblackbox dashboard --cloud

What gets recorded

| Event | Details |
|---|---|
| 🤖 LLM call | model, prompt, output, input/output tokens, cost, latency |
| 🔧 Tool call | name, arguments, return value, execution time |
| Error | type, message, full stack trace, timestamp |

By default, all data is stored in a local SQLite file (~/.agentblackbox/recordings.db).
Nothing is sent to any external server unless you opt in to remote mode.
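
Because the recordings live in an ordinary SQLite file, they can be inspected with Python's standard library alone. The sketch below lists each table and its row count without assuming anything about the schema, which is not documented here; only the file path comes from this README.

```python
import sqlite3
from pathlib import Path

DEFAULT_DB = Path.home() / ".agentblackbox" / "recordings.db"


def table_row_counts(db_path=DEFAULT_DB):
    """Return {table_name: row_count} for every table in the database."""
    # Open read-only so inspection can never modify the recordings.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


if __name__ == "__main__" and DEFAULT_DB.exists():
    for name, count in table_row_counts().items():
        print(f"{name}: {count} rows")
```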


Usage patterns

Decorator

@BlackBox.record(agent_name="coder")
def coding_agent(task):
    ...
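
For intuition about what a recording decorator does, here is a conceptual sketch of the general technique. It is not AgentBlackBox's implementation: the record function, the EVENTS list, and the event fields are all invented for illustration, and events are kept in memory rather than in SQLite.

```python
import functools
import time
import traceback
from datetime import datetime, timezone

EVENTS = []  # in-memory stand-in for a real store such as SQLite


def record(agent_name):
    """Conceptual recorder: logs status, duration, and any error per call."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            event = {
                "agent": agent_name,
                "function": func.__name__,
                "started_at": datetime.now(timezone.utc).isoformat(),
                "status": "ok",
            }
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                event["status"] = "error"
                event["error"] = {
                    "type": type(exc).__name__,
                    "message": str(exc),
                    "stack_trace": traceback.format_exc(),
                }
                raise  # re-raise so the caller still sees the failure
            finally:
                # Runs on success and on failure, so every call is logged.
                event["duration_ms"] = (time.perf_counter() - start) * 1000
                EVENTS.append(event)
        return wrapper
    return decorator


@record(agent_name="coder")
def coding_agent(task):
    if not task:
        raise ValueError("empty task")
    return f"done: {task}"
```

The key design point is that the wrapper re-raises the exception after logging it, so adding the decorator does not change how the wrapped function behaves for its callers.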

Context manager

with BlackBox.session("planner") as bb:
    plan = agent.run(task)  # agent and task come from your own code
    bb.record_tool_call("search", {"query": task}, result=plan)
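
When recording tool calls by hand, you may also want to capture how long the tool took. The sketch below measures elapsed time with time.perf_counter; timed_call and search are hypothetical helpers, and passing the elapsed time as the fourth positional argument follows the remote-mode example, on the assumption that it is the execution time in milliseconds.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed_ms)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def search(query):
    """Hypothetical tool used only for this illustration."""
    return {"hits": len(query)}


result, elapsed_ms = timed_call(search, "ai evals")

# Inside a session, the measurement could then be recorded (assumed usage):
# bb.record_tool_call("search", {"query": "ai evals"}, result, elapsed_ms)
```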

Manual recording

with BlackBox.session("custom") as bb:
    bb.record_llm_call(
        model="gpt-4o",
        input_text="Summarize this",
        output_text="Here is a summary...",
        input_tokens=150,
        output_tokens=80,
        duration_ms=400.0,
    )

OpenAI Agents SDK (auto-instrument)

from agentblackbox.integrations import patch_openai_agents
patch_openai_agents()  # All agents recorded automatically

Web Dashboard

agentblackbox dashboard
# → http://localhost:8765

The dashboard has three views:

  • Sessions: all runs with status, cost, and duration; auto-refreshes every 30s
  • Timeline: step-by-step replay with expandable LLM inputs/outputs
  • Analytics: daily cost trends, per-agent breakdown, model distribution

CLI

agentblackbox sessions                    # list all sessions
agentblackbox replay <session_id>         # console replay
agentblackbox export <session_id>         # JSON export
agentblackbox dashboard --port 8765       # web UI

Cost tracking

Costs are calculated automatically for 20+ models, including:

| Provider | Models |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, o1, o3-mini |
| Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-haiku, claude-3-5-haiku |
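
Token-based cost is typically input tokens times the input price plus output tokens times the output price. The sketch below shows that arithmetic under stated assumptions: the model name and prices are placeholders rather than real rates, and estimate_cost is a hypothetical helper, not the library's own pricing code.

```python
from decimal import Decimal

# Placeholder prices in USD per 1 million tokens: (input, output).
# These are NOT real prices; substitute your provider's current rates.
PRICES_PER_MILLION = {
    "example-model": (Decimal("2.00"), Decimal("8.00")),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in USD for a single LLM call."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / Decimal(1_000_000)


# Same token counts as the manual recording example: 150 in, 80 out.
cost = estimate_cost("example-model", input_tokens=150, output_tokens=80)
print(f"${cost:.6f}")  # prints $0.000940
```

Decimal is used instead of float so that many small per-call costs can be summed without accumulating rounding error.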

License

MIT © 2026 Takumu Hata
