AgentTracer
Framework-agnostic observability and debugging for AI agents.
See every step, tool call, model call, token usage, cost, and decision — in a clean local dashboard and structured logs.
Quickstart • Features • SDK API • Dashboard • Integrations • Examples
Why AgentTracer?
AI agents are black boxes. When an agent fails, you're left staring at logs wondering: What model was called? What did it return? Which tool errored? How much did that run cost? Why did it choose branch A over branch B?
AgentTracer gives you full visibility into every agent run — regardless of framework — in just a few lines of code.
import agenttracer as at

tracer = at.AgentTracer()

with tracer.trace("my-agent", input_data="What's the weather?") as t:
    tracer.log_model_call(t, model="gpt-4o", input_data="...", output_data="...")
    tracer.log_tool_call(t, tool_name="weather", tool_args={"city": "SF"}, tool_result="72°F")
    tracer.log_decision(t, "route", options=["search", "answer"], chosen="answer")

at.dashboard()  # Open http://localhost:8484
Features
- Framework-agnostic — works with OpenAI, Anthropic, LangChain, LangGraph, AutoGen, or plain Python
- Simple SDK — `start_trace()`, `log_step()`, `log_tool_call()`, `log_model_call()`, `log_decision()`, `end_trace()`
- Fork/merge graph view — visualize parallel subagent branches with `log_fork()`, `log_subagent()`, `log_merge()` and an interactive DAG view in the dashboard
- Context managers — `with tracer.trace(...)` for automatic lifecycle management
- Token & cost estimation — auto-estimates tokens and costs for 30+ models (GPT-4o, Claude, Gemini, Llama, etc.)
- Local dashboard — dark-themed web UI with timeline view, graph view, nested spans, filters, and diff viewer
- Dual storage — JSONL (append-only, human-readable) + SQLite (indexed queries, fast filtering)
- Export — download traces as JSON or Markdown reports
- Auto-patching — wrap OpenAI/Anthropic clients to trace all calls automatically
- CI/CD ready — GitHub Actions workflows for testing and auto-publish to PyPI on release
- Minimal dependencies — just Flask for the dashboard; everything else is stdlib
Quickstart
Install
pip install agenttracer-ai
Or install from source:
git clone https://github.com/CrazeXD/agenttracer.git
cd agenttracer
pip install -e .
The PyPI package name is `agenttracer-ai`, but the import stays `import agenttracer`.
Run the demo
Generate sample traces and launch the dashboard:
python examples/demo_app.py
Then open http://localhost:8484 in your browser.
Minimal example
import agenttracer as at

tracer = at.AgentTracer(storage_dir="./traces")

# Start a trace
trace = tracer.start_trace("my-agent", input_data="Hello", tags=["demo"])

# Log a model call (tokens and cost auto-estimated)
tracer.log_model_call(
    trace,
    model="gpt-4o",
    input_data=[{"role": "user", "content": "Hello"}],
    output_data="Hi! How can I help?",
)

# Log a tool call
tracer.log_tool_call(
    trace,
    tool_name="search",
    tool_args={"query": "latest news"},
    tool_result={"results": ["..."]},
)

# Log a branching decision
tracer.log_decision(
    trace,
    name="response_strategy",
    options=["concise", "detailed", "follow_up"],
    chosen="concise",
    reasoning="User asked a simple question",
)

# End the trace
tracer.end_trace(trace, output_data="Here's what I found...")

# Launch dashboard
at.dashboard()
SDK API
Core Functions
| Function | Description |
|---|---|
| `AgentTracer(storage_dir=...)` | Create a tracer instance |
| `start_trace(agent_name, input_data, tags, metadata)` | Begin a new trace |
| `end_trace(trace, status, output_data, error)` | End a trace and persist it |
| `log_step(trace, name, input_data, output_data, parent)` | Log a generic step |
| `log_model_call(trace, model, input_data, output_data, token_usage, ...)` | Log an LLM call |
| `log_tool_call(trace, tool_name, tool_args, tool_result, error)` | Log a tool call |
| `log_decision(trace, name, options, chosen, reasoning)` | Log a branching decision |
| `end_span(span, status, output_data, error)` | Explicitly end a span |
Fork / Merge (Multi-Agent Orchestration)
Trace parallel subagent branches that fork and merge back:
# Fork into parallel branches
fork = tracer.log_fork(trace, "parallel_research", branches=["search", "analysis", "writing"])
# Log each subagent branch (linked by fork_span)
search_sub = tracer.log_subagent(trace, "search-agent", fork_span=fork, input_data="...")
tracer.log_model_call(trace, model="gpt-4o-mini", ..., parent=search_sub)
tracer.end_span(search_sub, output_data="search results")
analysis_sub = tracer.log_subagent(trace, "analysis-agent", fork_span=fork, input_data="...")
tracer.log_tool_call(trace, "code_interpreter", ..., parent=analysis_sub)
tracer.end_span(analysis_sub, output_data="analysis results")
# Merge branches back together
merge = tracer.log_merge(
    trace, "combine_results",
    fork_span=fork,
    source_spans=[search_sub, analysis_sub],
    output_data="merged output",
)
| Function | Description |
|---|---|
| `log_fork(trace, name, branches)` | Log a fork point where execution splits into parallel branches |
| `log_subagent(trace, subagent_name, fork_span, input_data)` | Log a subagent branch — returns a span to use as the parent for all work in that branch |
| `log_merge(trace, name, fork_span, source_spans, output_data)` | Log a merge point where parallel branches rejoin |
The dashboard automatically shows a Graph tab with an interactive DAG visualization when a trace contains fork/merge data.
See orchestrator_example.py for a complete runnable example.
Context Managers
# Automatic trace lifecycle
with tracer.trace("my-agent", input_data="...") as t:
    # ... all spans auto-closed on exit
    pass  # trace auto-ends with SUCCESS

# Automatic span lifecycle
with tracer.span(trace, "processing") as s:
    result = process(data)
    s.output_data = result
Token Usage
Provide exact token counts or let AgentTracer estimate:
# Auto-estimated (~4 chars/token)
tracer.log_model_call(trace, model="gpt-4o", input_data="...", output_data="...")
# Exact counts from API response
tracer.log_model_call(
    trace,
    model="gpt-4o",
    token_usage={"prompt_tokens": 150, "completion_tokens": 89, "total_tokens": 239},
)
Nested Spans
parent = tracer.log_step(trace, "research")
tracer.log_model_call(trace, model="gpt-4o", ..., parent=parent)
tracer.log_tool_call(trace, "search", ..., parent=parent)
tracer.end_span(parent)
Export
import agenttracer as at
# Export as JSON
json_str = at.export_json(trace)
at.export_json(trace, path="trace.json") # to file
# Export as Markdown report
md = at.export_markdown(trace)
at.export_markdown(trace, path="report.md") # to file
CLI
# Launch dashboard
agenttracer dashboard --port 8484
# List recent traces
agenttracer list --agent research-bot --limit 20
# Export a trace
agenttracer export <trace-id> --format json -o trace.json
agenttracer export <trace-id> --format markdown -o report.md
Dashboard
The local web dashboard provides:
- Trace list — see all agent runs with status, duration, tokens, cost, and error counts
- Filters — filter by agent name, status, errors, and tags
- Timeline view — nested span tree showing the execution flow
- Graph view — interactive DAG visualization of fork/merge/subagent branches with color-coded nodes and curved edges (auto-appears when trace has fork/merge data)
- Expandable spans — click any span to see model, tokens, cost, input/output, tool args/results
- Input/Output diff — side-by-side view of trace input and output
- Export buttons — download any trace as JSON or Markdown
- Auto-refresh — updates every 10 seconds
- Aggregate stats — total traces, tokens, cost, and agent count in the header
Launch it:
import agenttracer as at
at.dashboard(port=8484)
Or via CLI:
agenttracer dashboard
Integrations
OpenAI
Auto-patch an OpenAI client to trace all chat.completions.create calls:
import openai
from agenttracer.integrations.openai_wrapper import patch_openai

client = patch_openai(openai.OpenAI(), tracer, trace)
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# ^ automatically logged with tokens, cost, and any tool calls
Anthropic
import anthropic
from agenttracer.integrations.anthropic_wrapper import patch_anthropic

client = patch_anthropic(anthropic.Anthropic(), tracer, trace)
response = client.messages.create(model="claude-sonnet-4-20250514", messages=[...])
LangChain / LangGraph
Use the callback handler:
from agenttracer.integrations.langchain_callback import AgentTracerCallback
callback = AgentTracerCallback(tracer, trace)
chain.invoke(input, config={"callbacks": [callback]})
# Also works with LangGraph
graph.invoke(input, config={"callbacks": [callback]})
AutoGen
from agenttracer.integrations.autogen_wrapper import trace_autogen_chat
result = trace_autogen_chat(
    tracer, trace,
    initiator=user_proxy,
    recipient=assistant,
    message="Solve this problem...",
)
Plain Python
No framework? No problem. Use the SDK directly:
tracer = at.AgentTracer()
trace = tracer.start_trace("my-script")
tracer.log_step(trace, "step-1", input_data="...")
tracer.log_model_call(trace, model="gpt-4o", ...)
tracer.end_trace(trace)
Cost Estimation
AgentTracer includes pricing data for 30+ models:
| Provider | Models |
|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5, o1, o3, o4-mini |
| Anthropic | Claude 4 Opus/Sonnet, Claude 3.5 Sonnet/Haiku, Claude 3 Opus |
| Google | Gemini 2.5 Pro/Flash, Gemini 2.0 Flash |
| Meta | Llama 3.1 70B/8B |
| Mistral | Mixtral 8x7B |
| DeepSeek | DeepSeek V3, DeepSeek R1 |
Costs are estimated per-call and aggregated per-trace. Update pricing in src/agenttracer/pricing.py.
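The estimation logic can be pictured as a lookup of per-million-token prices combined with the ~4 chars/token heuristic mentioned above. A minimal sketch — the prices below are illustrative placeholders, not the actual tables shipped in `pricing.py`:

```python
# Hypothetical sketch of per-call cost estimation. Real logic lives in
# src/agenttracer/pricing.py; these example prices are assumptions.
PRICES_PER_1M = {
    # model: (input USD / 1M tokens, output USD / 1M tokens)
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}

def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token."""
    return max(1, len(text) // 4)

def estimate_cost(model: str, prompt: str, completion: str) -> float:
    """Estimate USD cost of one call from prompt/completion text."""
    in_price, out_price = PRICES_PER_1M[model]
    in_tokens = estimate_tokens(prompt)
    out_tokens = estimate_tokens(completion)
    return in_tokens / 1_000_000 * in_price + out_tokens / 1_000_000 * out_price
```

When exact `token_usage` is passed to `log_model_call`, those counts would replace the character heuristic; only the price lookup remains an estimate.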
Storage
Traces are stored in both formats simultaneously:
- JSONL (`traces.jsonl`) — append-only, human-readable, grep-friendly
- SQLite (`traces.db`) — indexed columns for fast filtering by agent, status, duration, cost, tags
Default storage dir: ./agenttracer_data/
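Because the JSONL file is one JSON object per line, it is easy to post-process outside the dashboard. A minimal sketch, assuming each record carries `status` and `agent_name` fields (the exact schema is defined in `models.py` and may differ):

```python
import json
from pathlib import Path

def failed_agents(path: str = "./agenttracer_data/traces.jsonl") -> list:
    """Scan a traces.jsonl file and return agent names of failed runs."""
    failures = []
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("status") == "error":
            failures.append(record.get("agent_name", "<unknown>"))
    return failures
```

For anything heavier than a linear scan, the SQLite backend is the better target, since its columns are indexed.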
Project Structure
agenttracer/
├── src/agenttracer/
│ ├── __init__.py # Public API
│ ├── tracer.py # Core tracer SDK
│ ├── models.py # Data models (Trace, Span, TokenUsage, etc.)
│ ├── pricing.py # Token counting & cost estimation
│ ├── __main__.py # CLI entry point
│ ├── storage/
│ │ ├── jsonl.py # JSONL storage backend
│ │ └── sqlite.py # SQLite storage backend
│ ├── exporters/
│ │ ├── json_export.py # JSON export
│ │ └── markdown_export.py # Markdown report export
│ ├── integrations/
│ │ ├── openai_wrapper.py # OpenAI auto-patching
│ │ ├── anthropic_wrapper.py # Anthropic auto-patching
│ │ ├── langchain_callback.py # LangChain/LangGraph callback
│ │ └── autogen_wrapper.py # AutoGen integration
│ └── dashboard/
│ ├── app.py # Flask dashboard app
│ ├── templates/ # HTML templates
│ └── static/ # CSS, JS & graph.js (SVG DAG renderer)
├── examples/
│ ├── basic_agent.py # Simple traced agent
│ ├── multi_step_agent.py # Complex agent with nesting
│ ├── orchestrator_example.py # Multi-agent fork/merge pattern
│ ├── context_manager_demo.py # Context manager API
│ ├── openai_example.py # OpenAI integration
│ ├── langchain_example.py # LangChain integration
│ └── demo_app.py # Generate sample data + dashboard
├── .github/workflows/
│ ├── ci.yml # CI: tests, lint, build check on push/PR
│ └── publish.yml # Auto-publish to PyPI on GitHub release
├── tests/
├── pyproject.toml
├── LICENSE (MIT)
└── README.md
Examples
| Example | Description |
|---|---|
| `basic_agent.py` | Simple agent with model calls, tool calls, and decisions |
| `multi_step_agent.py` | Multi-step research agent with nested spans and error handling |
| `orchestrator_example.py` | Multi-agent fork/merge — orchestrator splits into parallel subagents and merges results |
| `context_manager_demo.py` | Concise API using `with` statements |
| `openai_example.py` | Auto-trace OpenAI API calls |
| `langchain_example.py` | LangChain/LangGraph callback handler |
| `demo_app.py` | Generate 25 sample traces and launch the dashboard |
Contributing
Contributions are welcome. Please open an issue or PR.
# Development setup
git clone https://github.com/CrazeXD/agenttracer.git
cd agenttracer
pip install -e ".[dev]"
python -m pytest tests/ -v
License
MIT — see LICENSE.