AgentTracer Dashboard

AgentTracer

Framework-agnostic observability and debugging for AI agents.
See every step, tool call, model call, token usage, cost, and decision — in a clean local dashboard and structured logs.

Quickstart · Features · SDK API · Dashboard · Integrations · Examples


Why AgentTracer?

AI agents are black boxes. When an agent fails, you're left staring at logs wondering: What model was called? What did it return? Which tool errored? How much did that run cost? Why did it choose branch A over branch B?

AgentTracer gives you full visibility into every agent run — regardless of framework — in just a few lines of code.

import agenttracer as at

tracer = at.AgentTracer()

with tracer.trace("my-agent", input_data="What's the weather?") as t:
    tracer.log_model_call(t, model="gpt-4o", input_data="...", output_data="...")
    tracer.log_tool_call(t, tool_name="weather", tool_args={"city": "SF"}, tool_result="72°F")
    tracer.log_decision(t, "route", options=["search", "answer"], chosen="answer")

at.dashboard()  # Open http://localhost:8484

Features

  • Framework-agnostic — works with OpenAI, Anthropic, LangChain, LangGraph, AutoGen, or plain Python
  • Simple SDK — start_trace(), log_step(), log_tool_call(), log_model_call(), log_decision(), end_trace()
  • Fork/merge graph view — visualize parallel subagent branches with log_fork(), log_subagent(), log_merge() and an interactive DAG view in the dashboard
  • Context managers — with tracer.trace(...) for automatic lifecycle management
  • Token & cost estimation — auto-estimates tokens and costs for 30+ models (GPT-4o, Claude, Gemini, Llama, etc.)
  • Local dashboard — dark-themed web UI with timeline view, graph view, nested spans, filters, and diff viewer
  • Dual storage — JSONL (append-only, human-readable) + SQLite (indexed queries, fast filtering)
  • Export — download traces as JSON or Markdown reports
  • Auto-patching — wrap OpenAI/Anthropic clients to trace all calls automatically
  • CI/CD ready — GitHub Actions workflows for testing and auto-publish to PyPI on release
  • Minimal dependencies — just Flask for the dashboard; everything else is stdlib

Quickstart

Install

pip install agenttracer-ai

Or install from source:

git clone https://github.com/CrazeXD/agenttracer.git
cd agenttracer
pip install -e .

The PyPI package name is agenttracer-ai, but the import stays import agenttracer.

Run the demo

Generate sample traces and launch the dashboard:

python examples/demo_app.py

Then open http://localhost:8484 in your browser.

Minimal example

import agenttracer as at

tracer = at.AgentTracer(storage_dir="./traces")

# Start a trace
trace = tracer.start_trace("my-agent", input_data="Hello", tags=["demo"])

# Log a model call (tokens and cost auto-estimated)
tracer.log_model_call(
    trace,
    model="gpt-4o",
    input_data=[{"role": "user", "content": "Hello"}],
    output_data="Hi! How can I help?",
)

# Log a tool call
tracer.log_tool_call(
    trace,
    tool_name="search",
    tool_args={"query": "latest news"},
    tool_result={"results": ["..."]},
)

# Log a branching decision
tracer.log_decision(
    trace,
    name="response_strategy",
    options=["concise", "detailed", "follow_up"],
    chosen="concise",
    reasoning="User asked a simple question",
)

# End the trace
tracer.end_trace(trace, output_data="Here's what I found...")

# Launch dashboard
at.dashboard()

SDK API

Core Functions

  • AgentTracer(storage_dir=...) — Create a tracer instance
  • start_trace(agent_name, input_data, tags, metadata) — Begin a new trace
  • end_trace(trace, status, output_data, error) — End a trace and persist it
  • log_step(trace, name, input_data, output_data, parent) — Log a generic step
  • log_model_call(trace, model, input_data, output_data, token_usage, ...) — Log an LLM call
  • log_tool_call(trace, tool_name, tool_args, tool_result, error) — Log a tool call
  • log_decision(trace, name, options, chosen, reasoning) — Log a branching decision
  • end_span(span, status, output_data, error) — Explicitly end a span

Fork / Merge (Multi-Agent Orchestration)

Trace parallel subagent branches that fork and merge back:

# Fork into parallel branches
fork = tracer.log_fork(trace, "parallel_research", branches=["search", "analysis", "writing"])

# Log each subagent branch (linked by fork_span)
search_sub = tracer.log_subagent(trace, "search-agent", fork_span=fork, input_data="...")
tracer.log_model_call(trace, model="gpt-4o-mini", ..., parent=search_sub)
tracer.end_span(search_sub, output_data="search results")

analysis_sub = tracer.log_subagent(trace, "analysis-agent", fork_span=fork, input_data="...")
tracer.log_tool_call(trace, "code_interpreter", ..., parent=analysis_sub)
tracer.end_span(analysis_sub, output_data="analysis results")

# Merge branches back together
merge = tracer.log_merge(
    trace, "combine_results",
    fork_span=fork,
    source_spans=[search_sub, analysis_sub],
    output_data="merged output",
)

  • log_fork(trace, name, branches) — Log a fork point where execution splits into parallel branches
  • log_subagent(trace, subagent_name, fork_span, input_data) — Log a subagent branch; returns a span to use as the parent for all work in that branch
  • log_merge(trace, name, fork_span, source_spans, output_data) — Log a merge point where parallel branches rejoin

The dashboard automatically shows a Graph tab with an interactive DAG visualization when a trace contains fork/merge data.

AgentTracer Graph View — fork/merge DAG

See orchestrator_example.py for a complete runnable example.
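
The fork/merge records above map naturally onto DAG edges: fork to each subagent, each source span to the merge. A minimal sketch of that mapping, using hypothetical span dicts whose field names mirror the API parameters (not AgentTracer's actual storage schema):

```python
# Hypothetical span records, shaped after the fork/subagent/merge calls above
spans = [
    {"id": "fork1", "type": "fork", "branches": ["search", "analysis"]},
    {"id": "s1", "type": "subagent", "fork_span": "fork1", "name": "search-agent"},
    {"id": "s2", "type": "subagent", "fork_span": "fork1", "name": "analysis-agent"},
    {"id": "m1", "type": "merge", "fork_span": "fork1", "source_spans": ["s1", "s2"]},
]

# Build DAG edges: fork -> each subagent, each source span -> merge
edges = []
for s in spans:
    if s["type"] == "subagent":
        edges.append((s["fork_span"], s["id"]))
    elif s["type"] == "merge":
        edges.extend((src, s["id"]) for src in s["source_spans"])

print(edges)  # [('fork1', 's1'), ('fork1', 's2'), ('s1', 'm1'), ('s2', 'm1')]
```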

Context Managers

# Automatic trace lifecycle
with tracer.trace("my-agent", input_data="...") as t:
    # ... all spans auto-closed on exit
    pass  # trace auto-ends with SUCCESS

# Automatic span lifecycle
with tracer.span(trace, "processing") as s:
    result = process(data)
    s.output_data = result
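
Lifecycle management like this is typically built on contextlib. Here is a minimal sketch of the idea (not AgentTracer's actual implementation) that ends the trace with an error status even when the body raises:

```python
import time
from contextlib import contextmanager

@contextmanager
def trace(agent_name: str):
    """Record start/end time and status; always end the trace, even on error."""
    record = {"agent": agent_name, "status": "running", "start": time.time()}
    try:
        yield record
        record["status"] = "success"
    except Exception:
        record["status"] = "error"
        raise  # re-raise so the caller still sees the failure
    finally:
        record["end"] = time.time()

with trace("my-agent") as t:
    t["output"] = "done"

print(t["status"])  # success
```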

Token Usage

Provide exact token counts or let AgentTracer estimate:

# Auto-estimated (~4 chars/token)
tracer.log_model_call(trace, model="gpt-4o", input_data="...", output_data="...")

# Exact counts from API response
tracer.log_model_call(
    trace,
    model="gpt-4o",
    token_usage={"prompt_tokens": 150, "completion_tokens": 89, "total_tokens": 239},
)
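
The auto-estimation heuristic (~4 characters per token) is simple enough to sketch. This is an illustrative approximation, not AgentTracer's exact code:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the common ~4 characters/token heuristic."""
    return max(1, len(text) // 4)

# A 400-character prompt estimates to about 100 tokens
print(estimate_tokens("x" * 400))  # 100
```

Exact counts from the provider's API response are always preferable when available; the heuristic only fills the gap when they are not.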

Nested Spans

parent = tracer.log_step(trace, "research")
tracer.log_model_call(trace, model="gpt-4o", ..., parent=parent)
tracer.log_tool_call(trace, "search", ..., parent=parent)
tracer.end_span(parent)

Export

import agenttracer as at

# Export as JSON
json_str = at.export_json(trace)
at.export_json(trace, path="trace.json")  # to file

# Export as Markdown report
md = at.export_markdown(trace)
at.export_markdown(trace, path="report.md")  # to file
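
To illustrate what a Markdown exporter does, here is a toy sketch that renders a trace dict into a small report. The field names are assumptions for the example, not AgentTracer's real schema:

```python
def trace_to_markdown(trace: dict) -> str:
    """Render a trace dict as a small Markdown report (illustrative schema)."""
    lines = [f"# Trace: {trace['agent_name']}", "", "| Span | Type |", "| --- | --- |"]
    for span in trace.get("spans", []):
        lines.append(f"| {span['name']} | {span['type']} |")
    return "\n".join(lines)

report = trace_to_markdown({
    "agent_name": "my-agent",
    "spans": [{"name": "search", "type": "tool_call"}],
})
print(report.splitlines()[0])  # # Trace: my-agent
```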

CLI

# Launch dashboard
agenttracer dashboard --port 8484

# List recent traces
agenttracer list --agent research-bot --limit 20

# Export a trace
agenttracer export <trace-id> --format json -o trace.json
agenttracer export <trace-id> --format markdown -o report.md

Dashboard

AgentTracer Detail View

The local web dashboard provides:

  • Trace list — see all agent runs with status, duration, tokens, cost, and error counts
  • Filters — filter by agent name, status, errors, and tags
  • Timeline view — nested span tree showing the execution flow
  • Graph view — interactive DAG visualization of fork/merge/subagent branches with color-coded nodes and curved edges (auto-appears when trace has fork/merge data)
  • Expandable spans — click any span to see model, tokens, cost, input/output, tool args/results
  • Input/Output diff — side-by-side view of trace input and output
  • Export buttons — download any trace as JSON or Markdown
  • Auto-refresh — updates every 10 seconds
  • Aggregate stats — total traces, tokens, cost, and agent count in the header

Launch it:

import agenttracer as at
at.dashboard(port=8484)

Or via CLI:

agenttracer dashboard

Integrations

OpenAI

Auto-patch an OpenAI client to trace all chat.completions.create calls:

from agenttracer.integrations.openai_wrapper import patch_openai

client = patch_openai(openai.OpenAI(), tracer, trace)
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# ^ automatically logged with tokens, cost, and any tool calls
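
The auto-patching technique is ordinary method wrapping: replace the client's method with a closure that logs the call and then delegates to the original. A generic, self-contained sketch using a dummy client in place of the real SDK:

```python
calls = []  # stands in for the tracer's log

class DummyClient:
    """Stand-in for a real SDK client."""
    def create(self, model, messages):
        return {"model": model, "reply": "ok"}

def patch(client, log):
    original = client.create
    def traced_create(model, messages):
        # log before delegating to the original implementation
        log.append({"model": model, "n_messages": len(messages)})
        return original(model=model, messages=messages)
    client.create = traced_create  # shadow the method on this instance
    return client

client = patch(DummyClient(), calls)
resp = client.create(model="gpt-4o", messages=[{"role": "user", "content": "hi"}])
print(calls)  # [{'model': 'gpt-4o', 'n_messages': 1}]
```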

Anthropic

from agenttracer.integrations.anthropic_wrapper import patch_anthropic

client = patch_anthropic(anthropic.Anthropic(), tracer, trace)
response = client.messages.create(model="claude-sonnet-4-20250514", messages=[...])

LangChain / LangGraph

Use the callback handler:

from agenttracer.integrations.langchain_callback import AgentTracerCallback

callback = AgentTracerCallback(tracer, trace)
chain.invoke(input, config={"callbacks": [callback]})

# Also works with LangGraph
graph.invoke(input, config={"callbacks": [callback]})

AutoGen

from agenttracer.integrations.autogen_wrapper import trace_autogen_chat

result = trace_autogen_chat(
    tracer, trace,
    initiator=user_proxy,
    recipient=assistant,
    message="Solve this problem...",
)

Plain Python

No framework? No problem. Use the SDK directly:

tracer = at.AgentTracer()
trace = tracer.start_trace("my-script")
tracer.log_step(trace, "step-1", input_data="...")
tracer.log_model_call(trace, model="gpt-4o", ...)
tracer.end_trace(trace)

Cost Estimation

AgentTracer includes pricing data for 30+ models:

  • OpenAI — GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5, o1, o3, o4-mini
  • Anthropic — Claude 4 Opus/Sonnet, Claude 3.5 Sonnet/Haiku, Claude 3 Opus
  • Google — Gemini 2.5 Pro/Flash, Gemini 2.0 Flash
  • Meta — Llama 3.1 70B/8B
  • Mistral — Mixtral 8x7B
  • DeepSeek — DeepSeek V3, DeepSeek R1

Costs are estimated per-call and aggregated per-trace. Update pricing in src/agenttracer/pricing.py.
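
Per-call cost estimation is straightforward arithmetic over per-token prices. An illustrative sketch with a hand-written pricing table (the prices here are examples; the real table lives in src/agenttracer/pricing.py):

```python
# Example prices in USD per 1M tokens (illustrative, not authoritative)
PRICING = {"gpt-4o": {"input": 2.50, "output": 10.00}}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost = prompt tokens at input price + completion tokens at output price."""
    p = PRICING[model]
    return (prompt_tokens * p["input"] + completion_tokens * p["output"]) / 1_000_000

cost = estimate_cost("gpt-4o", prompt_tokens=150, completion_tokens=89)
print(round(cost, 6))  # 0.001265
```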

Storage

Traces are stored in both formats simultaneously:

  • JSONL (traces.jsonl) — append-only, human-readable, grep-friendly
  • SQLite (traces.db) — indexed columns for fast filtering by agent, status, duration, cost, tags

Default storage dir: ./agenttracer_data/
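
The dual-storage idea can be sketched with the stdlib alone: append the raw record to a JSONL file, then mirror key columns into SQLite for indexed queries. The schema below is illustrative, not AgentTracer's actual one:

```python
import json
import sqlite3

trace = {"trace_id": "t1", "agent": "my-agent", "status": "success", "cost": 0.0012}

# JSONL: append-only, human-readable, grep-friendly
with open("traces.jsonl", "a") as f:
    f.write(json.dumps(trace) + "\n")

# SQLite: indexed columns for fast filtering
db = sqlite3.connect("traces.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS traces "
    "(trace_id TEXT PRIMARY KEY, agent TEXT, status TEXT, cost REAL)"
)
db.execute(
    "INSERT OR REPLACE INTO traces VALUES (?, ?, ?, ?)",
    (trace["trace_id"], trace["agent"], trace["status"], trace["cost"]),
)
db.commit()

rows = db.execute("SELECT agent FROM traces WHERE status = 'success'").fetchall()
print(rows)  # [('my-agent',)]
```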

Project Structure

agenttracer/
├── src/agenttracer/
│   ├── __init__.py          # Public API
│   ├── tracer.py            # Core tracer SDK
│   ├── models.py            # Data models (Trace, Span, TokenUsage, etc.)
│   ├── pricing.py           # Token counting & cost estimation
│   ├── __main__.py          # CLI entry point
│   ├── storage/
│   │   ├── jsonl.py         # JSONL storage backend
│   │   └── sqlite.py        # SQLite storage backend
│   ├── exporters/
│   │   ├── json_export.py   # JSON export
│   │   └── markdown_export.py # Markdown report export
│   ├── integrations/
│   │   ├── openai_wrapper.py    # OpenAI auto-patching
│   │   ├── anthropic_wrapper.py # Anthropic auto-patching
│   │   ├── langchain_callback.py # LangChain/LangGraph callback
│   │   └── autogen_wrapper.py   # AutoGen integration
│   └── dashboard/
│       ├── app.py           # Flask dashboard app
│       ├── templates/       # HTML templates
│       └── static/          # CSS, JS & graph.js (SVG DAG renderer)
├── examples/
│   ├── basic_agent.py       # Simple traced agent
│   ├── multi_step_agent.py  # Complex agent with nesting
│   ├── orchestrator_example.py # Multi-agent fork/merge pattern
│   ├── context_manager_demo.py # Context manager API
│   ├── openai_example.py    # OpenAI integration
│   ├── langchain_example.py # LangChain integration
│   └── demo_app.py          # Generate sample data + dashboard
├── .github/workflows/
│   ├── ci.yml               # CI: tests, lint, build check on push/PR
│   └── publish.yml          # Auto-publish to PyPI on GitHub release
├── tests/
├── pyproject.toml
├── LICENSE (MIT)
└── README.md

Examples

  • basic_agent.py — Simple agent with model calls, tool calls, and decisions
  • multi_step_agent.py — Multi-step research agent with nested spans and error handling
  • orchestrator_example.py — Multi-agent fork/merge: an orchestrator splits into parallel subagents and merges results
  • context_manager_demo.py — Concise API using with statements
  • openai_example.py — Auto-trace OpenAI API calls
  • langchain_example.py — LangChain/LangGraph callback handler
  • demo_app.py — Generate 25 sample traces and launch the dashboard

Contributing

Contributions are welcome. Please open an issue or PR.

# Development setup
git clone https://github.com/CrazeXD/agenttracer.git
cd agenttracer
pip install -e ".[dev]"
python -m pytest tests/ -v

License

MIT — see LICENSE.
