Open-source runtime tracing and diagnostics for AI agent execution flows.
AgentTrace
AgentTrace helps you understand what your agent actually did at runtime — not just whether the final answer looks good.
It is built for people who want to answer questions like:
- Why did the agent call this tool twice?
- Where did the latency actually come from?
- Which fallback path was triggered?
- What did the LLM see before it made this decision?
- Was the execution flow correct, redundant, or suspicious?
If you want something closer to pprof + tracing + agent diagnostics, AgentTrace is designed for that.
Why AgentTrace
Most agent tooling focuses on one of two things:
- output evaluation — “was the answer good?”
- framework abstraction — “how do I build the agent?”
AgentTrace focuses on a different question:
What exactly happened during execution, and why did the agent behave that way?
That makes it especially useful for:
- debugging execution flow
- diagnosing redundancy and fallback behavior
- inspecting LLM prompts / responses in context
- understanding tool usage patterns
- tracing runtime state across a run
Core capabilities
- Trace LLM / Tool / Skill execution flows
- Capture parallel, retry, fallback, and repeated-call patterns
- Record Prompt / Response / Context / Plan / Execution snapshots
- Persist runs locally and inspect them in a built-in dashboard
- Review runs with an LLM after execution
- Generate structured diagnostics: critical path, recovery chains, redundant calls, suspicious decisions
What you get
Execution tracing
AgentTrace records a runtime trace for each run, including:
- span type
- start / end time
- latency
- status
- input parameters
- grouping and parent-child relationships
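To make the field list above concrete, here is a minimal sketch of what one span record could look like. The `Span` class, its field names, and the `children_of` helper are assumptions made for illustration; they are not AgentTrace's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Span:
    """Hypothetical span record; names are illustrative, not AgentTrace's schema."""
    span_id: str
    span_type: str                      # e.g. "llm", "tool", "skill"
    name: str
    start: float                        # seconds since the run started
    end: float
    status: str = "ok"                  # "ok" or "error"
    inputs: dict[str, Any] = field(default_factory=dict)
    parent_id: Optional[str] = None     # None marks a root span

    @property
    def latency(self) -> float:
        # Latency is derived from the timestamps rather than stored separately.
        return self.end - self.start


def children_of(spans: list[Span], parent_id: Optional[str]) -> list[Span]:
    """Return the direct children of a span, ordered by start time."""
    return sorted((s for s in spans if s.parent_id == parent_id), key=lambda s: s.start)


spans = [
    Span("1", "llm", "plan", 0.0, 0.8),
    Span("2", "tool", "get_weather", 0.8, 1.3, inputs={"city": "Beijing"}, parent_id="1"),
    Span("3", "tool", "calculate", 0.8, 0.9, inputs={"expr": "1+2"}, parent_id="1"),
]

for child in children_of(spans, "1"):
    print(child.name, child.span_type, child.status)
```

Two children that share a start time, as in this example, are the kind of pattern a parallel-lane view would place side by side.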
Structured state snapshots
For LLM spans, AgentTrace can capture:
- ContextSnapshot
- MemorySnapshot
- PlanSnapshot
- DecisionSnapshot
- ResumeSnapshot
- ExecutionSnapshot
Diagnostics
AgentTrace builds a diagnostics layer on top of the raw trace:
- critical path
- failed tool calls
- recovery chains
- redundant tool clusters
- suspicious decisions
- filtered review findings
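As an illustration of one diagnostic from the list above, the sketch below finds redundant tool clusters by grouping calls that share a tool name and identical inputs. The `redundant_clusters` function and the call dictionaries are hypothetical; AgentTrace's own detection logic may use different criteria.

```python
import json
from collections import defaultdict


def redundant_clusters(calls):
    """Group tool calls by (name, inputs); groups with more than one call are flagged."""
    groups = defaultdict(list)
    for index, call in enumerate(calls):
        # Serialize inputs with sorted keys so argument order does not matter.
        key = (call["name"], json.dumps(call["inputs"], sort_keys=True))
        groups[key].append(index)
    return [
        {"name": name, "indices": indices}
        for (name, _), indices in groups.items()
        if len(indices) > 1
    ]


calls = [
    {"name": "get_weather", "inputs": {"city": "Beijing"}},
    {"name": "calculate", "inputs": {"expr": "1+2"}},
    {"name": "get_weather", "inputs": {"city": "Beijing"}},
    {"name": "get_weather", "inputs": {"city": "Xi'an"}},
]

print(redundant_clusters(calls))
```

Exact-match grouping is deliberately conservative: the Xi'an query is not flagged because its inputs differ from the Beijing ones.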
LLM review
After each run, AgentTrace can ask an LLM to review the recorded execution flow and flag:
- redundant tool calls
- wrong tool choices
- suspicious fallback behavior
- unnecessary skill execution
- likely execution-flow issues
Review strictness is configurable:
- review_level=1 → tolerant
- review_level=2 → balanced (default)
- review_level=3 → strict
At review_level=1 and review_level=2, the UI hides low-severity findings by default.
At review_level=3, all findings are shown.
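The visibility rule described above can be sketched as a small filter. The severity names ("low", "medium", "high") and the `visible_findings` function are assumptions for illustration; only the behavior per review level comes from the description above.

```python
# Assumed severity scale; the names used by AgentTrace may differ.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def visible_findings(findings, review_level=2):
    """Hide low-severity findings at levels 1 and 2; show everything at level 3."""
    min_rank = 0 if review_level >= 3 else 1
    return [f for f in findings if SEVERITY_RANK[f["severity"]] >= min_rank]


findings = [
    {"severity": "low", "message": "tool called twice with similar inputs"},
    {"severity": "high", "message": "fallback triggered without a prior failure"},
]

print(len(visible_findings(findings, review_level=2)))
print(len(visible_findings(findings, review_level=3)))
```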
Dashboard
AgentTrace includes a local dashboard at:
http://localhost:3500
Current UI features include:
- session list
- execution timeline
- parallel-lane view
- collapsed repeated-tool clusters
- prompt / response modal for LLM spans
- execution-state tabs
- diagnostics panel
- LLM review panel
- collapsible final agent output
Quick start
1. Install from source
git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e .
2. Patch once, trace every run
import agenttrace
from my_agent import run

# Patch the agent's modules once; every later run is then traced.
agenttrace.patch(
    "my_agent.tools",
    "my_agent.skills",
    "my_agent.llm",
    llm_modules=["my_agent.llm"],
    skill_modules=["my_agent.skills"],
    review_level=2,
)

# The prompt is Chinese for "Check the weather in Beijing and compute 1+2".
prompt = "查北京天气并计算 1+2"
output = agenttrace.session(prompt)(run)(prompt)
print(output)
print(agenttrace.last_result().summary())
3. Start the dashboard
from agenttrace.dashboard.server import start_server
start_server(port=3500)
Open:
http://localhost:3500
Demo agent
This repo includes a demo agent that intentionally exercises multiple tracing scenarios:
- bash
- read
- grep
- calculate
- get_weather
- flaky_weather
- weather_report_skill
- parallel weather queries
- fallback to stable tools
Run it:
python examples/demo_agent/main.py
Stress prompt:
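分析当前目录下的项目;bash pwd;read examples/demo_agent/tools.py;grep calculate examples/demo_agent;查北京和西安的天气,并计算1123123123+1283123;生成北京天气播报;最后总结。
(English: Analyze the project in the current directory; bash pwd; read examples/demo_agent/tools.py; grep calculate examples/demo_agent; check the weather in Beijing and Xi'an, and compute 1123123123+1283123; generate a Beijing weather broadcast; finally, summarize.)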
Integration model
AgentTrace works best for:
- custom Python agents with source code
- local development environments
- CLI / hook-based agents
- runtime debugging and diagnostics workflows
The default integration style is intentionally lightweight:
- patch modules once
- wrap runs with session(...)
- inspect results locally
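One common way to build a "patch once" integration in Python is to replace a module's public functions with timing wrappers. The sketch below shows that general technique under stated assumptions: `traced`, `patch_module`, `TRACE`, and the `demo_tools` module are hypothetical names, and nothing here is confirmed to match how AgentTrace implements `patch`.

```python
import functools
import inspect
import time
import types

TRACE = []  # span dictionaries collected during the current run


def traced(func, span_type):
    """Wrap func so that every call appends a span record to TRACE."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            TRACE.append({
                "name": func.__name__,
                "type": span_type,
                "status": status,
                "latency": time.perf_counter() - start,
            })
    return wrapper


def patch_module(module, span_type):
    """Replace each public function in the module with a traced wrapper."""
    for name, obj in list(vars(module).items()):
        if inspect.isfunction(obj) and not name.startswith("_"):
            setattr(module, name, traced(obj, span_type))


def calculate(a, b):
    return a + b


# Stand-in for a real tools module such as my_agent.tools.
tools = types.ModuleType("demo_tools")
tools.calculate = calculate

patch_module(tools, "tool")
result = tools.calculate(1, 2)
print(result, TRACE[0]["name"], TRACE[0]["status"])
```

A known limit of this approach is that code which ran `from module import func` before patching keeps its reference to the original, unwrapped function, so patching should happen before the agent's own imports bind those names.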
Project scope
AgentTrace is currently optimized as:
- a runtime tracing tool
- a local-first diagnostics tool
- a developer-facing execution inspector
It is not currently focused on being:
- a hosted eval platform
- a benchmark leaderboard
- a dataset management system
- a full SaaS observability suite
Who this is for
AgentTrace is especially useful for:
- engineers building custom agents
- teams debugging real runtime behavior
- people who need local-first execution visibility
- anyone who wants to inspect agent decisions beyond final output quality
Roadmap direction
Current direction is intentionally focused:
- stronger execution tracing
- better diagnostics and issue localization
- cleaner runtime state modeling
- broader integration patterns for source-based agents
- more production-friendly export / observability hooks
The goal is to keep AgentTrace useful as a general execution-flow listener, not to turn it into a bloated all-in-one platform too early.
Still useful for objective metrics
Although AgentTrace centers on tracing and diagnostics, it still retains objective runtime metrics such as:
- total latency
- avg / p95 step latency
- tool success rate
- token usage
- estimated cost
- step efficiency
- correctness (if expected_output is provided)
- regression tracking
- comparison helpers
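A few of the metrics listed above can be derived directly from recorded steps. The following is a minimal sketch, assuming each step is a dictionary with `type`, `status`, and `latency` keys (hypothetical names) and using the nearest-rank method for p95; AgentTrace's own definitions may differ.

```python
import math


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def summarize(steps):
    """Compute latency and tool-success metrics for a non-empty list of steps."""
    latencies = [s["latency"] for s in steps]
    tool_steps = [s for s in steps if s["type"] == "tool"]
    succeeded = sum(1 for s in tool_steps if s["status"] == "ok")
    return {
        # Summing step latencies assumes sequential steps; parallel runs need wall-clock time.
        "total_latency": sum(latencies),
        "avg_latency": sum(latencies) / len(latencies),
        "p95_latency": p95(latencies),
        "tool_success_rate": succeeded / len(tool_steps) if tool_steps else None,
    }


steps = [
    {"type": "llm", "status": "ok", "latency": 0.5},
    {"type": "tool", "status": "ok", "latency": 0.25},
    {"type": "tool", "status": "error", "latency": 1.0},
    {"type": "tool", "status": "ok", "latency": 0.25},
]

print(summarize(steps))
```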
Contributing
Contributions are welcome — especially around:
- new agent integrations
- richer diagnostics
- runtime state capture
- dashboard usability
- packaging and release polish
For local development:
git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e ".[dev]"
pytest tests/
If you want to contribute, small focused improvements are preferred over large platform-style expansions.
License
MIT