Zero-dependency execution tracer and semantic diff engine for LLM agent pipelines
Trazo
Execution tracer and semantic diff engine for LLM agent pipelines.
Know exactly why your agent did what it did, and how it changed.
The Problem
You're building an LLM pipeline. It worked yesterday. Today it's producing different answers, costing more, and you have no idea which call changed. You're staring at raw JSON logs and guessing.
Trazo fixes this.
Before Trazo:          After Trazo:
-------------          ------------
print(response)     →  trazo view abc123
grep through logs   →  trazo diff abc123 def456
re-run everything   →  trazo replay abc123 --span xyz
open Datadog ($$$)  →  trazo ui (local, free, instant)
What Trazo Does
- Traces every function call in your pipeline with zero boilerplate
- Visualizes execution as a DAG: see parent/child spans, durations, costs
- Semantically diffs two runs: detects what changed and how much
- Replays any span with its exact original inputs (time-travel debugging)
- Tracks token counts and USD cost per span, per run
- Local-first: all data stays on your machine, zero cloud dependencies
- Framework-agnostic: works with OpenAI, Anthropic, raw HTTP, LangChain, any Python
Quickstart
Install
pip install trazo-dev
# For the web UI:
pip install "trazo-dev[ui]"
Instrument in 3 lines
import ollama
import trazo as tz

tz.init()                # once, at startup
tz.instrument_ollama()   # 100% local, no API keys

@tz.trace                # on any function
def call_llm(prompt: str) -> str:
    # Use ollama, openai, anthropic, or any custom client
    return ollama.generate(model="phi3", prompt=prompt)["response"]

with tz.run("my_pipeline"):
    result = call_llm("Explain transformers in one sentence")
See what happened
trazo view # list all runs
trazo view abc123 # inspect a specific run
trazo view abc123 --spans # full span tree
trazo diff [id1] [id2] # semantic diff between runs
trazo replay abc123 # re-execute with original inputs
trazo ui # open browser DAG viewer
Core Features
@trazo.trace: Automatic instrumentation
Decorate any function, sync or async, to capture inputs, outputs, timing, and errors:
# vector_db and openai_client stand for your own, already-configured clients
@trazo.trace
def retrieve_context(query: str, top_k: int = 5) -> list[str]:
    return vector_db.search(query, k=top_k)

@trazo.trace(name="llm.generate", tags={"tier": "primary"})
async def async_generate(messages: list[dict]) -> str:
    response = await openai_client.chat.completions.create(...)
    return response.choices[0].message.content
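To make the decorator's behavior concrete, here is a minimal sketch of how a tracing decorator can support both sync and async functions and both the bare and parameterized forms. It is an illustration under assumptions, not Trazo's implementation: the `trace` function, the `RECORDS` list, and the record fields are hypothetical names, and the real library persists spans rather than keeping them in memory.

```python
import asyncio
import functools
import inspect
import time

RECORDS: list[dict] = []  # hypothetical in-memory sink, for illustration only


def trace(fn=None, *, name=None):
    """Record inputs, output, duration, and errors for sync or async functions."""
    if fn is None:  # used with options, e.g. @trace(name="llm.generate")
        return functools.partial(trace, name=name)

    span_name = name or fn.__qualname__

    def _start(args, kwargs):
        # Bind arguments to parameter names so defaults are captured too.
        bound = inspect.signature(fn).bind(*args, **kwargs)
        bound.apply_defaults()
        return {"name": span_name, "inputs": dict(bound.arguments)}, time.perf_counter()

    def _finish(record, started, output=None, error=None):
        record["duration_ms"] = (time.perf_counter() - started) * 1000
        record["output"] = output
        record["error"] = repr(error) if error is not None else None
        RECORDS.append(record)

    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            record, started = _start(args, kwargs)
            try:
                result = await fn(*args, **kwargs)
            except Exception as exc:
                _finish(record, started, error=exc)
                raise  # tracing must not swallow the caller's exception
            _finish(record, started, output=result)
            return result
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        record, started = _start(args, kwargs)
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            _finish(record, started, error=exc)
            raise
        _finish(record, started, output=result)
        return result
    return sync_wrapper


@trace
def retrieve(query: str, top_k: int = 5) -> list[str]:
    return [f"doc-{i}" for i in range(top_k)]


@trace(name="llm.generate")
async def generate(prompt: str) -> str:
    await asyncio.sleep(0)
    return prompt.upper()
```

Checking `inspect.iscoroutinefunction` at decoration time is what lets one decorator cover both cases: the async wrapper awaits the call, so the recorded duration spans the whole coroutine rather than just its creation.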
trazo.span(): Fine-grained control
Use context managers for manual span control and LLM metadata injection:
with trazo.run("rag_pipeline") as r:
    r.tag("experiment", "prompt_v3")
    with trazo.span("retrieve", inputs={"query": q}) as s:
        docs = vector_db.search(q)
        s.set_output({"doc_count": len(docs)})
    with trazo.span("generate") as s:
        s.set_model("gpt-4o")
        s.set_tokens(tokens_in=1240, tokens_out=380)
        s.set_cost(0.00412)
        response = llm.generate(docs, q)
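When you set the cost by hand, the value passed to `set_cost` is usually derived from the token counts. The sketch below shows that arithmetic. The `Price` class, the `PRICES` table, the model name, and the per-token rates are all hypothetical; substitute your provider's published prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    usd_per_1m_in: float   # USD per one million input tokens
    usd_per_1m_out: float  # USD per one million output tokens


# Hypothetical rates, for illustration only.
PRICES = {"example-model": Price(usd_per_1m_in=2.00, usd_per_1m_out=8.00)}


def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Return the estimated USD cost of a single call."""
    price = PRICES[model]
    total = tokens_in * price.usd_per_1m_in + tokens_out * price.usd_per_1m_out
    return total / 1_000_000
```

Inside a span, the result would be passed on as `s.set_cost(estimate_cost(model, tokens_in, tokens_out))`, which keeps the rate table in one place instead of scattering literals through the pipeline.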
trazo diff: Semantic diff between runs
$ trazo diff abc123 def456
Comparing run_v1 against run_v2
Overall similarity: 71.3% - Similar with changes
Cost delta: +$0.00234
Token delta: +312
Latency delta: +480ms
+---------------------+-----------+------------+----------+---------+
| Span                | Kind      | Similarity | Cost Δ   | Token Δ |
+---------------------+-----------+------------+----------+---------+
| retrieve_context    | identical | 99.2%      | -        | -       |
| openai.chat/gpt-4o  | diverged  | 54.1%      | +$0.0021 | +289    |
| extract_answer      | similar   | 78.3%      | -        | -       |
| new_validation_step | + added   | 0.0%       | +$0.0002 | +23     |
+---------------------+-----------+------------+----------+---------+
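The similarity column can be approximated without any ML model. The following is a rough sketch of term-frequency n-gram similarity: each text becomes a vector of character n-gram counts, and two texts are compared by cosine similarity. The function names, the choice of character trigrams, and the classification thresholds are assumptions made for illustration; Trazo's actual tokenization and cut-offs may differ.

```python
import math
from collections import Counter


def ngram_tf(text: str, n: int = 3) -> Counter:
    """Term-frequency vector of character n-grams (lowercased, whitespace-normalized)."""
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        return Counter([normalized]) if normalized else Counter()
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: str, b: str, n: int = 3) -> float:
    """Cosine of the angle between the two n-gram count vectors, in [0, 1]."""
    va, vb = ngram_tf(a, n), ngram_tf(b, n)
    if not va and not vb:
        return 1.0  # two empty outputs count as identical
    if not va or not vb:
        return 0.0
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b)


def classify(similarity: float) -> str:
    # Hypothetical thresholds, chosen only to match the labels in the sample output.
    if similarity >= 0.95:
        return "identical"
    if similarity >= 0.70:
        return "similar"
    return "diverged"
```

Because the comparison is purely lexical, a paraphrase with different wording scores lower than a human would judge it, which is the trade-off the optional embeddings extra is meant to address.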
trazo replay: Time-travel debugging
Re-execute any span with its exact original inputs, with optional overrides:
# Replay with original inputs
trazo replay abc123def456
# Replay with a different model (A/B test)
trazo replay abc123def456 -o model=gpt-4o-mini
# Dry-run: print inputs without executing
trazo replay abc123def456 --dry-run
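Conceptually, a replay with overrides merges `KEY=VALUE` pairs on top of the inputs recorded for the span. The sketch below illustrates one way to do that. `parse_overrides` and `build_replay_inputs` are hypothetical helpers, not part of Trazo's API, and the JSON-first parsing rule is an assumption about how values might be typed.

```python
import json


def parse_overrides(pairs: list[str]) -> dict:
    """Turn ["model=gpt-4o-mini", "temperature=0.2"] into a dict of typed values."""
    overrides = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        try:
            overrides[key] = json.loads(raw)  # numbers, booleans, null, JSON objects
        except json.JSONDecodeError:
            overrides[key] = raw              # anything else stays a plain string
    return overrides


def build_replay_inputs(recorded: dict, pairs: list[str]) -> dict:
    """Return a new dict: recorded inputs with overrides applied on top."""
    return {**recorded, **parse_overrides(pairs)}
```

Building a new dict rather than mutating the recorded inputs keeps the stored trace intact, so the same span can be replayed repeatedly with different overrides.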
trazo ui: Browser DAG viewer
trazo ui
# → http://localhost:7432
Visualize your full execution DAG with D3.js. Click any node to inspect inputs/outputs. Compare runs. Track cost trends over time.
Ollama & OpenAI Auto-instrumentation
Zero code changes: call the instrumentation function once at startup to automatically trace model calls with token counts, execution latency, and cost estimates:
import trazo as tz
tz.init()
# 100% Local, API-free tracing
tz.instrument_ollama()
response = ollama.chat(model="phi3", messages=[...])
# Or cloud providers
tz.instrument_openai()
response = client.chat.completions.create(model="gpt-4o", messages=[...])
Architecture
+---------------------------------------------------------+
| Your Agent Code                                         |
| @trazo.trace / trazo.span() / trazo.aspan()             |
+---------------------------+-----------------------------+
                            | emit TraceEvent (non-blocking)
                            v
+---------------------------------------------------------+
| TraceCollector (singleton)                              |
| Thread-safe queue -> background flush worker            |
| Never blocks your agent's execution path                |
+---------------------------+-----------------------------+
                            | write
                            v
+---------------------------------------------------------+
| StorageEngine (SQLite, WAL mode)                        |
| runs / spans / embeddings tables                        |
| Zero external dependencies, stdlib only                 |
+------+------------------------------+-------------------+
       |                              |
       v                              v
+--------------+  +---------------------------------------+
| CLI (trazo)  |  | Web UI (FastAPI + D3.js)              |
| trazo view   |  | http://localhost:7432                 |
| trazo diff   |  | DAG viz · Span inspector · Diff panel |
| trazo replay |  |                                       |
| trazo export |  |                                       |
+--------------+  +---------------------------------------+
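The collector and storage layers in the diagram follow a common producer/consumer pattern, which can be sketched with the standard library alone. This is a simplified illustration, not Trazo's code: the `Collector` class, its `emit` and `close` methods, and the two-column `spans` table are hypothetical.

```python
import queue
import sqlite3
import threading


class Collector:
    """Buffers events in a thread-safe queue; a daemon thread writes them to SQLite."""

    _STOP = object()  # sentinel that tells the worker to drain and exit

    def __init__(self, db_path: str, batch_size: int = 100):
        self._db_path = db_path
        self._batch_size = batch_size
        self._queue: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def emit(self, span_id: str, name: str) -> None:
        # put() on an unbounded queue returns immediately, so callers never wait on disk.
        self._queue.put((span_id, name))

    def close(self) -> None:
        self._queue.put(self._STOP)
        self._worker.join()

    def _run(self) -> None:
        conn = sqlite3.connect(self._db_path)  # the connection stays on the worker thread
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("CREATE TABLE IF NOT EXISTS spans (id TEXT PRIMARY KEY, name TEXT)")
        stopping = False
        while not stopping:
            batch = [self._queue.get()]  # block until there is work
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            rows = [item for item in batch if item is not self._STOP]
            stopping = len(rows) != len(batch)
            with conn:  # one transaction per batch
                conn.executemany("INSERT OR REPLACE INTO spans VALUES (?, ?)", rows)
        conn.close()
```

Batching writes into one transaction amortizes the commit cost, and WAL mode lets a reader such as the CLI or web UI query the database while the worker is still writing.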
Key design decisions:
| Decision | Rationale |
|---|---|
| SQLite storage | Zero setup, works offline, WAL mode for concurrent access |
| ContextVar propagation | Correct parent-child span linking across async boundaries |
| TF n-gram similarity | Semantic diff without requiring an ML model |
| Background flush worker | Tracing never blocks the critical path |
| Framework-agnostic | Monkey-patch integrations are opt-in, not required |
Installation Options
# Minimal (CLI + tracing, no web UI)
pip install trazo-dev
# With web UI
pip install "trazo-dev[ui]"
# With real semantic embeddings (better diff quality)
pip install "trazo-dev[embeddings]"
# Everything
pip install "trazo-dev[ui,embeddings]"
# Development
pip install "trazo-dev[dev,ui]"
pre-commit install
CLI Reference
trazo view [RUN_ID] [--spans] [--limit N] [--db PATH]
trazo diff RUN_A RUN_B [--show-identical] [--db PATH]
trazo replay SPAN_ID [-o KEY=VALUE ...] [--dry-run] [--db PATH]
trazo export RUN_ID [--format json|html] [-o PATH] [--db PATH]
trazo clean [--older-than DAYS] [--keep N] [--run ID] [--all] [--yes]
trazo ui [--host HOST] [--port PORT] [--db PATH]
Short IDs are supported, so you never need to type the full UUID.
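Short-ID lookup is typically a unique-prefix match, much like abbreviated Git hashes. The sketch below shows the idea. `resolve_id` is a hypothetical helper, and it assumes IDs are stored as lowercase strings; how Trazo resolves prefixes internally is not documented here.

```python
def resolve_id(prefix: str, known_ids: list[str]) -> str:
    """Return the single ID that starts with prefix, or raise if none or several match."""
    needle = prefix.lower()
    matches = [full for full in known_ids if full.startswith(needle)]
    if not matches:
        raise LookupError(f"no run or span matches {prefix!r}")
    if len(matches) > 1:
        raise LookupError(
            f"{prefix!r} is ambiguous ({len(matches)} matches); add more characters"
        )
    return matches[0]
```

Failing on an ambiguous prefix, rather than picking the first match, avoids silently diffing or replaying the wrong run.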
Python API Reference
import trazo as tz

# Initialization
tz.init(db_path=None)
tz.instrument_ollama()
tz.instrument_openai()

# Tracing
@tz.trace                               # sync decorator
@tz.trace(name="x", tags={"k": "v"})    # with options
async def fn(): ...                     # async supported automatically

# Context managers
with tz.run("name", metadata={}) as r:  # top-level run
    r.tag("key", "value")
with tz.span("name", inputs={}) as s:   # named span
    s.set_model("gpt-4o")
    s.set_tokens(100, 50)
    s.set_cost(0.00123)
    s.set_output({"result": ...})
    s.tag("key", "value")
async with tz.aspan("name") as s:       # async span
    ...

# Inspection
tz.get_current_span()                   # active Span | None
tz.get_current_run()                    # active Run | None
Extending Trazo
Custom storage backend
from trazo.storage import StorageEngine
from trazo.collector import get_collector
# Use a custom database path
storage = StorageEngine(db_path="/data/my_project/traces.db")
get_collector().configure(storage)
Adding an integration
# Trazo/integrations/anthropic_patch.py
from trazo.tracer import _current_span, _current_run
from trazo.models import Span, SpanStatus

def patch_anthropic():
    import anthropic
    original_create = anthropic.resources.Messages.create

    def patched_create(self, *args, **kwargs):
        # ... same pattern as openai_patch.py
        pass

    anthropic.resources.Messages.create = patched_create
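Since `openai_patch.py` is not shown here, the following sketch illustrates the general wrap-and-delegate pattern on a stand-in class instead of a real SDK. Everything in it is hypothetical: `FakeMessages`, `patch_create`, the `CAPTURED` list, and the shape of the `usage` dictionary. A real integration would create a span through Trazo's own tracer instead of appending to a list.

```python
import functools
import time

CAPTURED: list[dict] = []  # hypothetical sink standing in for Trazo's collector


class FakeMessages:
    """Stand-in for a provider SDK class; not a real client."""

    def create(self, *, model: str, prompt: str) -> dict:
        return {"text": prompt[::-1], "usage": {"input_tokens": 3, "output_tokens": 5}}


def patch_create(cls) -> None:
    """Wrap cls.create so every call is recorded, then delegate to the original."""
    original = cls.create
    if getattr(original, "_traced", False):  # already patched: stay idempotent
        return

    @functools.wraps(original)
    def patched(self, *args, **kwargs):
        started = time.perf_counter()
        response = original(self, *args, **kwargs)  # behavior is unchanged for the caller
        usage = response.get("usage", {})
        CAPTURED.append({
            "model": kwargs.get("model"),
            "tokens_in": usage.get("input_tokens"),
            "tokens_out": usage.get("output_tokens"),
            "duration_ms": (time.perf_counter() - started) * 1000,
        })
        return response

    patched._traced = True
    cls.create = patched
```

The `_traced` marker guards against double-wrapping if the instrumentation function is called more than once, which would otherwise record every call twice.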
MCP (Model Context Protocol) server
# Expose your traces as an MCP tool
pip install "trazo-dev[mcp]" # coming in v0.2
trazo mcp-serve
Roadmap
- Core tracing engine (@trazo.trace, trazo.span(), trazo.aspan())
- SQLite storage with WAL mode
- Semantic diff engine (n-gram TF similarity)
- Time-travel replay
- Rich terminal CLI (trazo view, trazo diff, trazo replay, trazo export)
- Web UI with D3.js DAG visualization
- OpenAI auto-instrumentation
- CI: Python 3.10-3.12, Windows/macOS/Linux
- Anthropic auto-instrumentation
- Ollama auto-instrumentation
- MCP server for Claude Desktop / Cursor integration
- LangChain callback integration
- Real semantic embeddings via sentence-transformers
- GitHub Actions diff annotations (fail CI if similarity < threshold)
- VS Code extension
- trazo watch: live terminal dashboard
Contributing
We welcome contributions! See CONTRIBUTING.md for details.
git clone https://github.com/trazo-dev/trazo
cd trazo
pip install -e ".[dev,ui]"
pre-commit install
pytest tests/ -v
Good first issues: look for the good first issue label.
Why Trazo Will Reach 1,000 Stars
| Reason | Detail |
|---|---|
| Universal pain | Every team building LLM apps hits the "why did this change" problem |
| 30-second onboarding | pip install + one decorator = full traces |
| No API key needed | Natively supports Ollama so you can build and trace pipelines completely offline and for free |
| Visual demo hook | The DAG viewer is screenshot-worthy and shareable |
| Zero lock-in | SQLite, MIT license, no cloud, no vendor dependency |
| Framework agnostic | Works with whatever stack you already use |
License
MIT © 2026 Trazo Contributors
Star on GitHub · Docs · Discord · Issues
Built with love for everyone debugging LLM agents at 2am.
File details
Details for the file trazo-0.1.0.tar.gz.
File metadata
- Download URL: trazo-0.1.0.tar.gz
- Upload date:
- Size: 53.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: Hatch/1.16.5 cpython/3.14.4 HTTPX/0.28.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 68738dd5a2334b5ac9d7b82ebd470ad32d25cdc718183e89c0a4147e391c5ab8 |
| MD5 | e2c64c13d4d6d66ce93b359d2b09a68c |
| BLAKE2b-256 | 78b85ff60b26f2d8618bca54ae9c807adfb5301d6cc9ad870187194810f2e955 |
File details
Details for the file trazo-0.1.0-py3-none-any.whl.
File metadata
- Download URL: trazo-0.1.0-py3-none-any.whl
- Upload date:
- Size: 52.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: Hatch/1.16.5 cpython/3.14.4 HTTPX/0.28.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2acbfc01f667c7c9081ee80611e453096636914f43205989fe528156838a3704 |
| MD5 | 9b554de3bc26079136c5cc0e532b8a99 |
| BLAKE2b-256 | b900d59bc16109c78dfe7c2dd62fd1e5d89aba792f8d4e81f083a30e3943003f |