
Zero-dependency execution tracer and semantic diff engine for LLM agent pipelines


🧵 Trazo

Execution tracer and semantic diff engine for LLM agent pipelines.

Know exactly why your agent did what it did, and how it changed.

CI PyPI version Python 3.10+ License: MIT Downloads Discord


The Problem

You're building an LLM pipeline. It worked yesterday. Today it's producing different answers, costing more, and you have no idea which call changed. You're staring at raw JSON logs and guessing.

Trazo fixes this.

Before Trazo:                  After Trazo:
─────────────                  ────────────
print(response)         →      trazo view abc123
grep through logs       →      trazo diff abc123 def456
re-run everything       →      trazo replay abc123 --span xyz
open Datadog ($$$)      →      trazo ui  (local, free, instant)

What Trazo Does

  • 🔍 Traces every function call in your pipeline with zero boilerplate
  • 📊 Visualizes execution as a DAG: see parent/child spans, durations, costs
  • 🔀 Semantically diffs two runs: detects what changed and how much
  • ⏮️ Replays any span with its exact original inputs (time-travel debugging)
  • 💰 Tracks token counts and USD cost per span, per run
  • 🔒 Local-first: all data stays on your machine, zero cloud dependencies
  • ⚡ Framework-agnostic: works with OpenAI, Anthropic, raw HTTP, LangChain, or any Python code

Quickstart

Install

pip install trazo-dev

# For the web UI:
pip install "trazo-dev[ui]"

Instrument in 3 lines

import ollama
import trazo as tz

tz.init()  # ← once, at startup
tz.instrument_ollama()  # ← 100% local, no API keys!

@tz.trace  # ← on any function
def call_llm(prompt: str) -> str:
    # Use ollama, openai, anthropic, or any custom client
    return ollama.generate(model="phi3", prompt=prompt)["response"]

with tz.run("my_pipeline"):
    result = call_llm("Explain transformers in one sentence")

See what happened

trazo view                    # list all runs
trazo view abc123             # inspect a specific run
trazo view abc123 --spans     # full span tree
trazo diff abc123 def456      # semantic diff between runs
trazo replay abc123           # re-execute with original inputs
trazo ui                      # open browser DAG viewer

Core Features

@trazo.trace: Automatic instrumentation

Decorate any function, sync or async, to capture inputs, outputs, timing, and errors:

@trazo.trace
def retrieve_context(query: str, top_k: int = 5) -> list[str]:
    return vector_db.search(query, k=top_k)

@trazo.trace(name="llm.generate", tags={"tier": "primary"})
async def async_generate(messages: list[dict]) -> str:
    response = await openai_client.chat.completions.create(...)
    return response.choices[0].message.content
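
To make the "sync or async" behaviour concrete, here is a minimal sketch of how a decorator of this kind could be written. It is not Trazo's implementation: `trace`, `RECORDS`, and the record fields are hypothetical names chosen for illustration, and an in-memory list stands in for real storage.

```python
import asyncio
import functools
import inspect
import time

RECORDS: list[dict] = []  # hypothetical in-memory sink standing in for real storage


def trace(fn=None, *, name=None):
    """Hypothetical tracing decorator; usable as @trace or @trace(name="...")."""
    if fn is None:  # called with arguments: return the actual decorator
        return lambda f: trace(f, name=name)

    span_name = name or fn.__name__

    def record(args, kwargs, start, output=None, error=None):
        RECORDS.append({
            "name": span_name,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": output,
            "error": repr(error) if error is not None else None,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })

    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = await fn(*args, **kwargs)
            except Exception as exc:
                record(args, kwargs, start, error=exc)
                raise  # tracing must not swallow the caller's exception
            record(args, kwargs, start, output=result)
            return result
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            record(args, kwargs, start, error=exc)
            raise
        record(args, kwargs, start, output=result)
        return result
    return sync_wrapper


@trace
def add(a, b):
    return a + b


@trace(name="async.double")
async def double(x):
    return 2 * x
```

The key design point is the `inspect.iscoroutinefunction` check: an async function needs an async wrapper so that timing covers the awaited work rather than just the creation of the coroutine object.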

trazo.span(): Fine-grained control

Use context managers for manual span control and LLM metadata injection:

with trazo.run("rag_pipeline") as r:
    r.tag("experiment", "prompt_v3")

    with trazo.span("retrieve", inputs={"query": q}) as s:
        docs = vector_db.search(q)
        s.set_output({"doc_count": len(docs)})

    with trazo.span("generate") as s:
        s.set_model("gpt-4o")
        s.set_tokens(tokens_in=1240, tokens_out=380)
        s.set_cost(0.00412)
        response = llm.generate(docs, q)
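
The value passed to `set_cost` has to come from somewhere. A small sketch of deriving it from token counts follows; the price table, the model name `example-model`, and the helper `estimate_cost` are hypothetical and not part of Trazo, so substitute your provider's current rates.

```python
# Hypothetical USD prices per one million tokens; not real provider rates.
PRICES_PER_MTOK = {
    "example-model": {"input": 2.00, "output": 8.00},
}


def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Return an estimated USD cost for one call, rounded to six decimals."""
    price = PRICES_PER_MTOK[model]
    cost = (tokens_in * price["input"] + tokens_out * price["output"]) / 1_000_000
    return round(cost, 6)


# (1240 * 2.00 + 380 * 8.00) / 1_000_000 = 0.00552
cost = estimate_cost("example-model", tokens_in=1240, tokens_out=380)
```

The result could then be attached to a span with `s.set_cost(cost)` alongside `s.set_tokens(...)`.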

trazo diff: Semantic diff between runs

$ trazo diff abc123 def456

Comparing run_v1 against run_v2
Overall similarity: 71.3% — Similar with changes
Cost delta:    +$0.00234
Token delta:   +312
Latency delta: +480ms

╭──────────────────────────┬────────────┬────────────┬──────────┬──────────╮
│ Span                     │ Kind       │ Similarity │ Cost Δ   │ Token Δ  │
├──────────────────────────┼────────────┼────────────┼──────────┼──────────┤
│ retrieve_context         │ ≡ identical│ 99.2%      │ —        │ —        │
│ openai.chat/gpt-4o       │ ≠ diverged │ 54.1%      │ +$0.0021 │ +289     │
│ extract_answer           │ ≈ similar  │ 78.3%      │ —        │ —        │
│ new_validation_step      │ + added    │ 0.0%       │ +$0.0002 │ +23      │
╰──────────────────────────┴────────────┴────────────┴──────────┴──────────╯
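
The similarity column can be approximated without any ML model. The sketch below computes cosine similarity over character n-gram term frequencies, which is one plausible reading of the "TF n-gram similarity" named under Key design decisions. The n-gram size, the normalization, and the thresholds in `classify` are assumptions made for illustration, not Trazo's actual values.

```python
import math
from collections import Counter


def ngram_tf(text: str, n: int = 3) -> Counter:
    """Term-frequency vector of character n-grams (lowercased, whitespace-normalized)."""
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        return Counter([normalized]) if normalized else Counter()
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse frequency vectors."""
    if not a or not b:
        return 1.0 if not a and not b else 0.0
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)


def similarity(old: str, new: str) -> float:
    return cosine_similarity(ngram_tf(old), ngram_tf(new))


def classify(score: float) -> str:
    # Hypothetical thresholds, chosen only for illustration.
    if score >= 0.95:
        return "identical"
    if score >= 0.70:
        return "similar"
    return "diverged"
```

Because this measures surface overlap rather than meaning, two paraphrases with different wording will score low; that is the trade-off for needing no model.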

trazo replay: Time-travel debugging

Re-execute any span with its exact original inputs, with optional overrides:

# Replay with original inputs
trazo replay abc123def456

# Replay with a different model (A/B test)
trazo replay abc123def456 -o model=gpt-4o-mini

# Dry-run: print inputs without executing
trazo replay abc123def456 --dry-run

trazo ui: Browser DAG viewer

trazo ui
# → http://localhost:7432

Visualize your full execution DAG with D3.js. Click any node to inspect inputs/outputs. Compare runs. Track cost trends over time.

🦙 Ollama & OpenAI Auto-instrumentation

Zero code changes: call the instrument function once at startup to automatically trace model calls with token counts, execution latency, and cost estimates:

import ollama
from openai import OpenAI
import trazo as tz

tz.init()

# 100% local, API-free tracing
tz.instrument_ollama()
response = ollama.chat(model="phi3", messages=[...])

# Or cloud providers
tz.instrument_openai()
client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(model="gpt-4o", messages=[...])

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Your Agent Code                              │
│   @trazo.trace  /  trazo.span()  /  trazo.aspan()               │
└───────────────────────┬─────────────────────────────────────────┘
                        │ emit TraceEvent (non-blocking)
                        ▼
┌─────────────────────────────────────────────────────────────────┐
│              TraceCollector (singleton)                         │
│  Thread-safe queue → background flush worker                    │
│  Never blocks your agent's execution path                       │
└───────────────────┬─────────────────────────────────────────────┘
                    │ write
                    ▼
┌─────────────────────────────────────────────────────────────────┐
│              StorageEngine (SQLite, WAL mode)                   │
│  runs / spans / embeddings tables                               │
│  Zero external dependencies (stdlib only)                       │
└───────┬────────────────────────┬────────────────────────────────┘
        │                        │
        ▼                        ▼
┌──────────────────┐   ┌──────────────────────────────────────────┐
│  CLI  (trazo)    │   │    Web UI  (FastAPI + D3.js)             │
│  trazo view      │   │    http://localhost:7432                 │
│  trazo diff      │   │    DAG viz · Span inspector · Diff panel │
│  trazo replay    │   │                                          │
│  trazo export    │   │                                          │
└──────────────────┘   └──────────────────────────────────────────┘
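
The collector layer in the diagram can be sketched with only the standard library: callers push events onto a queue and a single background thread writes them to SQLite in WAL mode. This is an illustrative sketch, not Trazo's `TraceCollector`; the class name `Collector`, the `spans` table layout, and the commit-per-event policy are assumptions.

```python
import json
import queue
import sqlite3
import threading


class Collector:
    """Hypothetical collector: producers enqueue events, one worker thread writes them."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, db_path: str):
        self._db_path = db_path
        self._queue: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def emit(self, event: dict) -> None:
        # Unbounded queue: put() returns immediately, no disk I/O on the caller's thread.
        self._queue.put(event)

    def close(self) -> None:
        # Events queued before close() are written before the worker exits.
        self._queue.put(self._STOP)
        self._worker.join()

    def _run(self) -> None:
        # The connection is created and used only on the worker thread.
        conn = sqlite3.connect(self._db_path)
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("CREATE TABLE IF NOT EXISTS spans (name TEXT, payload TEXT)")
        while True:
            event = self._queue.get()
            if event is self._STOP:
                break
            conn.execute(
                "INSERT INTO spans (name, payload) VALUES (?, ?)",
                (event["name"], json.dumps(event)),
            )
            conn.commit()
        conn.close()
```

A production version would likely batch several events per transaction and bound the queue; committing per event keeps the sketch short.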

Key design decisions:

Decision                  Rationale
SQLite storage            Zero setup, works offline, WAL mode for concurrent access
ContextVar propagation    Correct parent-child span linking across async boundaries
TF n-gram similarity      Semantic diff without requiring an ML model
Background flush worker   Tracing never blocks the critical path
Framework-agnostic        Monkey-patch integrations are opt-in, not required

Installation Options

# Minimal (CLI + tracing, no web UI)
pip install trazo-dev

# With web UI
pip install "trazo-dev[ui]"

# With real semantic embeddings (better diff quality)
pip install "trazo-dev[embeddings]"

# Everything
pip install "trazo-dev[ui,embeddings]"

# Development
pip install "trazo-dev[dev,ui]"
pre-commit install

CLI Reference

trazo view [RUN_ID] [--spans] [--limit N] [--db PATH]
trazo diff RUN_A RUN_B [--show-identical] [--db PATH]
trazo replay SPAN_ID [-o KEY=VALUE ...] [--dry-run] [--db PATH]
trazo export RUN_ID [--format json|html] [-o PATH] [--db PATH]
trazo clean [--older-than DAYS] [--keep N] [--run ID] [--all] [--yes]
trazo ui [--host HOST] [--port PORT] [--db PATH]

Short IDs are supported, so you never need to type the full UUID.
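
One common way to support short IDs is unique-prefix matching, as Git does for commit hashes. The sketch below assumes a `runs` table with a text `id` column; that schema and the helper `resolve_run_id` are hypothetical and may not match how Trazo resolves IDs.

```python
import sqlite3


def resolve_run_id(conn: sqlite3.Connection, prefix: str) -> str:
    """Return the unique run ID starting with `prefix`; raise if none or several match."""
    # Escape LIKE wildcards so the prefix is matched literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id FROM runs WHERE id LIKE ? ESCAPE '\\' ORDER BY id LIMIT 2",
        (escaped + "%",),
    ).fetchall()
    if not rows:
        raise LookupError(f"no run matches {prefix!r}")
    if len(rows) > 1:
        raise LookupError(f"prefix {prefix!r} is ambiguous")
    return rows[0][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO runs VALUES (?)",
    [("abc123de-0000",), ("abf999aa-1111",)],
)
full_id = resolve_run_id(conn, "abc")
```

Fetching at most two rows is enough to distinguish "unique" from "ambiguous" without scanning every match.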


Python API Reference

import trazo as tz

# Initialization
tz.init(db_path=None)
tz.instrument_ollama()
tz.instrument_openai()

# Tracing
@tz.trace                                 # sync decorator
@tz.trace(name="x", tags={"k": "v"})     # with options
async def fn(): ...                        # async supported automatically

# Context managers
with tz.run("name", metadata={}) as r:       # top-level run
    r.tag("key", "value")

with tz.span("name", inputs={}) as s:        # named span
    s.set_model("gpt-4o")
    s.set_tokens(100, 50)
    s.set_cost(0.00123)
    s.set_output({"result": ...})
    s.tag("key", "value")

async with tz.aspan("name") as s:            # async span (inside an async function)
    ...

# Inspection
tz.get_current_span()                        # active Span | None
tz.get_current_run()                         # active Run | None

Extending Trazo

Custom storage backend

from trazo.storage import StorageEngine
from trazo.collector import get_collector

# Use a custom database path
storage = StorageEngine(db_path="/data/my_project/traces.db")
get_collector().configure(storage)

Adding an integration

# trazo/integrations/anthropic_patch.py
from trazo.tracer import _current_span, _current_run
from trazo.models import Span, SpanStatus

def patch_anthropic():
    import anthropic
    original_create = anthropic.resources.Messages.create

    def patched_create(self, *args, **kwargs):
        # ... record a span here, same pattern as openai_patch.py
        # Always delegate to the original so callers still get a response.
        return original_create(self, *args, **kwargs)

    anthropic.resources.Messages.create = patched_create
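
The monkey-patch pattern can be tried end to end without any provider SDK installed. In the sketch below, `FakeMessages` is a stand-in class rather than a real SDK, and `CAPTURED`, `patch_create`, and the `_patched` marker are hypothetical names; `openai_patch.py` itself is not reproduced here.

```python
import functools
import time

CAPTURED: list[dict] = []  # hypothetical sink standing in for span storage


class FakeMessages:
    """Stand-in for a provider SDK resource class; not a real SDK."""

    def create(self, *, model: str, prompt: str) -> dict:
        return {"text": prompt.upper(), "usage": {"input_tokens": 3, "output_tokens": 5}}


def patch_create(cls) -> None:
    """Wrap cls.create so every call is timed and its token usage captured."""
    original = cls.create
    if getattr(original, "_patched", False):
        return  # already instrumented; avoid wrapping twice

    @functools.wraps(original)
    def patched(self, *args, **kwargs):
        start = time.perf_counter()
        response = original(self, *args, **kwargs)  # delegate to the real method
        CAPTURED.append({
            "model": kwargs.get("model"),
            "tokens_in": response["usage"]["input_tokens"],
            "tokens_out": response["usage"]["output_tokens"],
            "duration_ms": (time.perf_counter() - start) * 1000,
        })
        return response

    patched._patched = True
    cls.create = patched


patch_create(FakeMessages)
patch_create(FakeMessages)  # second call is a no-op
reply = FakeMessages().create(model="demo", prompt="hi")
```

Two details matter in practice: the wrapper must return the original response unchanged, and patching must be idempotent so that calling the instrument function twice does not record every call twice.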

MCP (Model Context Protocol) server

# Expose your traces as an MCP tool
pip install "trazo-dev[mcp]"   # coming in v0.2
trazo mcp-serve

Roadmap

  • Core tracing engine (@trazo.trace, trazo.span(), trazo.aspan())
  • SQLite storage with WAL mode
  • Semantic diff engine (n-gram TF similarity)
  • Time-travel replay
  • Rich terminal CLI (trazo view, trazo diff, trazo replay, trazo export)
  • Web UI with D3.js DAG visualization
  • OpenAI auto-instrumentation
  • CI: Python 3.10-3.12, Windows/macOS/Linux
  • Anthropic auto-instrumentation
  • Ollama auto-instrumentation
  • MCP server for Claude Desktop / Cursor integration
  • LangChain callback integration
  • Real semantic embeddings via sentence-transformers
  • GitHub Actions diff annotations (fail CI if similarity < threshold)
  • VS Code extension
  • trazo watch: live terminal dashboard

Contributing

We welcome contributions! See CONTRIBUTING.md for details.

git clone https://github.com/trazo-dev/trazo
cd trazo
pip install -e ".[dev,ui]"
pre-commit install
pytest tests/ -v

Good first issues: look for the good first issue label.


Why Trazo Will Reach 1,000 Stars

Reason                 Detail
Universal pain         Every team building LLM apps hits the "why did this change" problem
30-second onboarding   pip install + one decorator = full traces
No API key needed      Natively supports Ollama so you can build and trace pipelines completely offline and for free
Visual demo hook       The DAG viewer is screenshot-worthy and shareable
Zero lock-in           SQLite, MIT license, no cloud, no vendor dependency
Framework agnostic     Works with whatever stack you already use

License

MIT © 2026 Trazo Contributors


⭐ Star on GitHub · 📖 Docs · 💬 Discord · 🐛 Issues

Built with love for everyone debugging LLM agents at 2am.
