
Zero-dependency execution tracer and semantic diff engine for LLM agent pipelines


🧵 Trazo

Execution tracer and semantic diff engine for LLM agent pipelines.

Know exactly why your agent did what it did, and how it changed.

CI PyPI version Python 3.10+ License: MIT Downloads Join Discord


The Problem

You're building an LLM pipeline. It worked yesterday. Today it's producing different answers, costing more, and you have no idea which call changed. You're staring at raw JSON logs and guessing.

Trazo fixes this.

Before Trazo:                  After Trazo:
─────────────                  ────────────
print(response)         →      trazo view abc123
grep through logs       →      trazo diff abc123 def456
re-run everything       →      trazo replay abc123 --span xyz
open Datadog ($$$)      →      trazo ui  (local, free, instant)

What Trazo Does

  • 🔍 Traces every function call in your pipeline with zero boilerplate
  • 📊 Visualizes execution as a DAG: see parent/child spans, durations, costs
  • 🔀 Semantically diffs two runs: detects what changed and by how much
  • ⏮️ Replays any span with its exact original inputs (time-travel debugging)
  • 💰 Tracks token counts and USD cost per span, per run
  • 🔒 Local-first: all data stays on your machine, zero cloud dependencies
  • ⚡ Framework-agnostic: works with OpenAI, Anthropic, raw HTTP, LangChain, any Python

Quickstart

Install

pip install trazo

# For the web UI:
pip install "trazo[ui]"

Instrument in 3 lines

import ollama
import trazo as tz

tz.init()               # ← once, at startup
tz.instrument_ollama()  # ← 100% local, no API keys!

@tz.trace  # ← on any function
def call_llm(prompt: str) -> str:
    # Use ollama, openai, anthropic, or any custom client
    return ollama.generate(model="phi3", prompt=prompt)["response"]

with tz.run("my_pipeline"):
    result = call_llm("Explain transformers in one sentence")

See what happened

trazo view                    # list all runs
trazo view abc123             # inspect a specific run
trazo view abc123 --spans     # full span tree
trazo diff [id1] [id2]        # semantic diff between runs
trazo replay abc123           # re-execute with original inputs
trazo ui                      # open browser DAG viewer

Core Features

@trazo.trace: Automatic instrumentation

Decorate any function, sync or async, to capture inputs, outputs, timing, and errors:

import trazo

# vector_db and openai_client are placeholders for your own clients.
@trazo.trace
def retrieve_context(query: str, top_k: int = 5) -> list[str]:
    return vector_db.search(query, k=top_k)

# Async functions use the same decorator; name and tags are optional.
@trazo.trace(name="llm.generate", tags={"tier": "primary"})
async def async_generate(messages: list[dict]) -> str:
    response = await openai_client.chat.completions.create(...)
    return response.choices[0].message.content
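
How a single decorator can cover sync and async functions, and both the bare `@trace` and `@trace(name=...)` forms, is easier to see in code. The following is a minimal stdlib-only sketch of that pattern, not Trazo's actual implementation; `sketch_trace` and the `RECORDS` list are hypothetical names used only for illustration.

```python
import asyncio
import functools
import inspect
import time

RECORDS = []  # hypothetical in-memory sink standing in for real storage


def sketch_trace(func=None, *, name=None):
    """Illustrative decorator usable as @sketch_trace or @sketch_trace(name=...)."""
    if func is None:
        # Called with options: return a decorator that waits for the function.
        return lambda f: sketch_trace(f, name=name)

    span_name = name or func.__qualname__

    def record(args, kwargs, start, output=None, error=None):
        RECORDS.append({
            "name": span_name,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": output,
            "error": repr(error) if error else None,
            "duration_s": time.perf_counter() - start,
        })

    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = await func(*args, **kwargs)
            except Exception as exc:
                record(args, kwargs, start, error=exc)
                raise
            record(args, kwargs, start, output=result)
            return result
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            record(args, kwargs, start, error=exc)
            raise
        record(args, kwargs, start, output=result)
        return result
    return sync_wrapper


@sketch_trace
def add(a, b):
    return a + b


@sketch_trace(name="llm.generate")
async def generate(prompt):
    return prompt.upper()


add(1, 2)
asyncio.run(generate("hi"))
print([r["name"] for r in RECORDS])
```

The async branch matters: wrapping a coroutine function with the sync wrapper would record the un-awaited coroutine object as the output and a near-zero duration.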

trazo.span(): Fine-grained control

Use context managers for manual span control and LLM metadata injection:

with trazo.run("rag_pipeline") as r:
    r.tag("experiment", "prompt_v3")

    with trazo.span("retrieve", inputs={"query": q}) as s:
        docs = vector_db.search(q)
        s.set_output({"doc_count": len(docs)})

    with trazo.span("generate") as s:
        s.set_model("gpt-4o")
        s.set_tokens(tokens_in=1240, tokens_out=380)
        s.set_cost(0.00412)
        response = llm.generate(docs, q)
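
The value passed to `set_cost` has to come from somewhere when you set it manually. One way to derive it from token counts is sketched below; the `Price` class, the `PRICES` table, and the rates in it are placeholders invented for this example, not real provider pricing and not part of Trazo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per one million tokens (placeholder values, not real pricing)."""
    input_per_million: float
    output_per_million: float


# Hypothetical price table; replace with your provider's published rates.
PRICES = {"example-model": Price(input_per_million=2.00, output_per_million=8.00)}


def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Return the estimated USD cost of one call."""
    price = PRICES[model]
    return (tokens_in * price.input_per_million
            + tokens_out * price.output_per_million) / 1_000_000


cost = estimate_cost("example-model", tokens_in=1240, tokens_out=380)
print(f"${cost:.5f}")
```

The result could then be handed to `s.set_cost(cost)` inside the span, next to the matching `s.set_tokens(...)` call.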

trazo diff: Semantic diff between runs

$ trazo diff abc123 def456

Comparing run_v1 against run_v2
Overall similarity: 71.3% - Similar with changes
Cost delta:    +$0.00234
Token delta:   +312
Latency delta: +480ms

╭──────────────────────────┬────────────┬────────────┬──────────┬──────────╮
│ Span                     │ Kind       │ Similarity │ Cost Δ   │ Token Δ  │
├──────────────────────────┼────────────┼────────────┼──────────┼──────────┤
│ retrieve_context         │ ≡ identical│ 99.2%      │ -        │ -        │
│ openai.chat/gpt-4o       │ ≠ diverged │ 54.1%      │ +$0.0021 │ +289     │
│ extract_answer           │ ≈ similar  │ 78.3%      │ -        │ -        │
│ new_validation_step      │ + added    │ 0.0%       │ +$0.0002 │ +23      │
╰──────────────────────────┴────────────┴────────────┴──────────┴──────────╯
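
The similarity percentages come from a term-frequency n-gram comparison, according to the design notes further down. A minimal sketch of that general idea follows. It is an assumption-laden illustration: whether Trazo uses character or word n-grams, which `n` it picks, and where the identical/similar/diverged thresholds sit are not stated in this README, so every such choice below is hypothetical.

```python
import math
from collections import Counter


def ngram_counts(text: str, n: int = 3) -> Counter:
    """Term-frequency vector of character n-grams (lowercased, whitespace-normalized)."""
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        return Counter([normalized]) if normalized else Counter()
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def similarity(a: str, b: str, n: int = 3) -> float:
    """Cosine similarity between the two n-gram TF vectors, in [0, 1]."""
    va, vb = ngram_counts(a, n), ngram_counts(b, n)
    if not va and not vb:
        return 1.0  # two empty outputs are treated as identical
    if not va or not vb:
        return 0.0
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b)


def classify(score: float) -> str:
    """Bucket a score; the thresholds are illustrative assumptions."""
    if score >= 0.95:
        return "identical"
    if score >= 0.70:
        return "similar"
    return "diverged"


old = "The capital of France is Paris."
new = "Paris is the capital city of France."
score = similarity(old, new)
print(f"{score:.1%} {classify(score)}")
```

Because this compares surface n-grams rather than meaning, a paraphrase with different wording scores low; that is the gap an embedding-based comparison is meant to close.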

trazo replay: Time-travel debugging

Re-execute any span with its exact original inputs, with optional overrides:

# Replay with original inputs
trazo replay abc123def456

# Replay with a different model (A/B test)
trazo replay abc123def456 -o model=gpt-4o-mini

# Dry-run: print inputs without executing
trazo replay abc123def456 --dry-run
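
Conceptually, a replay with overrides is the recorded inputs with a few keys replaced. The sketch below shows one plausible way to turn `-o KEY=VALUE` pairs into that merged input; `parse_overrides` and `build_replay_inputs` are hypothetical helpers, and the JSON-decoding of values is an assumption rather than documented Trazo behavior.

```python
import json


def parse_overrides(pairs: list[str]) -> dict:
    """Turn ["model=gpt-4o-mini", "top_k=3"] into a dict, decoding JSON values when possible."""
    overrides = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        try:
            overrides[key] = json.loads(raw)   # numbers, booleans, lists
        except json.JSONDecodeError:
            overrides[key] = raw               # fall back to a plain string
    return overrides


def build_replay_inputs(recorded: dict, pairs: list[str]) -> dict:
    """Return a new dict: recorded inputs stay untouched, overrides win on collisions."""
    return {**recorded, **parse_overrides(pairs)}


recorded_inputs = {"model": "gpt-4o", "prompt": "Explain transformers", "top_k": 5}
replay_inputs = build_replay_inputs(recorded_inputs, ["model=gpt-4o-mini", "top_k=3"])
print(replay_inputs)
```

Building a new dict instead of mutating the recorded one keeps the stored span intact, so the same span can be replayed again with different overrides.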

trazo ui: Browser DAG viewer

trazo ui
# → http://localhost:7432

Visualize your full execution DAG with D3.js. Click any node to inspect inputs/outputs. Compare runs. Track cost trends over time.

🦙 Ollama & OpenAI Auto-instrumentation

No changes to your call sites: call the instrument function once at startup, and model calls are traced automatically with token counts, execution latency, and cost estimates:

import ollama
from openai import OpenAI

import trazo as tz

tz.init()

# 100% local, API-free tracing
tz.instrument_ollama()
response = ollama.chat(model="phi3", messages=[...])

# Or cloud providers
tz.instrument_openai()
client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(model="gpt-4o", messages=[...])

Architecture

┌──────────────────────────────────────────────────────────────────┐
│                    Your Agent Code                               │
│   @trazo.trace  /  trazo.span()  /  trazo.aspan()                │
└────────────────────────┬─────────────────────────────────────────┘
                         │ emit TraceEvent (non-blocking)
                         ▼
┌──────────────────────────────────────────────────────────────────┐
│              TraceCollector (singleton)                          │
│  Thread-safe queue → background flush worker                     │
│  Never blocks your agent's execution path                        │
└────────────────────┬─────────────────────────────────────────────┘
                     │ write
                     ▼
┌──────────────────────────────────────────────────────────────────┐
│              StorageEngine (SQLite, WAL mode)                    │
│  runs / spans / embeddings tables                                │
│  Zero external dependencies, stdlib only                         │
└────────┬────────────────────────┬────────────────────────────────┘
         │                        │
         ▼                        ▼
┌───────────────────┐   ┌──────────────────────────────────────────┐
│  CLI  (trazo)     │   │  Web UI  (FastAPI + D3.js)               │
│  trazo view       │   │  http://localhost:7432                   │
│  trazo diff       │   │  DAG viz · Span inspector · Diff panel   │
│  trazo replay     │   │                                          │
│  trazo export     │   │                                          │
└───────────────────┘   └──────────────────────────────────────────┘
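
The collector-to-storage path in the diagram (thread-safe queue, background flush worker, SQLite in WAL mode) can be illustrated with the standard library alone. This is a rough sketch under stated assumptions, not Trazo's code: `SketchCollector`, the `spans` table layout, and the one-commit-per-event flush are all invented for the example.

```python
import os
import queue
import sqlite3
import tempfile
import threading

_STOP = object()  # sentinel telling the worker to exit


class SketchCollector:
    """Hypothetical collector: emit() enqueues, a daemon thread writes to SQLite."""

    def __init__(self, db_path: str):
        self._db_path = db_path
        self._queue: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def emit(self, span_name: str, duration_ms: float) -> None:
        # Only a queue put happens on the caller's thread.
        self._queue.put((span_name, duration_ms))

    def close(self) -> None:
        # Drain everything queued so far, then stop the worker.
        self._queue.put(_STOP)
        self._worker.join()

    def _run(self) -> None:
        # The connection is created and used only on the worker thread.
        conn = sqlite3.connect(self._db_path)
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("CREATE TABLE IF NOT EXISTS spans (name TEXT, duration_ms REAL)")
        while True:
            item = self._queue.get()
            if item is _STOP:
                break
            conn.execute("INSERT INTO spans VALUES (?, ?)", item)
            conn.commit()
        conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "traces.db")
collector = SketchCollector(db_path)
collector.emit("retrieve", 12.5)
collector.emit("generate", 480.0)
collector.close()

reader = sqlite3.connect(db_path)
rows = reader.execute("SELECT name, duration_ms FROM spans").fetchall()
reader.close()
print(rows)
```

A production collector would likely batch inserts per flush rather than commit per event; the sketch keeps one commit per event for readability.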

Key design decisions:

Decision                  Rationale
SQLite storage            Zero setup, works offline, WAL mode for concurrent access
ContextVar propagation    Correct parent-child span linking across async boundaries
TF n-gram similarity      Semantic diff without requiring an ML model
Background flush worker   Tracing never blocks the critical path
Framework-agnostic        Monkey-patch integrations are opt-in, not required

Installation Options

# Minimal (CLI + tracing, no web UI)
pip install trazo

# With web UI
pip install "trazo[ui]"

# With real semantic embeddings (better diff quality)
pip install "trazo[embeddings]"

# Everything
pip install "trazo[ui,embeddings]"

# Development
pip install "trazo[dev,ui]"
pre-commit install

CLI Reference

trazo view [RUN_ID] [--spans] [--limit N] [--db PATH]
trazo diff RUN_A RUN_B [--show-identical] [--db PATH]
trazo replay SPAN_ID [-o KEY=VALUE ...] [--dry-run] [--db PATH]
trazo export RUN_ID [--format json|html] [-o PATH] [--db PATH]
trazo clean [--older-than DAYS] [--keep N] [--run ID] [--all] [--yes]
trazo ui [--host HOST] [--port PORT] [--db PATH]

Short IDs are supported, so you never need to type the full UUID.
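
Short-ID lookup is typically a unique-prefix match, in the style of abbreviated Git hashes. The sketch below shows that idea; `resolve_id` is a hypothetical helper, and how Trazo itself treats ambiguous or unknown prefixes is not documented here.

```python
def resolve_id(prefix: str, known_ids: list[str]) -> str:
    """Return the single ID starting with prefix, or raise if none or several match."""
    matches = [full for full in known_ids if full.startswith(prefix)]
    if not matches:
        raise LookupError(f"no ID starts with {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"{prefix!r} is ambiguous: {len(matches)} IDs match")
    return matches[0]


# Made-up IDs for the example.
ids = [
    "abc123de-0000-4000-8000-000000000001",
    "abc9ffff-0000-4000-8000-000000000002",
    "def45678-0000-4000-8000-000000000003",
]
print(resolve_id("abc1", ids))
```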


Python API Reference

import trazo as tz

# Initialization
tz.init(db_path=None)
tz.instrument_ollama()
tz.instrument_openai()

# Tracing
@tz.trace                                 # sync decorator
@tz.trace(name="x", tags={"k": "v"})     # with options
async def fn(): ...                        # async supported automatically

# Context managers
with tz.run("name", metadata={}) as r:       # top-level run
    r.tag("key", "value")

with tz.span("name", inputs={}) as s:        # named span
    s.set_model("gpt-4o")
    s.set_tokens(100, 50)
    s.set_cost(0.00123)
    s.set_output({"result": ...})
    s.tag("key", "value")

async with tz.aspan("name") as s:            # async span (inside an async function)
    ...

# Inspection
tz.get_current_span()                        # active Span | None
tz.get_current_run()                         # active Run | None

Extending Trazo

Custom storage backend

from trazo.storage import StorageEngine
from trazo.collector import get_collector

# Use a custom database path
storage = StorageEngine(db_path="/data/my_project/traces.db")
get_collector().configure(storage)

Adding an integration

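# trazo/integrations/anthropic_patch.py
import functools

from trazo.tracer import _current_span, _current_run
from trazo.models import Span, SpanStatus

def patch_anthropic():
    import anthropic
    original_create = anthropic.resources.Messages.create

    @functools.wraps(original_create)
    def patched_create(self, *args, **kwargs):
        # ... open a span here, same pattern as openai_patch.py
        response = original_create(self, *args, **kwargs)
        # ... record model, tokens, and cost on the span
        return response  # the caller must still get the real response

    anthropic.resources.Messages.create = patched_create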
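
The monkey-patch pattern used by these integrations can be tried without any provider SDK installed. The sketch below applies it to a stand-in class; `FakeMessages`, `patch`, and `CAPTURED` are all hypothetical, and a real integration would write to a Trazo span instead of a list.

```python
import functools
import time


class FakeMessages:
    """Stand-in for a provider client class; not a real SDK."""

    def create(self, *, model: str, prompt: str) -> dict:
        return {"model": model, "text": prompt[::-1], "usage": {"in": 4, "out": 4}}


CAPTURED = []  # hypothetical sink for span data


def patch(cls, method_name: str):
    """Wrap cls.method_name so each call is recorded; return a function that undoes it."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def patched(self, *args, **kwargs):
        start = time.perf_counter()
        response = original(self, *args, **kwargs)   # always delegate to the real method
        CAPTURED.append({
            "model": kwargs.get("model"),
            "usage": response.get("usage"),
            "duration_s": time.perf_counter() - start,
        })
        return response                              # hand the response back unchanged

    setattr(cls, method_name, patched)
    return lambda: setattr(cls, method_name, original)


unpatch = patch(FakeMessages, "create")
result = FakeMessages().create(model="demo", prompt="ping")
unpatch()
FakeMessages().create(model="demo", prompt="again")  # no longer recorded
print(result["text"], len(CAPTURED))
```

Returning an undo function keeps the patch opt-in and reversible, which is useful in tests where the original method must be restored afterwards.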

MCP (Model Context Protocol) server

# Expose your traces as an MCP tool
pip install "trazo[mcp]"   # coming in v0.2
trazo mcp-serve

Roadmap

  • Core tracing engine (@trazo.trace, trazo.span(), trazo.aspan())
  • SQLite storage with WAL mode
  • Semantic diff engine (n-gram TF similarity)
  • Time-travel replay
  • Rich terminal CLI (trazo view, trazo diff, trazo replay, trazo export)
  • Web UI with D3.js DAG visualization
  • OpenAI auto-instrumentation
  • CI: Python 3.10-3.12, Windows/macOS/Linux
  • Anthropic auto-instrumentation
  • Ollama auto-instrumentation
  • MCP server for Claude Desktop / Cursor integration
  • LangChain callback integration
  • Real semantic embeddings via sentence-transformers
  • GitHub Actions diff annotations (fail CI if similarity < threshold)
  • VS Code extension
  • trazo watch: live terminal dashboard

Contributing

We welcome contributions! See CONTRIBUTING.md for details.

git clone https://github.com/Vikhram-S/trazo-dev
cd trazo-dev
pip install -e ".[dev,ui]"
pre-commit install
pytest tests/ -v

Good first issues: look for the "good first issue" label.


Why Trazo Will Reach 1,000 Stars

Reason                 Detail
Universal pain         Every team building LLM apps hits the "why did this change" problem
30-second onboarding   pip install + one decorator = full traces
No API key needed      Natively supports Ollama so you can build and trace pipelines completely offline and for free
Visual demo hook       The DAG viewer is screenshot-worthy and shareable
Zero lock-in           SQLite, MIT license, no cloud, no vendor dependency
Framework agnostic     Works with whatever stack you already use

License

MIT © 2026 Vikhram S


⭐ Star on GitHub · 📖 Docs · 💬 Discord · 🐛 Issues

Built with love for everyone debugging LLM agents at 2am.
