Skip to main content

Lightweight structured tracing for LLM applications

Project description

llmtrace

PyPI CI codecov

Lightweight structured tracing for LLM applications. Zero required dependencies beyond Pydantic. No backend needed.

Why llmtrace?

  • 3 lines to start — Configure, instrument, done. Traces flow immediately.
  • No backend required — Traces go to console, file, webhook, OTLP, Langfuse, Datadog, or anywhere.
  • Typed everything — Pydantic v2 models, full mypy --strict, IDE autocomplete on every field.
  • Zero-dep core — Only Pydantic required at runtime. Provider SDKs and sink dependencies are optional extras.

Installation

Requires Python 3.11+.

pip install llmtrace-sdk                    # core only (pydantic)
pip install llmtrace-sdk[anthropic]         # + Anthropic SDK
pip install llmtrace-sdk[openai]            # + OpenAI SDK
pip install llmtrace-sdk[webhook]           # + WebhookSink (httpx)
pip install llmtrace-sdk[otlp]             # + OTLP export (OpenTelemetry)
pip install llmtrace-sdk[otlp-grpc]        # + gRPC OTLP export
pip install llmtrace-sdk[langfuse]         # + Langfuse (uses OTLP)
pip install llmtrace-sdk[datadog]          # + Datadog (uses OTLP)
pip install llmtrace-sdk[presidio]         # + NLP-based PII detection
pip install llmtrace-sdk[all]              # everything

Combine extras as needed:

pip install llmtrace-sdk[anthropic,openai,webhook]
pip install llmtrace-sdk[anthropic,langfuse]
pip install llmtrace-sdk[openai,datadog,presidio]

Quick Start

import llmtrace

llmtrace.configure(sink="console")
llmtrace.instrument("anthropic")

That's it. Every Anthropic API call is now traced:

import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,
    messages=[{"role": "user", "content": "Explain monads in one sentence."}],
)

Console output:

[14:32:01] anthropic/claude-sonnet-4-20250514 | 1,243ms | 500→200 tokens | $0.0045 | ✓

Features

Feature Description
Auto-instrumentation Wrap provider SDKs with one call — no code changes to your LLM calls
Tool tracing @trace_tool decorator captures tool execution as part of the trace tree
Cost tracking Auto-computed per-call cost from built-in pricing registry
Typed trace events Every field is a Pydantic v2 model with full validation
Pluggable sinks Console, JSONL file, webhook, OTLP, Langfuse, Datadog, callback, or compose with MultiSink
Key redaction API keys, auth headers, and secrets stripped from traces by default
PII redaction Locale-aware pattern detection + optional Presidio NLP engine
OpenTelemetry-compatible IDs trace_id, span_id, parent_id follow OTel conventions for correct span trees
Span-based tracing Group related LLM calls with span() context manager
Error taxonomy Normalizes provider errors into categories: rate_limit, timeout, auth, etc.
Throughput metrics Auto-computed tokens/sec in trace metadata

The @trace Decorator

Trace any function that makes LLM calls. Works on both sync and async:

import llmtrace

@llmtrace.trace(provider="anthropic", tags={"team": "search"})
async def summarize(text: str) -> str:
    client = anthropic.AsyncAnthropic()
    response = await client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=500,
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.content[0].text

Nested @trace calls automatically form a parent-child hierarchy:

@llmtrace.trace()
async def inner() -> str:
    return await client.messages.create(...)

@llmtrace.trace()
async def outer() -> str:
    await inner()  # inner's parent_id → outer's span_id
    return await client.messages.create(...)

The @trace_tool Decorator

Trace tool/function executions that are called by your agent. Captures arguments, return values, latency, and errors:

import json
import llmtrace

@llmtrace.trace_tool(tags={"category": "weather"})
def get_weather(city: str) -> str:
    """Fetch current weather for a city."""
    return json.dumps({"city": city, "temp_c": 22, "condition": "sunny"})

@llmtrace.trace_tool(name="search_docs", tags={"category": "retrieval"})
async def search(query: str, limit: int = 10) -> list[str]:
    """Search the document index."""
    return await my_index.query(query, top_k=limit)

When called inside a @trace-decorated agent function, tool events automatically get the correct trace_id and parent_id:

@llmtrace.trace(provider="agent")
async def run_agent(query: str) -> str:
    response = await client.messages.create(...)  # LLM call, parent_id → agent's span_id
    result = get_weather("Paris")                 # tool call, parent_id → agent's span_id
    ...

You can also wrap a dict of tool functions in bulk:

tools = {"get_weather": get_weather_fn, "search": search_fn}
wrapped = llmtrace.instrument_tools(tools, tags={"source": "agent"})

Trace ID Conventions

llmtrace follows OpenTelemetry conventions for trace correlation:

Field Description
trace_id Shared by all events in the same trace. Created by the root span.
span_id Unique per event. Every TraceEvent gets its own span_id.
parent_id References the parent's span_id. None for root spans.

Example trace tree from an agent with tool use:

@trace("agent")              trace_id=T, span_id=A, parent_id=null
  ├── LLM call               trace_id=T, span_id=B, parent_id=A
  ├── tool: get_weather       trace_id=T, span_id=C, parent_id=A
  └── LLM call               trace_id=T, span_id=D, parent_id=A

This means traces export correctly to OTLP, Langfuse, and Datadog without broken span trees.

Sinks

ConsoleSink

Pretty-prints one-line summaries to stderr with ANSI colors.

from llmtrace.sinks import ConsoleSink

sink = ConsoleSink(colorize=True, verbose=False)

JsonFileSink

Writes JSONL with optional size-based rotation.

from llmtrace.sinks import JsonFileSink

sink = JsonFileSink("traces.jsonl", rotate_mb=50, rotate_count=5)

WebhookSink

Batched HTTP POST with retry and exponential backoff.

from llmtrace.sinks import WebhookSink

sink = WebhookSink(
    "https://my-endpoint.example.com/traces",
    headers={"Authorization": "Bearer tok_xxx"},
    batch_size=50,
)

MultiSink

Fan-out to multiple sinks simultaneously.

from llmtrace.sinks import ConsoleSink, JsonFileSink, MultiSink

sink = MultiSink([ConsoleSink(), JsonFileSink("traces.jsonl")])

CallbackSink

Call your own function for each event.

from llmtrace.sinks import CallbackSink

sink = CallbackSink(lambda event: my_database.insert(event.to_dict()))

OTLPSink

Export traces as OpenTelemetry spans to any OTLP-compliant backend.

from llmtrace.sinks import OTLPSink

sink = OTLPSink(
    endpoint="http://localhost:4318",
    service_name="my-app",
    capture_content=False,  # opt-in for request/response bodies
)

LangfuseSink

Pre-configured OTLP export to Langfuse.

from llmtrace.sinks import LangfuseSink

sink = LangfuseSink(
    public_key="pk-lf-...",
    secret_key="sk-lf-...",
    host="https://cloud.langfuse.com",  # EU cloud (default)
    # host="https://us.cloud.langfuse.com",  # US cloud
)

Or via environment variables: LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_HOST.

DatadogSink

Pre-configured OTLP export to Datadog.

Recommended: use the Datadog Agent. Datadog's direct OTLP traces intake (otlp-http-intake.*.datadoghq.com) is in preview and requires requesting access. The standard approach is to run a Datadog Agent that receives OTLP locally and forwards traces to Datadog.

from llmtrace.sinks import DatadogSink

# Via Datadog Agent (recommended) — agent runs locally, receives OTLP on port 4318
sink = DatadogSink(
    api_key="dd-...",
    endpoint="http://localhost:4318",
)

# Direct intake (requires preview access) — sends directly to Datadog
sink = DatadogSink(api_key="dd-...", site="us1")

Supported site shortcodes: us1, us3, us5, eu1, ap1, gov. These resolve to the corresponding otlp-http-intake.*.datadoghq.com URLs. Pass endpoint directly to use the Datadog Agent or any custom URL.

Or via environment variables: DD_API_KEY, DD_SITE.

Local development setup — see the examples/ directory for a Docker Compose setup with the Datadog Agent, OTLP collector, and Jaeger UI.

How it works in production

In a deployed environment, the Datadog Agent runs as a sidecar container (Kubernetes, ECS) or on each host (VMs). Your application sends OTLP traces to the agent at localhost:4318, and the agent forwards them to Datadog with enriched metadata (host tags, container info, etc.):

Your app  →  Datadog Agent (localhost:4318)  →  Datadog backend
              (adds host/container tags)

This is the same pattern used by all Datadog integrations — the agent handles buffering, retries, and authentication.

Provider Support

Provider Install extra Auto-traced methods
Anthropic [anthropic] messages.create (sync + async)
OpenAI [openai] chat.completions.create (sync + async)
Google [google] Coming soon
LiteLLM [litellm] Coming soon

Sensitive Key Redaction

By default, llmtrace strips sensitive keys (api_key, authorization, token, password, secret, credential, and variants) from request payloads in traces.

Default behavior (redaction on):

llmtrace.configure(sink="console")
# Request payload in traces:
# {"model": "claude-sonnet-4-20250514", "api_key": "[REDACTED]", "messages": [...]}

Disable redaction if you need full payloads (e.g., local debugging):

llmtrace.configure(sink="console", redact_sensitive_keys=False)
# Request payload in traces:
# {"model": "claude-sonnet-4-20250514", "api_key": "sk-ant-...", "messages": [...]}

You can also control whether request and response payloads are captured at all:

llmtrace.configure(
    sink="console",
    capture_request=False,   # traces will have request={}
    capture_response=False,  # traces will have response={}
)

Advanced Usage

Span-Based Tracing

Group related LLM calls into a span tree:

import llmtrace

async def agent_loop(query: str) -> str:
    async with llmtrace.span("agent_turn", tags={"query_type": "search"}) as turn:
        turn.annotate(user_query=query)

        plan = await plan_step(query)

        async with turn.child("tool_execution") as tool_span:
            tool_span.annotate(tool="web_search")
            results = await search(plan)

        async with turn.child("synthesis") as synth_span:
            answer = await synthesize(query, results)
            synth_span.annotate(answer_length=len(answer))

        return answer

PII Redaction

Three levels of PII protection, from zero-config to NLP-powered:

1. Automatic key redaction (on by default) — see Sensitive Key Redaction above.

2. Pattern-based redaction with locale support

from llmtrace.transform.enrichment import RedactPIIEnricher

enricher = RedactPIIEnricher(locales=("global", "intl", "eu"))
llmtrace.configure(enrichers=[enricher])

3. NLP-based redaction (names, addresses, medical terms)

enricher = RedactPIIEnricher(use_presidio=True, presidio_language="en")
llmtrace.configure(enrichers=[enricher])

Redaction strategies:

Strategy Example input Example output
REPLACE (default) john@example.com [EMAIL_REDACTED]
MASK john@example.com j***@*******.c*m
HASH john@example.com [SHA:a1b2c3d4]
from llmtrace.transform.enrichment import RedactPIIEnricher, RedactionStrategy

enricher = RedactPIIEnricher(strategy=RedactionStrategy.HASH)

Cost Tracking

Costs are auto-computed from a built-in pricing registry. Register custom models or override prices:

from decimal import Decimal
from llmtrace.pricing import ModelPricing, PricingRegistry

registry = PricingRegistry()

# Register a custom or new model
registry.register("anthropic", "claude-4-opus", ModelPricing(
    input_per_million=Decimal("15.00"),
    output_per_million=Decimal("75.00"),
))

llmtrace.configure(sink="console", pricing_registry=registry)

See the Known Limitations section for details on the built-in pricing data.

Cost and Latency Enrichers

Flag expensive calls:

from decimal import Decimal
from llmtrace.transform.enrichment import CostAlertEnricher

enricher = CostAlertEnricher(threshold_usd=Decimal("0.50"))
llmtrace.configure(enrichers=[enricher])
# Adds tags={"cost_alert": "high"} when a single call exceeds $0.50

Classify latency:

from llmtrace.transform.enrichment import LatencyClassifierEnricher

enricher = LatencyClassifierEnricher(fast_ms=500, normal_ms=2000, slow_ms=5000)
llmtrace.configure(enrichers=[enricher])
# Adds tags={"latency_class": "fast"|"normal"|"slow"|"critical"}

Custom Enrichers

An enricher is any callable that takes a TraceEvent and returns a TraceEvent:

from llmtrace.models import TraceEvent

class StripLongResponses:
    def __call__(self, event: TraceEvent) -> TraceEvent:
        if len(str(event.response)) > 10_000:
            return event.model_copy(update={"response": {"truncated": True}})
        return event

Environment Variable Configuration

Variable Type Default Description
LLMTRACE_SINK string console Sink: "console", "jsonfile:<path>", "webhook:<url>", "otlp", "otlp:<endpoint>", "langfuse", "datadog"
LLMTRACE_TAGS string (empty) Default tags: "k1=v1,k2=v2"
LLMTRACE_SAMPLE_RATE float 1.0 Trace sampling rate (0.0 to 1.0)
LLMTRACE_CAPTURE_REQUEST bool true Include full request payload in traces
LLMTRACE_CAPTURE_RESPONSE bool true Include full response payload in traces
LLMTRACE_REDACT_KEYS bool true Redact sensitive keys (api_key, auth headers) from payloads
LANGFUSE_PUBLIC_KEY string Langfuse public key (for sink="langfuse")
LANGFUSE_SECRET_KEY string Langfuse secret key (for sink="langfuse")
LANGFUSE_HOST string https://cloud.langfuse.com Langfuse host URL
DD_API_KEY string Datadog API key (for sink="datadog")
DD_SITE string us1 Datadog site identifier

Trace Event Schema

Every trace is a TraceEvent Pydantic model:

Field Type Description
trace_id UUID Shared trace identifier — same for all events in one trace
parent_id UUID | None Parent span's span_id. None for root spans
span_id UUID Unique identifier for this event
timestamp datetime UTC timestamp (auto-generated)
provider str Provider name: "anthropic", "openai", "tool", etc.
model str Model identifier (or tool name for tool events)
request dict Request payload (empty if capture_request=False)
response dict Response payload (empty if capture_response=False)
token_usage TokenUsage | None Prompt, completion, total, and cache token counts
cost Cost | None Input, output, and total cost as Decimal
latency_ms float Wall-clock latency in milliseconds
tool_calls list[ToolCallTrace] Tool calls extracted from the LLM response
error ErrorTrace | None Error type, message, retryability, stack trace
tags dict[str, str] String key-value tags for filtering
metadata dict[str, Any] Arbitrary metadata from enrichers or user code

Serialize with event.to_json() or event.to_dict().

Examples

The examples/ directory contains runnable scripts demonstrating common setups:

Example Description
quickstart.py Minimal setup — configure, instrument, call
agent_tracing.py Agent loop with @trace, @trace_tool, and auto parent-child hierarchy
production_setup.py MultiSink, PII redaction, cost alerts, latency classification
otlp_tracing.py OTLP export to a local collector
langfuse_tracing.py Langfuse integration
datadog_tracing.py Datadog integration
real_test_otlp_datadog.py End-to-end: real LLM call → OTLP collector + Jaeger + Datadog Agent

Each example supports both Anthropic and OpenAI — set whichever API key you have available.

Local tracing infrastructure

The examples/ directory includes a Docker Compose setup with an OTLP collector, Jaeger UI, and an optional Datadog Agent. The real_test_otlp_datadog.py script makes a real Anthropic API call with tool use and sends traces to all configured backends.

Architecture

Your app
  ├── OTLPSink  → localhost:4318 → OTLP Collector → Jaeger (UI on :16686)
  └── DatadogSink → localhost:4319 → Datadog Agent → Datadog cloud

The OTLP collector receives traces and fans them out to Jaeger (for local viewing) and a debug exporter (logs every span to stdout). The Datadog Agent is a separate service that receives OTLP on a different port and forwards traces to Datadog.

1. Start the infrastructure

cd examples

# Collector + Jaeger only
docker compose up -d

# With Datadog Agent (pass your API key)
DD_API_KEY=your-key docker compose --profile datadog up -d

The Datadog Agent is behind a datadog profile so it only starts when explicitly requested and doesn't fail when DD_API_KEY is unset.

2. Run the end-to-end test

export ANTHROPIC_API_KEY="sk-ant-..."
uv run python real_test_otlp_datadog.py

The script reads all configuration from environment variables:

Variable Default Description
ANTHROPIC_API_KEY (required) Anthropic API key
DD_API_KEY (empty, skips Datadog) Datadog API key
DD_SERVICE llmtrace-test Service name in Datadog/OTLP
DD_ENV development Environment tag
OTLP_ENDPOINT http://localhost:4318 OTLP collector endpoint
DD_AGENT_ENDPOINT http://localhost:4319 Datadog Agent OTLP endpoint

3. View traces

Service Port URL
Jaeger UI 16686 http://localhost:16686
OTLP collector (HTTP) 4318 http://localhost:4318
OTLP collector (gRPC) 4317 http://localhost:4317
Datadog Agent (OTLP) 4319 http://localhost:4319 (profile: datadog)
  • Jaeger: open http://localhost:16686, select the service name from the dropdown, click "Find Traces"
  • Datadog: open your Datadog dashboard → APM → Traces, filter by service:llmtrace-test
  • Collector debug logs: docker compose logs -f otel-collector — every span is logged with full attributes

4. Configuration files

The Docker setup uses two configuration files in examples/:

docker-compose.yaml — three services:

  • otel-collector — receives OTLP on :4318/:4317, exports to Jaeger and debug logs
  • jaeger — trace storage and UI on :16686
  • datadog-agent — receives OTLP on :4319, forwards to Datadog (profile: datadog)

otel-collector-config.yaml — collector pipeline:

  • Receives OTLP via gRPC and HTTP
  • Batches spans (5s timeout)
  • Exports to Jaeger (OTLP HTTP) and debug (stdout with full detail)

5. Stop

# Without Datadog
docker compose down

# With Datadog
docker compose --profile datadog down

Datadog: direct intake vs. agent

Datadog's direct OTLP traces intake (otlp-http-intake.*.datadoghq.com) is in preview and requires requesting access. The recommended approach is to run a Datadog Agent that receives OTLP locally and forwards to Datadog. This is the same pattern used in production — the agent runs as a sidecar (Kubernetes, ECS) or on each host (VMs) and handles buffering, retries, host tags, and authentication.

Known Limitations

Cost tracking uses a hard-coded pricing registry. The built-in registry covers popular models (Claude Sonnet/Opus/Haiku, GPT-4o/4o-mini, o1, o3-mini, Gemini 2.0 Flash/Pro) but prices may become stale as providers update them. For new or custom models, register pricing manually via PricingRegistry.register(). If a model is not in the registry, cost will be None rather than wrong.

Error taxonomy is regex-based with a fixed set of categories. Errors are classified into rate_limit, timeout, auth, context_length, content_filter, server, invalid_request, or unknown using pattern matching against the error message and type. Provider-specific error codes or nuanced categories are not yet supported — unrecognized errors fall through to unknown.

Google and LiteLLM providers are not yet instrumented. The extractor and instrumentor skeletons exist, but auto-instrumentation is not wired up. Contributions welcome.

No streaming support. Tracing currently captures complete request/response cycles. Streaming responses (SSE, async iterators) are not traced. The wrapper sees the initial call but not streamed chunks.

Contributing

git clone https://github.com/nablaux/llmtrace && cd llmtrace
uv sync --all-extras
uv run pre-commit install

Pre-commit hooks run automatically on every commit: ruff lint + format, yaml validation, trailing whitespace cleanup, private key detection, and mypy strict type checking.

To run checks manually:

uv run ruff check src/ tests/
uv run ruff format src/ tests/
uv run mypy src/llmtrace/ --strict
uv run pytest tests/ -v --cov=src/llmtrace --cov-report=term-missing

License

Apache 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llmtrace_sdk-0.1.0.tar.gz (297.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llmtrace_sdk-0.1.0-py3-none-any.whl (51.1 kB view details)

Uploaded Python 3

File details

Details for the file llmtrace_sdk-0.1.0.tar.gz.

File metadata

  • Download URL: llmtrace_sdk-0.1.0.tar.gz
  • Upload date:
  • Size: 297.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.4 {"installer":{"name":"uv","version":"0.10.4","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for llmtrace_sdk-0.1.0.tar.gz
Algorithm Hash digest
SHA256 d407b632605c307f29d6c4a92e313e139f1dc3ebc540b1df8d249695542e18af
MD5 ef4cda13e8b7b04315a774cfea3955a0
BLAKE2b-256 e67f9d8431d46e82a4e98a69a80a503c385fbf2f5710cae41cce4f4a9c04bb11

See more details on using hashes here.

File details

Details for the file llmtrace_sdk-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llmtrace_sdk-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 51.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.4 {"installer":{"name":"uv","version":"0.10.4","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for llmtrace_sdk-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fcecbc13e27fb2888976727c21df22079ae9ed4472419a9928676437e7e383f2
MD5 750ed13ae8ccf0eafe0e6a0a33d40776
BLAKE2b-256 9aa2a4f7116d3cd8b9a2ac0f923b14300714676de43e80fa45537c8328baf758

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page