Behavioral observability for AI agents

Dunetrace SDK

Runtime observability for AI agents. Detects tool loops, context bloat, prompt injection, and 12 other failure patterns in real time, and sends a Slack alert while the run is still live.

Zero external dependencies.

Install

pip install dunetrace                    # core SDK
pip install 'dunetrace[langchain]'       # + LangChain / LangGraph
pip install 'dunetrace[otel]'            # + OpenTelemetry exporter

Quickstart

LangChain / LangGraph

from dunetrace import Dunetrace
from dunetrace.integrations.langchain import DunetraceCallbackHandler

dt = Dunetrace()
callback = DunetraceCallbackHandler(dt, agent_id="my-agent")

result = agent.invoke(input, config={"callbacks": [callback]})
dt.shutdown()

Pure Python / custom agent — decorator style

from dunetrace import Dunetrace

dt = Dunetrace()

@dt.tool                                  # auto-emits tool.called / tool.responded
def web_search(query: str) -> list: ...   # args are SHA-256 hashed, never transmitted raw

@dt.trace                                 # agent_id defaults to "my_agent"
def my_agent(question: str) -> str:
    return web_search(question)[0]        # zero SDK calls needed inside function bodies

@dt.trace supports bare usage (@dt.trace with no parens), explicit agent ID (@dt.trace("research-agent")), and keyword args (@dt.trace(model="gpt-4o")). @dt.tool works on both sync and async functions and is a no-op when called outside a run context.
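
The decorator behaviour described above can be illustrated with a small standalone sketch. This is not the Dunetrace implementation: `trace`, `tool`, `hash_args`, `_current_run`, and the event dictionaries are hypothetical stand-ins, and the JSON encoding used before hashing is an assumption. The sketch shows one way a decorator can accept the bare, positional, and keyword forms, hash arguments with SHA-256 instead of storing them, wrap both sync and async functions, and do nothing outside a run context.

```python
import contextvars
import functools
import hashlib
import inspect
import json

# Hypothetical run context: the active run's event list, or None outside a run.
_current_run = contextvars.ContextVar("current_run", default=None)


def hash_args(args, kwargs):
    """SHA-256 hex digest of the call arguments (the JSON encoding is an assumption)."""
    payload = json.dumps({"args": args, "kwargs": kwargs}, sort_keys=True, default=repr)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def tool(fn):
    """Record tool.called / tool.responded events; pass through when no run is active."""
    def record(kind, **fields):
        events = _current_run.get()
        if events is not None:  # no-op outside a run context
            events.append({"type": kind, "tool": fn.__name__, **fields})

    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            record("tool.called", args_hash=hash_args(args, kwargs))
            result = await fn(*args, **kwargs)
            record("tool.responded")
            return result
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        record("tool.called", args_hash=hash_args(args, kwargs))
        result = fn(*args, **kwargs)
        record("tool.responded")
        return result
    return sync_wrapper


def trace(arg=None, **options):
    """Support @trace, @trace("agent-id"), and @trace(model="...") on sync functions."""
    def decorate(fn, agent_id=None):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            events = []
            token = _current_run.set(events)
            try:
                return fn(*args, **kwargs)
            finally:
                _current_run.reset(token)
                # Keep the finished run on the wrapper so it can be inspected.
                wrapper.last_run = {
                    "agent_id": agent_id or fn.__name__,
                    "options": options,
                    "events": events,
                }
        return wrapper

    if callable(arg):  # bare usage: @trace
        return decorate(arg)
    return lambda fn: decorate(fn, agent_id=arg)  # @trace("id") or @trace(model=...)


@tool
def web_search(query):
    return [f"result for {query}"]


@trace
def my_agent(question):
    return web_search(question)[0]


@trace("research-agent", model="gpt-4o")
def researcher(question):
    return web_search(question)[0]


answer = my_agent("dunes")
outside = web_search("no run active")  # works, records nothing
```

After `my_agent("dunes")` returns, `my_agent.last_run["events"]` holds two events, and the first carries a 64-character hex digest rather than the query text.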

Or with @dt.agent + auto-instrumentation:

dt.init(agent_id="my-agent")   # patches openai, anthropic, httpx, requests globally

@dt.agent(model="gpt-4o")      # agent_id inherited from init()
def run_agent(query: str) -> str:
    return openai_client.chat.completions.create(...).choices[0].message.content

FastAPI and Flask integrations take one line each; see docs/integrations.md.

What it detects

| Detector | What it catches | Severity |
| --- | --- | --- |
| TOOL_LOOP | Same tool called 3+ times in a 5-call window | HIGH |
| TOOL_THRASHING | Agent alternates between exactly two tools | HIGH |
| RETRY_STORM | Same tool fails 3+ times in a row | HIGH |
| LLM_TRUNCATION_LOOP | finish_reason=length fires 2+ times | HIGH |
| EMPTY_LLM_RESPONSE | Zero-length output with finish_reason=stop | HIGH |
| CASCADING_TOOL_FAILURE | 3+ consecutive failures across 2+ distinct tools | HIGH |
| SLOW_STEP | Tool call >15s or LLM call >30s | MEDIUM/HIGH |
| TOOL_AVOIDANCE | Final answer without using available tools | MEDIUM |
| GOAL_ABANDONMENT | Tool use stops, then 4+ consecutive LLM calls with no exit | MEDIUM |
| CONTEXT_BLOAT | Prompt tokens grow 3× from first to last LLM call | MEDIUM |
| STEP_COUNT_INFLATION | Run used >2× the P75 step count for this agent | MEDIUM |
| FIRST_STEP_FAILURE | Error or empty output at step ≤2 | MEDIUM |
| REASONING_STALL | LLM:tool-call ratio ≥4× (reasoning without acting) | MEDIUM |
| RAG_EMPTY_RETRIEVAL | Retrieval returned 0 results but agent answered anyway | MEDIUM |
| PROMPT_INJECTION_SIGNAL | Input matches known injection / jailbreak patterns | CRITICAL |
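
To make the thresholds in the table concrete, here is a minimal sketch of three of the rules (TOOL_LOOP, RETRY_STORM, CONTEXT_BLOAT) as plain functions. It is an illustration under assumptions, not the SDK's detector code: the function names and input shapes are hypothetical, and "same tool" is taken to mean the same tool name, since the table does not say whether arguments are compared.

```python
from collections import Counter


def tool_loop(tool_calls, window=5, threshold=3):
    """True if one tool name appears `threshold`+ times in any `window` consecutive calls."""
    for start in range(max(1, len(tool_calls) - window + 1)):
        counts = Counter(tool_calls[start:start + window])
        if counts and max(counts.values()) >= threshold:
            return True
    return False


def retry_storm(results, threshold=3):
    """results: list of (tool_name, ok). True if one tool fails `threshold`+ times in a row."""
    streak_tool, streak = None, 0
    for name, ok in results:
        if ok:
            streak_tool, streak = None, 0
        elif name == streak_tool:
            streak += 1
        else:
            streak_tool, streak = name, 1
        if streak >= threshold:
            return True
    return False


def context_bloat(prompt_tokens, factor=3.0):
    """True if the last LLM call's prompt is at least `factor` times the first call's."""
    if len(prompt_tokens) < 2 or prompt_tokens[0] <= 0:
        return False
    return prompt_tokens[-1] >= factor * prompt_tokens[0]


looping = tool_loop(["search", "fetch", "search", "parse", "search"])
spread_out = tool_loop(["a", "b", "c", "a", "d", "e", "a"])
storm = retry_storm([("search", False), ("search", False), ("search", False)])
mixed = retry_storm([("search", False), ("fetch", False), ("search", False)])
bloated = context_bloat([1000, 1800, 3200])
```

Here `looping`, `storm`, and `bloated` are true, while `spread_out` and `mixed` are false: three calls to `a` spread over seven calls never fall inside one 5-call window, and failures that alternate between tools do not form a streak for a single tool.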

Output modes

| Mode | How to enable | Destination |
| --- | --- | --- |
| HTTP ingest (default) | endpoint="http://…" | Dunetrace backend → detection, alerts, dashboard |
| Loki NDJSON | emit_as_json=True | stdout → Promtail / Grafana Alloy |
| OpenTelemetry | otel_exporter=DunetraceOTelExporter(provider) | Tempo, Honeycomb, Datadog, Jaeger |
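
For readers unfamiliar with NDJSON, the sketch below shows the general shape of newline-delimited JSON that a log shipper such as Promtail can tail: one complete JSON object per line. The `emit_ndjson` helper and the event field names are hypothetical and chosen for illustration; the actual fields emitted with `emit_as_json=True` may differ.

```python
import io
import json
import sys


def emit_ndjson(event, stream=None):
    """Write one event as a single compact JSON line (NDJSON)."""
    stream = stream if stream is not None else sys.stdout
    stream.write(json.dumps(event, separators=(",", ":"), sort_keys=True) + "\n")
    stream.flush()


# Write to an in-memory buffer here; in practice the stream would be stdout.
buffer = io.StringIO()
emit_ndjson({"event": "tool.called", "agent_id": "my-agent", "tool": "web_search"}, buffer)
emit_ndjson({"event": "tool.responded", "agent_id": "my-agent", "tool": "web_search"}, buffer)

lines = buffer.getvalue().splitlines()
parsed = [json.loads(line) for line in lines]
```

Because each line parses on its own, a collector can process events as they arrive without waiting for the run to finish.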

Backend

git clone https://github.com/dunetrace/dunetrace
cd dunetrace && cp .env.example .env && docker compose up -d

Dashboard → http://localhost:3000 · Ingest → http://localhost:8001

Deploy markers

Annotate the detector timeline with release boundaries so you can correlate failure spikes with deploys:

# Call from your deploy script, CI/CD pipeline, or app startup
dt.mark_deploy("my-agent", version="v1.4.2", commit="abc1234", env="production")

The dashboard renders a blue dashed vertical line at each deploy timestamp on the 30-day detector rate chart. The call is fire-and-forget: it runs on a background thread and never blocks the caller.

Additional keyword arguments are stored as meta and shown on hover.
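
The fire-and-forget pattern can be sketched with a queue and a daemon thread. This is a generic illustration, not how `mark_deploy` is implemented: `BackgroundSender` and its `send` callback are hypothetical, and a real sender would make an HTTP request where this one appends to a list.

```python
import queue
import threading


class BackgroundSender:
    """Hand payloads to a worker thread so the caller never waits on delivery."""

    def __init__(self, send):
        self._send = send
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, payload):
        """Enqueue and return immediately."""
        self._queue.put_nowait(payload)

    def _drain(self):
        while True:
            payload = self._queue.get()
            if payload is None:  # sentinel: stop the worker
                return
            try:
                self._send(payload)
            except Exception:
                pass  # fire-and-forget: delivery errors never reach the caller

    def shutdown(self):
        """Flush queued payloads, then stop the worker."""
        self._queue.put(None)
        self._worker.join()


delivered = []
sender = BackgroundSender(delivered.append)
sender.submit({"agent_id": "my-agent", "version": "v1.4.2", "commit": "abc1234", "env": "production"})
sender.shutdown()
```

The trade-off of this design is that a failed delivery is silent, which suits annotations like deploy markers where losing one is preferable to slowing or crashing a deploy script.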

Policies

Policies are runtime guardrails that fire mid-run, before a failure propagates. Each policy pairs a condition on any supported trigger with a stop, switch_model, inject_prompt, or log action.

from dunetrace import Dunetrace

dt = Dunetrace()

# Stop the run if tool call count exceeds 5
dt.add_policy(
    name="cap tool calls",
    condition={"trigger": "tool_call_count", "operator": "gt", "value": 5},
    action={"type": "stop"},
)

# Downgrade model when cost exceeds $0.50
dt.add_policy(
    name="cost cap",
    condition={"trigger": "cost_usd", "operator": "gt", "value": 0.50},
    action={"type": "switch_model", "params": {"model": "gpt-4o-mini"}},
)

# Inject a corrective prompt when a loop is detected
dt.add_policy(
    name="loop fix",
    condition={"trigger": "signal", "operator": "eq", "value": "TOOL_LOOP"},
    action={"type": "inject_prompt", "params": {"prompt": "Stop repeating tool calls. Summarise what you know and answer."}},
)

with dt.run("my-agent", user_input=query, tools=["search"]) as run:
    ...
    # After a stop policy fires, PolicyViolation is raised
    # After switch_model fires, check run.model_override
    # After inject_prompt fires, check run.pop_prompt_addition()
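
As a rough mental model of how the condition dictionaries above could be evaluated, the following sketch checks each policy against a snapshot of run state. It is an assumption-laden illustration, not the SDK's evaluator: `condition_met`, `fired_actions`, and the `run_state` shape are hypothetical, only the `gt` and `eq` operators seen in the examples are handled, and a `signal` trigger is treated as "this detector has fired during the run".

```python
import operator

# Only the operators that appear in the examples above; others are not assumed.
OPERATORS = {"gt": operator.gt, "eq": operator.eq}


def condition_met(condition, run_state):
    """Return True if the condition holds for the given run-state snapshot."""
    trigger = condition["trigger"]
    if trigger == "signal":
        # Assumption: "eq" on a signal means that detector has fired in this run.
        return condition["value"] in run_state.get("signals", ())
    if trigger not in run_state:
        return False
    compare = OPERATORS[condition["operator"]]
    return compare(run_state[trigger], condition["value"])


def fired_actions(policies, run_state):
    """Return (name, action) for every policy whose condition is met."""
    return [(p["name"], p["action"]) for p in policies if condition_met(p["condition"], run_state)]


policies = [
    {"name": "cap tool calls",
     "condition": {"trigger": "tool_call_count", "operator": "gt", "value": 5},
     "action": {"type": "stop"}},
    {"name": "cost cap",
     "condition": {"trigger": "cost_usd", "operator": "gt", "value": 0.50},
     "action": {"type": "switch_model", "params": {"model": "gpt-4o-mini"}}},
    {"name": "loop fix",
     "condition": {"trigger": "signal", "operator": "eq", "value": "TOOL_LOOP"},
     "action": {"type": "inject_prompt", "params": {"prompt": "Stop repeating tool calls."}}},
]

run_state = {"tool_call_count": 6, "cost_usd": 0.12, "signals": ["TOOL_LOOP"]}
fired = fired_actions(policies, run_state)
fired_names = [name for name, _ in fired]
```

With six tool calls, $0.12 spent, and a TOOL_LOOP signal, the "cap tool calls" and "loop fix" policies fire while "cost cap" does not.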

Policies can also be defined in the dashboard and fetched automatically at run start (60-second TTL cache per agent). See docs/integrations.md for the full reference.
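
A per-agent cache with a 60-second time-to-live can be sketched as follows. This is a generic TTL cache under stated assumptions, not the SDK's code: `TTLCache`, `fetch_policies`, and the injectable `clock` are hypothetical names, and the fetch function stands in for whatever call retrieves policies from the dashboard.

```python
import time


class TTLCache:
    """Cache one fetched value per key and refetch once it is older than the TTL."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (fetched_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, self._fetch(key))
            self._entries[key] = entry
        return entry[1]


fetch_log = []
fake_now = [0.0]  # controllable clock so the example does not need to sleep


def fetch_policies(agent_id):
    fetch_log.append(agent_id)
    return [{"name": "cap tool calls"}]


cache = TTLCache(fetch_policies, ttl_seconds=60.0, clock=lambda: fake_now[0])
cache.get("my-agent")   # first run: fetches
cache.get("my-agent")   # within 60 s: served from cache
fake_now[0] = 61.0
cache.get("my-agent")   # TTL expired: fetches again
```

Three lookups result in two fetches, so under this model a policy edited in the dashboard would reach new runs within about a minute.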

Tests

python -m unittest discover -s tests -v

307 tests, no network required.
