The OpenTelemetry for AI agents — structured traces, semantic diffs, and failure pattern mining

Agent Observatory 🔭

Overview

Developing agentic AI systems is fundamentally different from building traditional, deterministic software. When an agent fails, goes off track, or loops incessantly, it's often a "cognitive" failure happening inside a black box.

Developers are left struggling to answer:

  • Why did the agent choose this specific tool?
  • What was the exact prompt context and raw JSON output at step 5 before the failure?
  • If I tweak this system prompt, how does the agent's execution path compare with the old version?

Relying on basic logs and print() statements to debug multi-step reasoning loops wastes tokens, time, and developer sanity.

The Solution

Agent Observatory is a production-grade cognitive debugging, tracing, and reliability platform specifically built for complex agentic AI systems.

It acts as an "X-ray" for your agents, providing zero-friction auto-instrumentation, intelligent failure mining, and a real-time visual dashboard that bring structural correctness and transparency back into your development lifecycle. Everything runs locally and offline.

Features

  • 🔋 Zero-Friction Auto-Instrumentation: Drop-in tracing support for major frameworks (OpenAI, Anthropic, CrewAI, Agno). Get full visibility into LLM I/O and tool usage without polluting your core business logic with telemetry code.
  • 🔬 Trace Diff Engine: Compare execution paths (structurally, not just textually) between two different agent runs. Definitively catch prompt regressions and verify whether a model change altered the agent's logical path.
  • 🕵️ Discriminative Failure Miner: Intelligent loop detection and semantic deduplication to actively catch when agents get stuck in infinite loops, recursive tool failures, or hallucination spirals.
  • 💻 Real-Time Local Dashboard: A clean, offline-first visualization dashboard (running on localhost:7421) that graphs hierarchical agent reasoning paths in real-time. Keep your proprietary prompts and logs safe without sending them to third-party SaaS tools.
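
To make the loop-detection idea concrete, here is a minimal sketch of one way repeated tool calls could be flagged. It is not the library's algorithm: the step format (`tool`, `args`), the `find_repeated_calls` helper, and the threshold are all assumptions made for illustration, and it matches calls exactly rather than semantically.

```python
import json
from collections import Counter


def find_repeated_calls(steps, threshold=3):
    """Return (tool, serialized args) pairs seen at least `threshold` times.

    Hypothetical helper: exact-match counting only, no semantic deduplication.
    """
    counts = Counter(
        # Sorting keys makes logically equal argument dicts serialize identically
        (step["tool"], json.dumps(step["args"], sort_keys=True))
        for step in steps
    )
    return [signature for signature, n in counts.items() if n >= threshold]


# Hypothetical trace: the agent keeps retrying the same search
steps = [
    {"tool": "search", "args": {"q": "NVDA price"}},
    {"tool": "search", "args": {"q": "NVDA price"}},
    {"tool": "calculator", "args": {"expr": "2+2"}},
    {"tool": "search", "args": {"q": "NVDA price"}},
]

loops = find_repeated_calls(steps)
print(loops)
```

A real miner would likely also consider ordering and near-duplicate arguments; this sketch only shows the counting step.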

Getting Started

1. Installation

Install the package directly into your Python environment:

pip install open-agent-observatory

2. Basic Usage (Universal Auto-Instrumentation)

Enabling the observatory takes a single call to obs.instrument(). It automatically detects and patches installed frameworks such as Agno, OpenAI, or CrewAI.

Agno Example:

import agent_observatory as obs
# Automatically detects Agno and instruments the Agent and all tools globally!
obs.instrument()

from agno.agent import Agent
from agno.tools.yfinance import YFinanceTools

agent = Agent(
    name="Finance Agent",
    tools=[YFinanceTools()]
)

# From here, all cognitive steps, agent reasoning, and tool calls are traced
agent.print_response("What is NVDA trading at?")

OpenAI Example:

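import agent_observatory as obs
obs.instrument()

from openai import OpenAI

# openai>=1.0 client API (openai.ChatCompletion was removed in 1.0);
# the client reads OPENAI_API_KEY from the environment
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Analyze my data..."}]
)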
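
For readers curious how drop-in patching can work in general, the following is a rough sketch of wrapping a method at runtime so that every call is recorded. It does not show Agent Observatory's internals: `TRACE_EVENTS`, `traced`, and `FakeClient` are hypothetical names used only for this illustration.

```python
import functools
import time

TRACE_EVENTS = []  # hypothetical in-memory sink for recorded calls


def traced(name):
    """Wrap a callable so each call appends a timing event to TRACE_EVENTS."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                # Record the failure, then re-raise so behavior is unchanged
                TRACE_EVENTS.append({"name": name, "ok": False, "error": repr(exc)})
                raise
            TRACE_EVENTS.append(
                {"name": name, "ok": True, "seconds": time.perf_counter() - start}
            )
            return result
        return wrapper
    return decorator


class FakeClient:
    """Stand-in for a framework client; not a real SDK class."""

    def complete(self, prompt):
        return prompt.upper()


# "Instrumenting" replaces the method on the class, so calling code is untouched
FakeClient.complete = traced("FakeClient.complete")(FakeClient.complete)

reply = FakeClient().complete("hello")
print(reply, TRACE_EVENTS[0]["name"])
```

Because the wrapper is installed on the class, application code keeps calling `complete` as before, which is the property the "without polluting your core business logic" claim relies on.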

3. Launch the Local Dashboard

Start the real-time visualization server in a separate terminal while you build and test your agents:

python -m agent_observatory.cli.main serve --port 7421

Open http://localhost:7421 in your browser to inspect traces.

4. Trace Diffing (Regression Testing)

Ensure stability across prompt versions programmatically:

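from agent_observatory.diff import engine

# run_id_feature_v1 and run_id_feature_v2 are placeholders for the IDs of two
# previously recorded runs (for example, before and after a prompt change)
# Compare the two runs structurally to check that the reasoning path is stable
diff_report = engine.compare_traces(run_id_feature_v1, run_id_feature_v2)

if diff_report.has_structural_changes:
    print(f"Warning: Agent logic has drifted! {diff_report.summary}")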
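
The difference between a structural and a textual comparison can be sketched with the standard library alone. The example below is an assumption-laden illustration, not the Trace Diff Engine itself: the trace shape (`kind`, `name`, `text`) and the helpers `structural_signature` and `structural_diff` are hypothetical. It reduces each step to its kind and name, ignores free-form text, and diffs the resulting sequences.

```python
from difflib import SequenceMatcher


def structural_signature(trace):
    """Keep only what each step did (kind and name), dropping free-form text."""
    return [(step["kind"], step["name"]) for step in trace]


def structural_diff(old_trace, new_trace):
    """Return the non-equal opcodes between the two step sequences."""
    a = structural_signature(old_trace)
    b = structural_signature(new_trace)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical traces: the wording differs everywhere, but only one step was added
old_run = [
    {"kind": "llm", "name": "plan", "text": "I will look up the price."},
    {"kind": "tool", "name": "get_price", "text": "NVDA"},
    {"kind": "llm", "name": "answer", "text": "Here is the price."},
]
new_run = [
    {"kind": "llm", "name": "plan", "text": "Let me search first, then check."},
    {"kind": "tool", "name": "search_web", "text": "NVDA news"},
    {"kind": "tool", "name": "get_price", "text": "NVDA"},
    {"kind": "llm", "name": "answer", "text": "The current price is shown."},
]

changes = structural_diff(old_run, new_run)
print(changes)
```

A plain text diff of these two runs would flag almost every line, while the structural view reports only the extra `search_web` tool call.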

Architecture Map

  • agent_observatory.auto — Drop-in instrumentation patches.
  • agent_observatory.core — The robust atomic event tracer.
  • agent_observatory.diff — The structural Trace Diff Engine.
  • agent_observatory.analytics — Failure Miner and loop detection algorithms.
  • agent_observatory.store — SQLite-backed atomic transaction storage.
  • dashboard/ — The raw HTML/JS/CSS visualization layer.
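
As an illustration of what SQLite-backed atomic storage can look like, here is a small sketch using Python's built-in sqlite3 module. The schema and the `open_store` and `append_events` helpers are assumptions for this example and do not describe the actual agent_observatory.store module. The point it shows is that a batch of events is written in one transaction, so a failure part-way through leaves nothing behind.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical event store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY, run_id TEXT NOT NULL, "
        "step INTEGER NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def append_events(conn, run_id, payloads):
    """Write all events for a run in a single transaction."""
    # Used as a context manager, the connection commits on success
    # and rolls back if any statement or serialization step raises.
    with conn:
        for step, payload in enumerate(payloads):
            conn.execute(
                "INSERT INTO events (run_id, step, payload) VALUES (?, ?, ?)",
                (run_id, step, json.dumps(payload)),
            )


def count_events(conn, run_id):
    row = conn.execute(
        "SELECT COUNT(*) FROM events WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row[0]


conn = open_store()
append_events(conn, "run-1", [{"tool": "search"}, {"tool": "answer"}])

try:
    # The second payload is a set, which json.dumps cannot serialize
    append_events(conn, "run-2", [{"tool": "search"}, {1, 2}])
except TypeError:
    pass

print(count_events(conn, "run-1"), count_events(conn, "run-2"))
```

The first event of `run-2` is inserted before the error occurs, yet the rollback removes it, so a reader of the store never sees a half-written run.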

Download files

Source Distribution

open_agent_observatory-0.1.1.tar.gz (65.1 kB)

Uploaded Source

Built Distribution

open_agent_observatory-0.1.1-py3-none-any.whl (69.6 kB)

Uploaded Python 3

File details

Details for the file open_agent_observatory-0.1.1.tar.gz.

File metadata

  • Download URL: open_agent_observatory-0.1.1.tar.gz
  • Upload date:
  • Size: 65.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for open_agent_observatory-0.1.1.tar.gz

  • SHA256: ceae06dce7aa43525de9c56e50b55c44eafb0d10ec3ac067f244947fb06d099b
  • MD5: 57da7101f136325163fdec7592d83583
  • BLAKE2b-256: 56d49f1a818566af05eec06d3335100ebd0fc4bba99d86bff0aa0f9f6dfb2408

File details

Details for the file open_agent_observatory-0.1.1-py3-none-any.whl.

File hashes

Hashes for open_agent_observatory-0.1.1-py3-none-any.whl

  • SHA256: 190e8f0e475544c933d90872bcf1c2d61bf89f003656701a61de483342fce231
  • MD5: 7180f78496a6e014fd39009cbceacf31
  • BLAKE2b-256: 611228c1bdeb250c84ccf8f3abad969330fc0a0c760423fdec358a4c45836bfe
