agent-observe

Framework-agnostic observability, audit, and eval for AI agent applications.

What is this?

agent-observe is a lightweight runtime layer that wraps your AI agent code to capture:

Which tools were called, and when
Which LLM calls were made, and how long they took
Policy violations (blocked operations)
Risk scores based on behavioral signals

It is designed to be enterprise-safe by default: it stores only metadata (hashes, sizes, timings), not raw content.

Installation

pip install agent-observe

# With PostgreSQL support (quotes keep shells such as zsh from expanding the brackets)
pip install "agent-observe[postgres]"

# With viewer UI
pip install "agent-observe[viewer]"

Quick Start

import openai
import requests

from agent_observe import observe, tool, model_call

# Initialize (zero-config)
observe.install()

# Wrap your tools
@tool(name="search", kind="http")
def search_web(query: str) -> list:
    # Pass the query via params so it is URL-encoded correctly
    return requests.get("https://api.search.com", params={"q": query}).json()

# Wrap your LLM calls
@model_call(provider="openai", model="gpt-4")
def call_llm(prompt: str) -> str:
    # Arguments are elided here; supply model= and messages= for a real call
    return openai.chat.completions.create(...).choices[0].message.content

# Run your agent
with observe.run("my-agent", task={"goal": "Research AI"}):
    results = search_web("AI agents")
    analysis = call_llm(f"Analyze: {results}")

View results:

agent-observe view
# Open http://localhost:8765
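
To make the idea of wrapping a tool concrete, here is a minimal sketch of a decorator that records timing and outcome for each call. It is not agent-observe's implementation: `traced_tool`, `SPANS`, and the recorded field names are hypothetical and exist only for illustration.

```python
import functools
import hashlib
import time

# Hypothetical in-memory span log; agent-observe's real storage is not shown here.
SPANS = []


def traced_tool(name, kind):
    """Illustrative stand-in for a tool-wrapping decorator (not the library's code)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            ok = False
            try:
                result = fn(*args, **kwargs)
                ok = True
                return result
            finally:
                # Runs on success and on exception, so failed calls are recorded too.
                SPANS.append({
                    "name": name,
                    "kind": kind,
                    "ok": ok,
                    "duration_ms": (time.perf_counter() - start) * 1000.0,
                    # Only a hash of the arguments is kept, never the arguments themselves.
                    "args_hash": hashlib.sha256(repr((args, kwargs)).encode()).hexdigest(),
                })
        return wrapper
    return decorator


@traced_tool(name="add", kind="local")
def add(a, b):
    return a + b
```

Calling `add(2, 3)` returns 5 as usual and appends one record to `SPANS`; the wrapped function's behavior is unchanged.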

Documentation

Document	Description
Examples	Runnable code examples (basic usage, async, policies)
Data Model	What are Runs, Spans, Events, and Replay Cache?
Capture Modes	What data is stored? Hashes vs full content
Configuration	Environment variables and Config options
Usage Guide	Policies, risk scoring, querying, real-world examples
Integration Guide	How to integrate with OpenAI, Anthropic, LangChain, etc.

Key Concepts

Runs, Spans, and Events

┌─────────────────────────────────────────────────────────────┐
│                        observe.run()                         │
│                           (Run)                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │ @tool       │  │ @model_call │  │ emit_event  │          │
│  │  (Span)     │  │   (Span)    │  │  (Event)    │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
└─────────────────────────────────────────────────────────────┘

Run = One agent execution (start to finish)
Span = One tool or model call within a run
Event = Custom occurrence you emit

See Data Model for details.
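
The relationship between the three concepts can be sketched with plain dataclasses. This is an assumed shape for illustration only; the class and field names below are hypothetical and are not taken from the library's Data Model document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    name: str
    kind: str           # "tool" or "model" in this sketch
    duration_ms: float


@dataclass
class Event:
    name: str


@dataclass
class Run:
    agent: str
    spans: List[Span] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)

    def total_duration_ms(self) -> float:
        # Sum of span durations; assumes spans do not overlap in time.
        return sum(span.duration_ms for span in self.spans)


# One run containing a tool span, a model span, and a custom event.
run = Run(agent="my-agent")
run.spans.append(Span("search", "tool", 120.0))
run.spans.append(Span("gpt-4", "model", 850.0))
run.events.append(Event("user_feedback"))
```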

Capture Modes

Mode	What's Stored	Use Case
`metadata_only`	Hashes, timings	Production (default)
`evidence_only`	Small content + hashes	Debugging
`full`	Everything	Development

The default is metadata_only, which stores no raw content and so keeps PII out of storage.

See Capture Modes for details.
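
As a rough illustration of the difference between the modes, the sketch below builds the record that each mode might keep for one payload. The `capture` function, its field names, and the 64-byte `evidence_limit` threshold are assumptions; the source does not say how small "small content" is.

```python
import hashlib


def capture(content: str, mode: str, evidence_limit: int = 64) -> dict:
    """Return the record a given mode would keep (illustrative, not the library's logic)."""
    if mode not in ("metadata_only", "evidence_only", "full"):
        raise ValueError(f"unknown capture mode: {mode!r}")

    data = content.encode("utf-8")
    # Every mode keeps the hash and size.
    record = {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
    }
    if mode == "full":
        record["content"] = content
    elif mode == "evidence_only" and len(data) <= evidence_limit:
        # Assumed rule: only payloads under the threshold are stored verbatim.
        record["content"] = content
    return record
```

With `metadata_only`, the returned record has no `content` key at all, so the raw payload cannot reach the sink by accident.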

Risk Scoring

Automatic risk scoring (0-100) based on:

Signal	Weight
Policy violations	+40
Tool success rate < 90%	+25
Repeated tool calls (loops)	+15
5+ retries	+10
Latency exceeds budget	+10
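
The weights above sum to 100, so a run that trips every signal scores the maximum. A minimal sketch of the arithmetic follows, assuming each signal contributes its weight at most once per run; `RunStats` and its fields are hypothetical, and how the library detects loops or measures latency budgets is not specified here.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    policy_violations: int = 0
    tool_calls: int = 0
    tool_failures: int = 0
    repeated_tool_calls: bool = False   # loop detection is assumed to happen elsewhere
    retries: int = 0
    latency_ms: float = 0.0
    latency_budget_ms: float = float("inf")


def risk_score(stats: RunStats) -> int:
    """Add the weight of each triggered signal, capped at 100."""
    score = 0
    if stats.policy_violations > 0:
        score += 40
    if stats.tool_calls > 0:
        success_rate = (stats.tool_calls - stats.tool_failures) / stats.tool_calls
        if success_rate < 0.90:
            score += 25
    if stats.repeated_tool_calls:
        score += 15
    if stats.retries >= 5:
        score += 10
    if stats.latency_ms > stats.latency_budget_ms:
        score += 10
    return min(score, 100)
```

For example, a run with one policy violation and 2 failures out of 10 tool calls would score 40 + 25 = 65 under these assumptions.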

Configuration

Zero-Config (Recommended)

observe.install()  # Reads from environment variables

Environment Variables

AGENT_OBSERVE_MODE=metadata_only    # Capture mode
AGENT_OBSERVE_ENV=prod              # Environment
DATABASE_URL=postgresql://...       # Enables Postgres sink

See Configuration for all options.

Explicit Config

import os

from agent_observe import observe
from agent_observe.config import Config, CaptureMode, SinkType

config = Config(
    mode=CaptureMode.FULL,
    sink_type=SinkType.POSTGRES,
    database_url=os.environ.get("DATABASE_URL"),
)
observe.install(config=config)

Sinks (Storage Backends)

Sink	Use Case
SQLite	Local development
PostgreSQL	Production
JSONL	Simple fallback
OTLP	OpenTelemetry export (Jaeger, Honeycomb, Datadog)

The sink is auto-selected based on available connections; for example, setting DATABASE_URL enables the Postgres sink.
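
One way such auto-selection could work is sketched below. The precedence order is an assumption, not documented behavior. `OTEL_EXPORTER_OTLP_ENDPOINT` is the standard OpenTelemetry variable, but whether agent-observe reads it is also an assumption, and `choose_sink` is a hypothetical helper.

```python
def choose_sink(env: dict, sqlite_available: bool = True) -> str:
    """Pick a sink name from environment-style settings (assumed precedence)."""
    database_url = env.get("DATABASE_URL", "")
    if database_url.startswith(("postgresql://", "postgres://")):
        return "postgres"
    if env.get("OTEL_EXPORTER_OTLP_ENDPOINT"):
        return "otlp"
    if sqlite_available:
        return "sqlite"
    # JSONL needs nothing but a writable file, so it serves as the last resort.
    return "jsonl"
```

In real use the first argument would typically be `os.environ`; a plain dict is accepted here so the logic can be exercised without touching the process environment.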

Policy Engine

Create .riff/observe.policy.yml:

tools:
  allow:
    - "db.*"
    - "http.*"
  deny:
    - "shell.*"
    - "*.destructive"

limits:
  max_tool_calls: 100
  max_model_calls: 50
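
The patterns in this policy read like shell-style globs. The sketch below evaluates them with Python's `fnmatch`, assuming that deny rules take precedence over allow rules, that an empty allow list permits everything, and that tool identifiers look like `kind.name`. None of these rules is confirmed by the source, and `is_tool_allowed` and `within_limits` are hypothetical helpers.

```python
from fnmatch import fnmatchcase

# The policy from the YAML above, written as the dict a YAML loader would produce.
POLICY = {
    "tools": {
        "allow": ["db.*", "http.*"],
        "deny": ["shell.*", "*.destructive"],
    },
    "limits": {"max_tool_calls": 100, "max_model_calls": 50},
}


def is_tool_allowed(tool_name: str, policy: dict) -> bool:
    """Deny patterns win; otherwise the name must match an allow pattern."""
    tools = policy.get("tools", {})
    if any(fnmatchcase(tool_name, pattern) for pattern in tools.get("deny", [])):
        return False
    allow = tools.get("allow", [])
    return not allow or any(fnmatchcase(tool_name, pattern) for pattern in allow)


def within_limits(tool_calls: int, model_calls: int, policy: dict) -> bool:
    """Check call counts against the limits section; missing limits mean no cap."""
    limits = policy.get("limits", {})
    return (
        tool_calls <= limits.get("max_tool_calls", float("inf"))
        and model_calls <= limits.get("max_model_calls", float("inf"))
    )
```

Under these assumptions `db.destructive` is blocked even though it matches `db.*`, because the deny list is checked first.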

CLI

# Start viewer
agent-observe view

# Export to JSONL
agent-observe export-jsonl -o ./export
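
Exported JSONL can be read back with the standard library alone, since each line is one JSON object. The sketch below is generic: the file name in the usage note and the fields inside each record are assumptions, because the export layout is not described here.

```python
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Because the function accepts any iterable of lines, it works directly on an open file, for example `with open("export/runs.jsonl") as fh: records = list(iter_jsonl(fh))`, where the path is hypothetical.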

Architecture

agent_observe/
├── observe.py      # Core runtime
├── decorators.py   # @tool, @model_call
├── policy.py       # YAML policy engine
├── metrics.py      # Risk scoring
├── replay.py       # Tool result caching
├── sinks/          # Storage backends
└── viewer/         # FastAPI UI

Development

pip install -e ".[dev]"
pytest
ruff check .

License

MIT License

Release history

Version	Released
0.2.0	Jan 10, 2026
0.1.7	Jan 5, 2026
0.1.6	Jan 5, 2026
0.1.4	Jan 4, 2026
0.1.3 (this version)	Jan 4, 2026
0.1.2	Jan 4, 2026
0.1.1	Jan 4, 2026
0.1.0	Jan 4, 2026

Download files

Source Distribution

agent_observe-0.1.3.tar.gz (79.5 kB), uploaded Jan 4, 2026

Built Distribution

agent_observe-0.1.3-py3-none-any.whl (57.4 kB), uploaded Jan 4, 2026, Python 3

File details

Details for the file agent_observe-0.1.3.tar.gz.

File metadata

Download URL: agent_observe-0.1.3.tar.gz
Upload date: Jan 4, 2026
Size: 79.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.12.1.2 readme-renderer/44.0 requests/2.32.5 requests-toolbelt/1.0.0 urllib3/2.3.0 tqdm/4.67.1 importlib-metadata/8.5.0 keyring/25.6.0 rfc3986/1.5.0 colorama/0.4.6 CPython/3.12.9

File hashes

Hashes for agent_observe-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`b7e3d5a9bd9b6b61747c02123256aa99e5c1712ac97439b8cfec9572aed95603`
MD5	`ed0254966e36ee7d27a63339181a9582`
BLAKE2b-256	`67865bdecee311d042ec99f8c2bbb59646ce8fe25f84b61be2447dadbb401e5a`

File details

Details for the file agent_observe-0.1.3-py3-none-any.whl.

File metadata

Download URL: agent_observe-0.1.3-py3-none-any.whl
Upload date: Jan 4, 2026
Size: 57.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.12.1.2 readme-renderer/44.0 requests/2.32.5 requests-toolbelt/1.0.0 urllib3/2.3.0 tqdm/4.67.1 importlib-metadata/8.5.0 keyring/25.6.0 rfc3986/1.5.0 colorama/0.4.6 CPython/3.12.9

File hashes

Hashes for agent_observe-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e4c751c5a635b52b2e4bb6db62ed6d8eda09d61e5d5a4f7b0c61c190eab9a74d`
MD5	`64be656e64daca30d9754e90f48bb9dd`
BLAKE2b-256	`e7aef57b6017ffe388e0e90614ef48c7acfccf68c4f48f9da0c79a8bbb292f50`
