The Reliability, Safety, and Observability Layer for AI Agents

AgentWatch

Your AI agent is lying to you.
AgentWatch catches it — before it deletes your database.

The Crisis Nobody Is Talking About

1 in 20 AI agent requests fail in production right now — silently.

The system keeps running. The output looks correct. Nobody notices until a customer complains, a database is corrupted, or an audit finds something three weeks later.

76% of AI agent deployments fail within 90 days. Not because the models are bad. Because nobody can see what the agent is doing while it's doing it.

Gartner predicts that 40% of enterprise applications will include task-specific AI agents by the end of 2026. Gartner also predicts that more than 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls.

The problem is not the model. The problem is that an agent that confidently fails is indistinguishable from an agent that correctly succeeds — unless you have a layer watching the reasoning, not just the output.

That layer didn't exist.

Until now.

What is AgentWatch?

AgentWatch is the first production observability layer built specifically for AI agent reasoning — not just outputs.

It sits between your agent and the world. It watches every action, scores every reasoning step with an independent model, blocks dangerous commands before they run, and gives you a full replay of exactly what happened and why.

57% of organizations run AI agents in production. Observability is the lowest-rated part of their stack. Current tools were built for single LLM calls — not multi-step agents that fail across 14 distinct failure modes.

AgentWatch was built for the agent era.

What It Does


🧠 Reasoning Auditor	Independent LLM scores every reasoning step — not just the output
🛡️ Safety Engine	Blocks dangerous commands before they execute
📊 Live Dashboard	Real-time trace of every action your agent takes
⏪ One-Click Rollback	Git-backed checkpoints at every step
💾 Persistent Memory	Cross-session episodic, semantic, and procedural memory
💰 Cost Tracker	Per-session token budget with live spend alerts
🔔 Alerting	Slack + PagerDuty when confidence drops or actions are blocked
📋 Compliance	GDPR/HIPAA audit exports, RBAC governance
🔌 Universal	Claude Code, LangChain, AutoGPT, OpenClaw — no rewrites

Quick Start

pip install agentwatch-ai
docker compose up -d

Dashboard → http://localhost:3000
API → http://localhost:8000/docs

Supported Agents

AgentWatch wraps your existing agent. Your agent logic stays the same: you add a CLI wrapper, a callback handler, or an adapter call.

Claude Code

agentwatch watch "Build me a REST API"

LangChain

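from langchain.agents import AgentExecutor  # import path used by classic LangChain releases
from agentwatch.adapters.langchain import AgentWatchCallbackHandler

# Attach the handler as a callback; the rest of the agent setup is unchanged.
handler = AgentWatchCallbackHandler()
agent = AgentExecutor(agent=..., callbacks=[handler])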

AutoGPT

from agentwatch.adapters.autogpt import AutoGPTAdapter

adapter = AutoGPTAdapter(session_id="session-1")
# Call from inside an async function; `action` is the AutoGPT action being reported.
await adapter.on_action(action)

OpenClaw

from agentwatch.adapters.openclaw import OpenClawAdapter

adapter = OpenClawAdapter(session_id="session-1")
# Call from inside an async function; `skill_name` and `payload` describe the skill being run.
await adapter.on_skill_execution(skill_name, payload)

The Reasoning Auditor

This is what nobody else has built.

Every agent scores its own work. And it almost always thinks it did well — even when it didn't. The confidence is the problem, not the failure.

AgentWatch deploys an independent model — architecturally separate, no access to the agent's reasoning trace — whose only job is to find failure before the next action fires.

from agentwatch.reasoning.auditor import ReasoningAuditor

auditor = ReasoningAuditor()
# Call from inside an async function; `step` is the agent step to be scored.
result = await auditor.score_step(step)

print(result.confidence)          # 0.0 – 1.0
print(result.hallucination_risk)  # low / medium / high
print(result.goal_drift)          # True if agent is off-task

When confidence drops below your threshold, AgentWatch holds the next action and fires an alert. Not after the damage. Before it.
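The hold-on-low-confidence behavior described above can be sketched in a few lines. This is an illustrative sketch only, not AgentWatch's implementation: `StepScore`, `gate_next_action`, and the 0.7 threshold are hypothetical names and values, and only the `confidence` field mirrors the auditor example.

```python
from dataclasses import dataclass


@dataclass
class StepScore:
    """Hypothetical stand-in for an auditor result (not the AgentWatch type)."""
    confidence: float  # 0.0 to 1.0, as in the auditor example


def gate_next_action(score: StepScore, threshold: float, alerts: list) -> bool:
    """Return True if the next action may run, False if it should be held.

    A held action also records an alert message so a human can review it.
    """
    if score.confidence < threshold:
        alerts.append(
            f"held: confidence {score.confidence:.2f} is below threshold {threshold:.2f}"
        )
        return False
    return True


alerts = []
allowed = gate_next_action(StepScore(confidence=0.42), threshold=0.7, alerts=alerts)
```

The check runs before the next action is dispatched, so a low score stops the action rather than merely logging it afterwards.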

Safety Engine

from agentwatch.core.safety import SafetyEngine

engine = SafetyEngine()
# Call from inside an async function; `event` is the agent action to be checked.
result = await engine.check_event(event)

if result.is_blocked:
    print(f"Blocked: {result.safety.reasons}")
    print(f"Risk level: {result.safety.risk_level.value}")

Blocked by default: rm -rf /, curl | bash, disk formatting, credential exfiltration, and 40+ other critical patterns.
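To give a feel for how pattern-based blocking of this kind can work, here is a minimal sketch that matches a few of the commands named above against regular expressions. The patterns, the `BLOCKED_PATTERNS` table, and `check_command` are assumptions made for illustration. They are not AgentWatch's actual rule set, and a regex denylist alone is easy to bypass.

```python
import re

# Illustrative patterns only; NOT the rules AgentWatch ships with.
BLOCKED_PATTERNS = {
    # rm with both -r and -f flags, aimed at the filesystem root
    "recursive delete of root": re.compile(
        r"\brm\s+-(?=\w*r)(?=\w*f)\w+\s+/(?:\s|$)"
    ),
    # curl or wget output piped straight into a shell
    "pipe download into shell": re.compile(
        r"\b(?:curl|wget)\b[^|]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"
    ),
    # mkfs and its filesystem-specific variants, e.g. mkfs.ext4
    "disk formatting": re.compile(r"\bmkfs(?:\.\w+)?\b"),
}


def check_command(command: str) -> list[str]:
    """Return the reasons a command would be blocked (empty list if none)."""
    return [
        reason
        for reason, pattern in BLOCKED_PATTERNS.items()
        if pattern.search(command)
    ]
```

With these patterns, `check_command("rm -rf /")` reports a reason, while `check_command("rm -rf /tmp/build")` returns an empty list because the target is not the root directory.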

Rollback

agentwatch rollback <session-id> --to-step 12

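Or click rollback in the dashboard. Every checkpoint is a full filesystem snapshot backed by git, so file changes made after a checkpoint can be undone. Effects outside the filesystem, such as calls to external services, are not captured by a filesystem snapshot.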
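A git-backed checkpoint can be approximated with plain git commands. The sketch below is an assumption about how such a scheme could work, not a description of AgentWatch internals: `checkpoint`, `rollback`, the `step-N` tag naming, and the commit identity are all hypothetical. It expects `git` on the PATH and an already initialized repository.

```python
import subprocess
from pathlib import Path

# Fixed identity so commits succeed even when git has no user configured.
_GIT_OPTIONS = [
    "-c", "user.name=agent-checkpoint",
    "-c", "user.email=agent-checkpoint@example.invalid",
    "-c", "commit.gpgsign=false",
]


def _git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *_GIT_OPTIONS, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def checkpoint(repo: Path, step: int) -> str:
    """Commit the whole working tree and tag the commit as step-<N>."""
    _git(repo, "add", "-A")
    _git(repo, "commit", "--allow-empty", "-m", f"checkpoint: step {step}")
    tag = f"step-{step}"
    _git(repo, "tag", "-f", tag)
    return tag


def rollback(repo: Path, step: int) -> None:
    """Restore the working tree to the state captured at step-<N>."""
    _git(repo, "reset", "--hard", f"step-{step}")
    _git(repo, "clean", "-fd")  # also remove files created after the checkpoint
```

Calling `checkpoint(repo, 12)` after a step and `rollback(repo, 12)` later restores tracked files and removes untracked ones created since. Files excluded by `.gitignore` are left untouched in this sketch.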

REST API

GET  /api/v1/sessions
GET  /api/v1/sessions/{id}/replay
GET  /api/v1/sessions/{id}/confidence
GET  /api/v1/sessions/{id}/checkpoints
POST /api/v1/sessions/{id}/rollback
GET  /api/v1/safety/blocked
GET  /api/v1/dashboard/summary
WS   /ws/events
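The endpoints above can be called with the Python standard library. The following is a hedged sketch: the base URL comes from the Quick Start, but the JSON response format and the rollback request body (`{"to_step": ...}`) are assumptions, since the listing does not document them. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # API address from the Quick Start


def build_request(method: str, path: str, body=None) -> urllib.request.Request:
    """Build a request for an API path; `body` is sent as JSON when given."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request: urllib.request.Request):
    """Send a request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def replay_request(session_id: str) -> urllib.request.Request:
    return build_request("GET", f"/api/v1/sessions/{session_id}/replay")


def rollback_request(session_id: str, to_step: int) -> urllib.request.Request:
    # ASSUMPTION: the body shape is a guess that mirrors the CLI's --to-step flag.
    return build_request(
        "POST", f"/api/v1/sessions/{session_id}/rollback", {"to_step": to_step}
    )
```

With the stack running, `send(replay_request("<session-id>"))` would fetch a session replay. The interactive docs at http://localhost:8000/docs are the place to confirm the real request and response schemas.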

Stack

Backend — FastAPI, PostgreSQL, Redis, Celery
Frontend — Next.js, Tailwind, Recharts, WebSockets
Infra — Docker Compose, GitHub Actions CI
Telemetry — OpenTelemetry compatible

Verified

✅ 47/47 tests passing
✅ docker compose up — zero errors
✅ API live at localhost:8000
✅ Dashboard live at localhost:3000
✅ Claude Code, LangChain, AutoGPT, OpenClaw adapters working

License

Apache 2.0

Built by sreerevanth · Issues → open one

Project details

This version

0.1.0

May 23, 2026

Download files

Source Distribution

agentwatch_ai-0.1.0.tar.gz (187.5 kB)

Uploaded May 23, 2026 Source

Built Distribution

agentwatch_ai-0.1.0-py3-none-any.whl (58.2 kB)

Uploaded May 23, 2026 Python 3

File details

Details for the file agentwatch_ai-0.1.0.tar.gz.

File metadata

Download URL: agentwatch_ai-0.1.0.tar.gz
Upload date: May 23, 2026
Size: 187.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for agentwatch_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`201405dd3ba818c611965002c97221b4bc687acfd2351795b26459fba72feb06`
MD5	`4eff730e6ba00be5467ea99f10c62470`
BLAKE2b-256	`e54718365c11a71cb77bce501b788889e075bb1433bf25b7354c0652958474c7`

File details

Details for the file agentwatch_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: agentwatch_ai-0.1.0-py3-none-any.whl
Upload date: May 23, 2026
Size: 58.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for agentwatch_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`50a60de49bbf1ff7287a38c170d6c61e369dab5187889b4d727e7bdfe07abe9d`
MD5	`21a711a3e3cc0db33dfd6a54fd3394ab`
BLAKE2b-256	`19aef19c7721e191f0eab57acc6c0ab4be0b17c23e022d570a1652c1576fbbe6`

agentwatch-ai 0.1.0
