Standalone agent health monitor — detect loops, stuck states, thrash, and runaway costs in any AI agent workflow.

Agent Vitals

Agent Vitals watches your LLM agent's vital signs in real time. Feed it four numbers per step and it tells you when your agent is looping, stuck, thrashing, or burning tokens for nothing.

Install

pip install agent-vitals

Quick Start

from agent_vitals import AgentVitals

monitor = AgentVitals(mission_id="my-task")

# call_llm, extract_findings, compute_coverage, error_tracker, prompt, and
# max_steps are placeholders for your own agent code.
for step in range(max_steps):
    result = call_llm(prompt)
    findings = extract_findings(result)

    # Report the four required signals for this step.
    snapshot = monitor.step(
        findings_count=len(findings),
        coverage_score=compute_coverage(findings),
        total_tokens=result.usage.total_tokens,
        error_count=error_tracker.count,
    )

    # Stop as soon as a failure is detected.
    if snapshot.any_failure:
        print(f"Health issue at step {snapshot.loop_index}: "
              f"{snapshot.stuck_trigger or snapshot.loop_trigger}")
        break

Features

  • 4-field minimum: only findings_count, coverage_score, total_tokens, and error_count are required
  • Zero-config defaults: AgentVitals() works out of the box with tuned thresholds
  • Framework-agnostic: No dependency on LangChain, LangGraph, or any agent framework
  • Immutable snapshots: Every step() returns a VitalsSnapshot with signals, metrics, and detection results
  • JSONL export: Auto-log every snapshot to structured JSONL files
  • Backtest harness: Offline evaluation of recorded trajectories with P/R/F1 metrics
  • Context manager: with AgentVitals(...) as monitor: for clean resource management

Detection Modes

Detector     | What it catches                            | Signal
-------------|--------------------------------------------|-------------------------------
Loop         | Agent repeating actions without progress   | Findings plateau over N steps
Stuck        | Coverage stagnation despite continued work | Low DM + low CV on coverage
Thrash       | Excessive errors indicating instability    | Error count above threshold
Runaway Cost | Token burn with no output                  | Token spike with flat findings
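
To make the "findings plateau over N steps" signal concrete, here is a minimal standalone sketch of a plateau check. It is an illustration only, not the library's implementation: the function name findings_plateaued and its window parameter are hypothetical, and "flat" is assumed to mean that the reported findings_count did not change from one step to the next.

```python
from typing import Sequence


def findings_plateaued(findings_counts: Sequence[int], window: int = 5) -> bool:
    """Return True when findings_count was unchanged over the last `window` steps.

    Hypothetical helper for illustration; `findings_counts` holds the value
    reported at each step, oldest first.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # We need the value just before the window plus `window` later values.
    if len(findings_counts) < window + 1:
        return False
    recent = findings_counts[-(window + 1):]
    # A plateau means every value equals the one reported before it.
    return all(later == earlier for earlier, later in zip(recent, recent[1:]))


history = [0, 2, 5, 5, 5, 5, 5, 5]
print(findings_plateaued(history))       # True: five unchanged steps in a row
print(findings_plateaued(history[:-1]))  # False: only four unchanged steps
```

The default window of 5 mirrors the loop_consecutive_count default listed under Key Thresholds; whether the library counts steps in exactly this way is an assumption.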

API Overview

Manual Integration (Recommended)

from agent_vitals import AgentVitals

monitor = AgentVitals(mission_id="research-task")
snapshot = monitor.step(
    findings_count=5,
    coverage_score=0.6,
    total_tokens=12000,
    error_count=0,
)

print(snapshot.health_state)     # "healthy" | "warning" | "critical"
print(snapshot.any_failure)      # True if loop or stuck detected
print(snapshot.stuck_trigger)    # e.g. "coverage_stagnation", "burn_rate_anomaly"

Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import TelemetryAdapter

monitor = AgentVitals(mission_id="my-task", adapter=TelemetryAdapter())
snapshot = monitor.step_from_state({
    "cumulative_outputs": 5,
    "coverage_score": 0.6,
    "cumulative_tokens": 12000,
    "cumulative_errors": 0,
})
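
Comparing this state dict with the keyword arguments in the manual example suggests a one-to-one key mapping. The sketch below spells that mapping out as plain Python. It is inferred from the two examples only; the names STATE_TO_SIGNAL and state_to_signals are hypothetical, and TelemetryAdapter may do more (or something different) internally.

```python
from typing import Any, Dict, Mapping

# Assumed mapping from state keys to step() argument names, inferred by
# lining up the adapter example with the manual integration example.
STATE_TO_SIGNAL = {
    "cumulative_outputs": "findings_count",
    "coverage_score": "coverage_score",
    "cumulative_tokens": "total_tokens",
    "cumulative_errors": "error_count",
}


def state_to_signals(state: Mapping[str, Any]) -> Dict[str, Any]:
    """Translate a telemetry-style state dict into the four signal fields."""
    missing = [key for key in STATE_TO_SIGNAL if key not in state]
    if missing:
        raise KeyError(f"state is missing required keys: {missing}")
    return {signal: state[key] for key, signal in STATE_TO_SIGNAL.items()}


signals = state_to_signals({
    "cumulative_outputs": 5,
    "coverage_score": 0.6,
    "cumulative_tokens": 12000,
    "cumulative_errors": 0,
})
print(signals)
```

A helper like this can be handy if your agent state uses different key names and you prefer to call monitor.step(**signals) directly instead of writing an adapter.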

Pre-built Signals

from agent_vitals import AgentVitals, RawSignals

monitor = AgentVitals(mission_id="my-task")
signals = RawSignals(findings_count=5, coverage_score=0.6, total_tokens=12000, error_count=0)
snapshot = monitor.step_from_signals(signals)

Export

Log every snapshot to JSONL for offline analysis or observability pipelines.

from agent_vitals import AgentVitals, JSONLExporter

exporter = JSONLExporter(
    directory="./vitals_logs",
    layout="per_run",       # or "append"
    max_bytes=10_000_000,   # rotation threshold (append mode)
)

with AgentVitals(mission_id="my-task", exporters=[exporter]) as monitor:
    for step in range(max_steps):
        monitor.step(findings_count=..., coverage_score=..., total_tokens=..., error_count=...)
# Exporter is automatically flushed and closed on exit

Layouts:

  • per_run: {directory}/{mission_id}/{run_id}.jsonl — one file per run
  • append: {directory}/{mission_id}.jsonl — all runs in one file, with rotation
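
If you need to locate the log files from another script, the two layout patterns above can be expressed as a small path helper. This is a sketch based only on the patterns listed here; snapshot_path is a hypothetical function, not part of the package, and it does not model how rotated files are named in append mode.

```python
from pathlib import Path


def snapshot_path(directory: str, layout: str, mission_id: str, run_id: str) -> Path:
    """Build the expected JSONL path for a layout (illustrative helper)."""
    base = Path(directory)
    if layout == "per_run":
        # One file per run: {directory}/{mission_id}/{run_id}.jsonl
        return base / mission_id / f"{run_id}.jsonl"
    if layout == "append":
        # All runs share one file: {directory}/{mission_id}.jsonl
        return base / f"{mission_id}.jsonl"
    raise ValueError(f"unknown layout: {layout!r}")


print(snapshot_path("./vitals_logs", "per_run", "my-task", "run-001").as_posix())
# vitals_logs/my-task/run-001.jsonl
print(snapshot_path("./vitals_logs", "append", "my-task", "run-001").as_posix())
# vitals_logs/my-task.jsonl
```

The run_id value "run-001" is made up for the example; the format of real run IDs is not specified here.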

Configuration

from agent_vitals import AgentVitals, VitalsConfig

# From a VitalsConfig object passed to the constructor
monitor = AgentVitals(config=VitalsConfig(
    loop_consecutive_count=6,
    stuck_dm_threshold=0.15,
))

# From YAML file
monitor = AgentVitals.from_yaml("thresholds.yaml")

# From environment variables (VITALS_* prefix)
monitor = AgentVitals()  # auto-reads VITALS_LOOP_CONSECUTIVE_COUNT, etc.
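
The variable name in the comment above suggests that each environment variable is the VITALS_ prefix followed by the upper-cased parameter name. Assuming that convention holds for every threshold, the following sketch shows which overrides are currently set in a process. read_vitals_env is a hypothetical helper; how the library parses the values and which source wins when several are present are not covered here.

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "VITALS_"


def read_vitals_env(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect VITALS_* variables keyed by the lower-cased parameter name.

    Values are returned as raw strings; type conversion is left to the caller.
    """
    source = os.environ if environ is None else environ
    return {
        name[len(PREFIX):].lower(): value
        for name, value in source.items()
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


example_env = {
    "VITALS_LOOP_CONSECUTIVE_COUNT": "6",
    "VITALS_STUCK_DM_THRESHOLD": "0.15",
    "PATH": "/usr/bin",
}
print(read_vitals_env(example_env))
# {'loop_consecutive_count': '6', 'stuck_dm_threshold': '0.15'}
```

Printing this at startup is a quick way to confirm that a deployment is picking up the thresholds you expect.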

Key Thresholds

Parameter              | Default | Description
-----------------------|---------|----------------------------------------------
loop_consecutive_count | 5       | Steps of flat findings before loop detection
stuck_dm_threshold     | 0.15    | DM below this → coverage stagnation
stuck_cv_threshold     | 0.5     | CV below this → low variation
burn_rate_multiplier   | 2.0     | Token spike ratio for burn rate anomaly
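
For intuition about two of these thresholds, the sketch below computes a coefficient of variation and a simple token-spike ratio. It rests on assumptions: that CV stands for coefficient of variation, and that a burn rate anomaly compares the latest step's token usage with the average of earlier steps. Both function names are hypothetical, the DM metric is not modeled, and the library's actual formulas may differ.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Population standard deviation divided by the mean (0.0 for a zero mean)."""
    avg = mean(values)
    if avg == 0:
        return 0.0
    return pstdev(values) / abs(avg)


def burn_rate_spike(step_tokens: Sequence[int], multiplier: float = 2.0) -> bool:
    """Flag the latest step if it used more than `multiplier` times the prior average.

    `step_tokens` is per-step usage; if you track cumulative totals,
    difference them first.
    """
    if len(step_tokens) < 2:
        return False
    baseline = mean(step_tokens[:-1])
    return baseline > 0 and step_tokens[-1] > multiplier * baseline


coverage = [0.60, 0.61, 0.60, 0.61]
print(coefficient_of_variation(coverage) < 0.5)    # True: coverage barely moves
print(burn_rate_spike([1000, 1000, 1000, 6000]))   # True: 6000 > 2.0 * 1000
print(burn_rate_spike([1000, 1000, 1000, 1000]))   # False: no spike
```

The 0.5 and 2.0 values match the stuck_cv_threshold and burn_rate_multiplier defaults in the table.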

Backtest

Evaluate detection accuracy against labeled trajectory corpora.

from agent_vitals.backtest import load_dataset, load_labels, run_backtest

dataset = load_dataset("path/to/traces/")
labels = load_labels("path/to/labels.json")
report = run_backtest(dataset, labels)

print(f"vitals.any: P={report.composite_any.precision:.3f} "
      f"R={report.composite_any.recall:.3f} "
      f"F1={report.composite_any.f1:.3f}")

for name, detector in report.detectors.items():
    print(f"  {name}: P={detector.precision:.3f} R={detector.recall:.3f}")
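
As a reminder of what the P, R, and F1 figures mean, here is a standalone sketch that computes them from two sets of trace IDs: the traces a detector flagged and the traces labeled as failures. The function precision_recall_f1 and the trace IDs are hypothetical and independent of the backtest module, which may aggregate results differently.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], actual: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for flagged vs. labeled trace IDs."""
    true_positives = len(predicted & actual)
    # Precision: share of flagged traces that were real failures.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real failures that were flagged.
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


flagged = {"t1", "t2", "t3", "t4"}
labeled = {"t2", "t3", "t5"}
p, r, f1 = precision_recall_f1(flagged, labeled)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
# P=0.500 R=0.667 F1=0.571
```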

Session Summary

monitor = AgentVitals(mission_id="my-task")
# ... run steps ...
summary = monitor.summary()
# {"mission_id": "my-task", "total_steps": 8, "health_state": "healthy",
#  "any_loop_detected": False, "any_stuck_detected": False, ...}

monitor.reset()  # Clear history for next run (also flushes exporters)

License

MIT — see LICENSE.
