Standalone agent health monitor — detect loops, stuck states, thrash, and runaway costs in any AI agent workflow.

These details have not been verified by PyPI

Project links

Project description

Agent Vitals

Standalone agent health monitor — detect loops, stuck states, thrash, and runaway costs in any AI agent workflow.

Agent Vitals watches your LLM agent's vital signs in real time. Feed it four numbers per step and it tells you when your agent is looping, stuck, thrashing, or burning tokens for nothing.

Install

pip install agent-vitals

# Optional framework integrations
pip install "agent-vitals[langchain,langgraph]"

# Optional observability export (OTLP)
pip install "agent-vitals[otlp]"

# Development and CI tooling (tests, coverage, lint/type checks)
pip install "agent-vitals[dev]"

Quick Start

from agent_vitals import AgentVitals

monitor = AgentVitals(mission_id="my-task")

for step in range(max_steps):
    result = call_llm(prompt)
    findings = extract_findings(result)

    snapshot = monitor.step(
        findings_count=len(findings),
        coverage_score=compute_coverage(findings),
        total_tokens=result.usage.total_tokens,
        error_count=error_tracker.count,
    )

    if snapshot.any_failure:
        print(f"Health issue at step {snapshot.loop_index}: "
              f"{snapshot.stuck_trigger or snapshot.loop_trigger}")
        break

Features

4-field minimum: Only findings_count, coverage_score, total_tokens, error_count required
Zero-config defaults: AgentVitals() works out of the box with tuned thresholds
Framework-agnostic: No dependency on LangChain, LangGraph, or any agent framework
Built-in adapters: LangChain, LangGraph, CrewAI, AutoGen/AG2, DSPy, Haystack, Langfuse, and LangSmith signal extraction
Immutable snapshots: Every step() returns a VitalsSnapshot with signals, metrics, and detection results
JSONL export: Auto-log every snapshot to structured JSONL files
OTLP export: Send metrics to Datadog, Grafana Cloud, or any OTLP backend
Backtest harness: Offline evaluation of recorded trajectories with P/R/F1 metrics
Context manager: with AgentVitals(...) as monitor: for clean resource management

Detection Modes

Detector	What it catches	Signal
Loop	Agent repeating actions without progress	Findings plateau over N steps
Stuck	Coverage stagnation despite continued work	Low DM + low CV on coverage
Thrash	Excessive errors indicating instability	Error count above threshold
Runaway Cost	Token burn with no output	Token spike with flat findings

Content-Based Loop Detection (v1.5.0)

When you pass output_text to monitor.step(), Agent Vitals computes content-level similarity to distinguish loops from stuck states:

snapshot = monitor.step(
    findings_count=5,
    coverage_score=0.6,
    total_tokens=12000,
    error_count=0,
    output_text="The agent's latest output text here...",
)

# New fields on VitalsSnapshot:
print(snapshot.output_similarity)    # 0.0–1.0 Jaccard similarity vs previous output
print(snapshot.output_fingerprint)   # SHA-256 hash for exact-match detection

High similarity (≥0.85): Confirms loop — agent is producing repetitive outputs
Low similarity with stagnant coverage: Confirms stuck — agent is producing varied but unproductive outputs
No output_text: Detection falls back to signal-level heuristics (fully backward-compatible)

API Overview

Manual Integration (Recommended)

from agent_vitals import AgentVitals

monitor = AgentVitals(mission_id="research-task")
snapshot = monitor.step(
    findings_count=5,
    coverage_score=0.6,
    total_tokens=12000,
    error_count=0,
)

print(snapshot.health_state)     # "healthy" | "warning" | "critical"
print(snapshot.any_failure)      # True if loop or stuck detected
print(snapshot.stuck_trigger)    # e.g. "coverage_stagnation", "burn_rate_anomaly"

Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import TelemetryAdapter

monitor = AgentVitals(mission_id="my-task", adapter=TelemetryAdapter())
snapshot = monitor.step_from_state({
    "cumulative_outputs": 5,
    "coverage_score": 0.6,
    "cumulative_tokens": 12000,
    "cumulative_errors": 0,
})

LangChain Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import LangChainAdapter

monitor = AgentVitals(mission_id="lc-agent", adapter=LangChainAdapter())
snapshot = monitor.step_from_state({
    "cumulative_outputs": 7,
    "coverage_score": 0.72,
    "llm_output": {"token_usage": {"prompt_tokens": 1200, "completion_tokens": 600, "total_tokens": 1800}},
    "cumulative_errors": 1,
    "intermediate_steps": [("search", "..."), ("summarize", "...")],
})

LangGraph Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import LangGraphAdapter

monitor = AgentVitals(mission_id="lg-agent", adapter=LangGraphAdapter())
snapshot = monitor.step_from_state({
    "findings": ["f1", "f2"],
    "sources_found": [{"url": "https://example.com/a"}],
    "mission_objectives": ["o1", "o2", "o3"],
    "covered_objectives": ["o1", "o2"],
    "total_tokens": 4200,
    "errors": [],
})

CrewAI Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import CrewAIAdapter

monitor = AgentVitals(mission_id="crewai-agent", adapter=CrewAIAdapter())
snapshot = monitor.step_from_state({
    "crew": {
        "usage_metrics": {"prompt_tokens": 300, "completion_tokens": 120, "total_tokens": 420},
        "tasks": [{"status": "completed"}, {"status": "failed"}, {"status": "completed"}],
    },
    "task_outputs": [{"result": "finding-a"}, {"result": "finding-b"}],
})

AutoGen / AG2 Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import AutoGenAdapter

monitor = AgentVitals(mission_id="autogen-agent", adapter=AutoGenAdapter())
snapshot = monitor.step_from_state({
    "usage_summary": {
        "agent_a": {"prompt_tokens": 90, "completion_tokens": 40, "total_tokens": 130},
        "agent_b": {"prompt_tokens": 70, "completion_tokens": 35, "total_tokens": 105},
    },
    "chat_messages": [{"role": "user"}, {"role": "assistant"}, {"role": "assistant"}],
    "total_turns": 6,
})

DSPy Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import DSPyAdapter

monitor = AgentVitals(mission_id="dspy-program", adapter=DSPyAdapter())
snapshot = monitor.step_from_state({
    "lm_usage": {
        "openai/gpt-4o-mini": {
            "prompt_tokens": 1200,
            "completion_tokens": 400,
            "total_tokens": 1600,
        },
    },
    "predictions": [{"answer": "Summary A"}, {"answer": "Analysis B"}],
    "modules_completed": 2,
    "modules_total": 3,
    "errors": [],
})

The DSPy adapter extracts tokens from lm_usage (preferred) or lm.history (fallback), findings from predictions or history outputs, and coverage from module completion state. No dspy dependency required.

Haystack Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import HaystackAdapter

monitor = AgentVitals(mission_id="haystack-agent", adapter=HaystackAdapter())
snapshot = monitor.step_from_state({
    "messages": [
        {"role": "user", "content": "Research quantum computing"},
        {
            "role": "assistant",
            "content": "Quantum error correction advances...",
            "_meta": {"usage": {"prompt_tokens": 200, "completion_tokens": 80, "total_tokens": 280}},
        },
    ],
    "state": {"coverage_score": 0.6},
    "sources": [
        {"url": "https://arxiv.org/paper1"},
        {"url": "https://nature.com/article1"},
    ],
})

The Haystack adapter handles both Agent state (messages with _meta.usage) and Pipeline state (component_outputs with replies). Extracts source URLs for domain counting. No haystack-ai dependency required.

Langfuse Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import LangfuseAdapter

monitor = AgentVitals(mission_id="langfuse-agent", adapter=LangfuseAdapter())
snapshot = monitor.step_from_state({
    "observations": [
        {
            "type": "GENERATION",
            "model": "gpt-4o",
            "output": "Analysis of market trends in Q4.",
            "usage": {"prompt_tokens": 500, "completion_tokens": 200, "total_tokens": 700},
            "level": "DEFAULT",
        },
        {
            "type": "SPAN",
            "name": "web_search",
            "output": {"results": ["result1", "result2"]},
        },
    ],
    "scores": [{"name": "coverage", "value": 0.65}],
    "sources": [
        {"url": "https://example.com/report"},
        {"url": "https://other.org/data"},
    ],
})

The Langfuse adapter extracts tokens from GENERATION observations (usage or usage_details), findings from unique generation outputs, errors from observation level ("ERROR") and status_message, and coverage from scores or trace metadata. Also accepts flat generations lists. No langfuse dependency required.

LangSmith Adapter Integration

from agent_vitals import AgentVitals
from agent_vitals.adapters import LangSmithAdapter

monitor = AgentVitals(mission_id="langsmith-agent", adapter=LangSmithAdapter())
snapshot = monitor.step_from_state({
    "run_type": "chain",
    "usage_metadata": {"input_tokens": 500, "output_tokens": 200, "total_tokens": 700},
    "outputs": {"output": "Analysis of market trends in Q4."},
    "child_runs": [
        {
            "run_type": "llm",
            "usage_metadata": {"input_tokens": 500, "output_tokens": 200, "total_tokens": 700},
            "outputs": {"output": "Generated analysis."},
        },
        {
            "run_type": "retriever",
            "outputs": {
                "documents": [
                    {"metadata": {"source": "https://example.com/report"}},
                ],
            },
        },
    ],
    "feedback_stats": {"coverage": {"mean": 0.65}},
    "status": "success",
})

The LangSmith adapter extracts tokens from usage_metadata (preferred) or LLM child_runs (fallback), findings from run outputs, errors from the error field and status, and coverage from feedback_stats or extra.metadata. Retriever child runs provide source/domain counts. No langsmith dependency required.

LangChain Callback Integration

from agent_vitals.callbacks import LangChainVitalsCallback

callback = LangChainVitalsCallback(
    mission_id="lc-callback",
    on_failure="log",            # "log" | "raise" | "callback"
    export_jsonl_dir="./vitals_logs",
)

# Pass callback into your LangChain runnable/agent callback list.

LangGraph Node Integration

from agent_vitals.callbacks import LangGraphVitalsNode

vitals_node = LangGraphVitalsNode(on_failure="force_finalize")

# Add `vitals_node` to your StateGraph as a normal callable node.
# Returned update includes:
#   - agent_vitals: snapshot payload
#   - force_finalize: True (when failure detected and mode is force_finalize)

Pre-built Signals

from agent_vitals import AgentVitals, RawSignals

monitor = AgentVitals(mission_id="my-task")
signals = RawSignals(findings_count=5, coverage_score=0.6, total_tokens=12000, error_count=0)
snapshot = monitor.step_from_signals(signals)

Export

Log every snapshot to JSONL for offline analysis or observability pipelines.

from agent_vitals import AgentVitals, JSONLExporter

exporter = JSONLExporter(
    directory="./vitals_logs",
    layout="per_run",       # or "append"
    max_bytes=10_000_000,   # rotation threshold (append mode)
)

with AgentVitals(mission_id="my-task", exporters=[exporter]) as monitor:
    for step in range(max_steps):
        monitor.step(findings_count=..., coverage_score=..., total_tokens=..., error_count=...)
# Exporter is automatically flushed and closed on exit

Layouts:

per_run: {directory}/{mission_id}/{run_id}.jsonl — one file per run
append: {directory}/{mission_id}.jsonl — all runs in one file, with rotation

OTLP Export (Datadog / Grafana / OTLP-compatible)

from agent_vitals import AgentVitals, OTLPExporter

otlp = OTLPExporter(
    endpoint="http://localhost:4318/v1/metrics",
    service_name="deepsearch-agent",
    mission_id="DRM.0.5",
    run_id="run-2026-02-09",
    workflow_type="research",
    export_interval_ms=5000,
)

with AgentVitals(mission_id="DRM.0.5", exporters=[otlp]) as monitor:
    monitor.step(findings_count=1, coverage_score=0.2, total_tokens=300, error_count=0)

Datadog example (delta temporality enabled):

from agent_vitals import OTLPExporter

datadog = OTLPExporter(
    endpoint="https://otlp.datadoghq.com/v1/metrics",
    headers={"DD-API-KEY": "<datadog_api_key>"},
    service_name="agent-vitals",
    mission_id="DRM.0.5",
    run_id="run-42",
    workflow_type="research",
    delta_temporality=True,
)

Grafana Cloud example:

from agent_vitals import OTLPExporter

grafana = OTLPExporter(
    endpoint="https://otlp-gateway-<region>.grafana.net/otlp/v1/metrics",
    headers={"Authorization": "Basic <base64(instance_id:api_key)>"},
    service_name="agent-vitals",
    mission_id="DRM.0.5",
    run_id="run-42",
    workflow_type="research",
)

Configuration

from agent_vitals import AgentVitals, VitalsConfig

# From constructor kwargs
monitor = AgentVitals(config=VitalsConfig(
    loop_consecutive_count=6,
    stuck_dm_threshold=0.15,
))

# From YAML file
monitor = AgentVitals.from_yaml("thresholds.yaml")

# From environment variables (VITALS_* prefix)
monitor = AgentVitals()  # auto-reads VITALS_LOOP_CONSECUTIVE_COUNT, etc.

Key Thresholds

Parameter	Default	Description
`loop_consecutive_count`	5	Steps of flat findings before loop detection
`stuck_dm_threshold`	0.15	DM below this → coverage stagnation
`stuck_cv_threshold`	0.5	CV below this → low variation
`burn_rate_multiplier`	2.0	Token spike ratio for burn rate anomaly

Framework-Specific Threshold Profiles (v1.5.0)

Different agent frameworks have different normal operating patterns. Framework profiles automatically tune detection thresholds when you use a built-in adapter:

from agent_vitals import AgentVitals
from agent_vitals.adapters import CrewAIAdapter

# Profile auto-detected from adapter type
monitor = AgentVitals(mission_id="crew-task", adapter=CrewAIAdapter())
# → Uses crewai profile: loop_consecutive_count=8, burn_rate_multiplier=4.0

Built-in profiles:

Framework	`loop_consecutive_count`	`burn_rate_multiplier`	Notes
langgraph	5	2.5	Tighter loop detection for graph-based workflows
crewai	8	4.0	Higher burn rate tolerance for multi-agent crews
dspy	10	—	Higher consecutive count for optimization loops

Override auto-detection with the framework parameter:

monitor = AgentVitals(
    mission_id="task",
    adapter=LangGraphAdapter(),
    framework="crewai",  # Override: use crewai profile instead
)

Define custom profiles in thresholds.yaml:

loop_consecutive_count: 6
profiles:
  langgraph:
    loop_consecutive_count: 5
    burn_rate_multiplier: 2.5
  crewai:
    loop_consecutive_count: 8
    burn_rate_multiplier: 4.0
    token_scale_factor: 0.7

Backtest

Evaluate detection accuracy against labeled trajectory corpora.

from agent_vitals.backtest import load_dataset, load_labels, run_backtest

dataset = load_dataset("path/to/traces/")
labels = load_labels("path/to/labels.json")
report = run_backtest(dataset, labels)

print(f"vitals.any: P={report.composite_any.precision:.3f} "
      f"R={report.composite_any.recall:.3f} "
      f"F1={report.composite_any.f1:.3f}")

for name, detector in report.detectors.items():
    print(f"  {name}: P={detector.precision:.3f} R={detector.recall:.3f}")

CI Coverage Gate

CI enforces coverage with pytest-cov:

Command: pytest --cov=agent_vitals --cov-report=xml --cov-fail-under=85
Baseline measured on 2026-02-09: 85% total coverage
Coverage XML artifact is uploaded in GitHub Actions (coverage.xml)

Session Summary

monitor = AgentVitals(mission_id="my-task")
# ... run steps ...
summary = monitor.summary()
# {"mission_id": "my-task", "total_steps": 8, "health_state": "healthy",
#  "any_loop_detected": False, "any_stuck_detected": False, ...}

monitor.reset()  # Clear history for next run (also flushes exporters)

Detection Precision

Agent Vitals has been validated against a 70-trace combined corpus spanning DeepSearch (LangGraph/Ollama) and cross-agent (LangChain, raw OpenAI, CrewAI, AutoGen, DSPy, Haystack — with GPT-4o-mini, DeepSeek-chat, Gemini, Llama, Mixtral, Claude, and local OSS models) trajectories.

Cross-Agent Corpus (40 traces, 7 frameworks, 7 models)

Detector	Precision	Recall	F1
vitals.any	1.000	1.000	1.000
loop	0.875	1.000	0.933
stuck	1.000	0.800	0.889
thrash	1.000	1.000	1.000

Combined Corpus (70 traces)

Detector	Precision	Recall	F1
vitals.any	1.000	0.982	0.991
loop	0.909	0.870	0.889
stuck	0.824	0.737	0.778
thrash	1.000	1.000	1.000

The composite vitals.any signal — used for enforcement decisions — maintains perfect precision across both corpora. Per-detector metrics are informational; the system correctly identifies failures even in the 2 edge cases where loop and stuck signals overlap. Content-based similarity (v1.5.0) addresses these edge cases for new traces that provide output_text.

See docs/vitals/av25-backtest-report.md for the latest backtest analysis.

License

MIT — see LICENSE.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

1.19.0

Apr 10, 2026

1.18.0

Apr 10, 2026

1.17.0

Apr 10, 2026

1.16.0

Apr 10, 2026

1.15.0

Apr 9, 2026

1.14.2

Apr 8, 2026

1.14.1

Apr 8, 2026

1.14.0

Apr 7, 2026

1.13.1

Apr 7, 2026

This version

1.13.0

Apr 7, 2026

1.12.0

Apr 6, 2026

1.11.0

Mar 15, 2026

1.10.0

Mar 14, 2026

1.9.0

Feb 14, 2026

1.8.0

Feb 13, 2026

1.5.0

Feb 12, 2026

1.4.0

Feb 9, 2026

1.3.0

Feb 9, 2026

1.2.0

Feb 9, 2026

1.1.0

Feb 9, 2026

1.0.0

Feb 8, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agent_vitals-1.13.0.tar.gz (185.1 kB view details)

Uploaded Apr 7, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

agent_vitals-1.13.0-py3-none-any.whl (140.0 kB view details)

Uploaded Apr 7, 2026 Python 3

File details

Details for the file agent_vitals-1.13.0.tar.gz.

File metadata

Download URL: agent_vitals-1.13.0.tar.gz
Upload date: Apr 7, 2026
Size: 185.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for agent_vitals-1.13.0.tar.gz
Algorithm	Hash digest
SHA256	`ac9464bbd5774f10841fcf01c88f7d0e64cee5e7193238e5055347bd9a7a7f02`
MD5	`b2d5328daef73efc6567edd365aa3b44`
BLAKE2b-256	`fbdce906a738ef5fb7039cf8138a54dffa432e170e1036e0fc1d39dfe7fc2778`

See more details on using hashes here.

File details

Details for the file agent_vitals-1.13.0-py3-none-any.whl.

File metadata

Download URL: agent_vitals-1.13.0-py3-none-any.whl
Upload date: Apr 7, 2026
Size: 140.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for agent_vitals-1.13.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0a9032941b9548937032ff3120c800fad394f5decea4c76b473961a085d9d16c`
MD5	`7e03045babb63ae026455d1c51661cb9`
BLAKE2b-256	`0234521ba48eb91f82e63ea22be61b0f3655a43c119e897df336d29bb96517cd`

See more details on using hashes here.

agent-vitals 1.13.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Agent Vitals

Install

Quick Start

Features

Detection Modes

Content-Based Loop Detection (v1.5.0)

API Overview

Manual Integration (Recommended)

Adapter Integration

LangChain Adapter Integration

LangGraph Adapter Integration

CrewAI Adapter Integration

AutoGen / AG2 Adapter Integration

DSPy Adapter Integration

Haystack Adapter Integration

Langfuse Adapter Integration

LangSmith Adapter Integration

LangChain Callback Integration

LangGraph Node Integration

Pre-built Signals

Export

OTLP Export (Datadog / Grafana / OTLP-compatible)

Configuration

Key Thresholds

Framework-Specific Threshold Profiles (v1.5.0)

Backtest

CI Coverage Gate

Session Summary

Detection Precision

Cross-Agent Corpus (40 traces, 7 frameworks, 7 models)

Combined Corpus (70 traces)

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes