Multi-agent failure detection for production AI systems

Pisama

Find and fix failures in AI agent systems. No LLM calls required.

License: MIT. Requires Python 3.10+.

Pisama detects 32 types of agent failures using heuristic detectors that run locally, with zero LLM cost. On the TRAIL benchmark, Pisama achieves 60.1% joint accuracy, compared with 11.0% for the best frontier model (Gemini 2.5 Pro), at 100% precision (zero false positives).

Install

pip install pisama

Usage

from pisama import analyze

result = analyze("trace.json")  # also accepts dicts and JSON strings

for issue in result.issues:
    print(f"[{issue.type}] {issue.summary} (severity: {issue.severity})")
    print(f"  Fix: {issue.recommendation}")
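When a trace yields many issues, it can help to tally them by type before reading the details. The sketch below relies only on the attributes used in the loop above (`type`, `summary`, `severity`, `recommendation`). The `Issue` dataclass is a hypothetical stand-in so the example runs without Pisama installed, and `count_by_type` is an illustrative helper, not part of the library.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical stand-in mirroring the attributes shown in the usage example."""
    type: str
    summary: str
    severity: str
    recommendation: str


def count_by_type(issues):
    """Return a Counter mapping each issue type to its number of occurrences."""
    return Counter(issue.type for issue in issues)


# Made-up sample data for illustration only; severity values are assumptions.
sample = [
    Issue("loop", "Same tool call repeated", "high", "Add a retry limit"),
    Issue("loop", "Agent stuck re-planning", "medium", "Cap planning rounds"),
    Issue("cost", "Token budget exceeded", "low", "Trim the context"),
]

# Print types from most to least frequent.
for issue_type, count in count_by_type(sample).most_common():
    print(f"{issue_type}: {count}")
```

With a real result, the same helper could presumably be called as `count_by_type(result.issues)`.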

CLI

pisama analyze trace.json          # Analyze a trace
pisama watch python my_agent.py    # Watch a live agent (pip install "pisama[auto]")
pisama replay <trace-id>           # Re-run detection on stored traces
pisama smoke-test --last 50        # Batch test recent traces
pisama detectors                   # List available detectors
pisama mcp-server                  # Start MCP server (pip install "pisama[mcp]")

MCP Server

Works in Cursor, Claude Desktop, and Windsurf, with no API key needed:

{
  "mcpServers": {
    "pisama": { "command": "pisama", "args": ["mcp-server"] }
  }
}
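If a client already has other MCP servers configured, the `pisama` entry needs to be merged into the existing `mcpServers` object rather than replacing it. The following is a minimal sketch of that merge on an in-memory dict. The config file location differs per client and is not stated here, so reading and writing the file is left out. `add_pisama_server` and the `other` server entry are hypothetical names used for illustration.

```python
import json


def add_pisama_server(config):
    """Return a copy of an MCP client config with the pisama entry added.

    The input dict is not modified; existing servers are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["pisama"] = {"command": "pisama", "args": ["mcp-server"]}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-tool", "args": []}}}
print(json.dumps(add_pisama_server(existing), indent=2))
```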

Detectors

| Detector | What It Catches |
|---|---|
| loop | Infinite loops, retry storms, stuck patterns |
| coordination | Deadlocked handoffs, message storms |
| hallucination | Factual errors, fabricated tool results |
| injection | Prompt injection, jailbreak attempts |
| corruption | State corruption, type drift |
| persona | Persona drift, role confusion |
| derailment | Task deviation, goal drift |
| context | Context neglect, ignored instructions |
| specification | Output vs. requirement mismatch |
| communication | Inter-agent message breakdown |
| decomposition | Poor task breakdown, circular dependencies |
| workflow | Unreachable nodes, missing error handling |
| completion | Premature completion, unfinished work |
| withholding | Suppressed findings, hidden errors |
| convergence | Metric plateau, regression, thrashing |
| overflow | Context window exhaustion |
| cost | Token budget overrun |
| repetition | Tool dominance, low diversity |
| routing | Input sent to wrong specialist/route |
| propagation | Silent error propagation across steps |
| critic_quality | Rubber-stamping critics in reflection loops |
| escalation_loop | Escalation loops without resolution |
| citation | Fabricated citations |
| parallel_consistency | Contradictory parallel results |
| memory_staleness | Outdated memory retrieval |
| approval_bypass | High-risk actions without approval |
| model_selection | Wrong model for task complexity |
| mcp_protocol | MCP tool/schema/auth failures |
| reasoning_consistency | Contradictory reasoning, abandoned CoT |
| entity_confusion | Entity mix-ups from context |
| task_starvation | Planned tasks never executed |
| exploration_safety | Risky actions in trial-and-error |
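To give a sense of what a heuristic check can look like without any LLM call, here is a minimal sketch of a loop-style detector that flags runs of identical consecutive steps. It is an illustration only and is not Pisama's implementation. The step representation (a `(tool, arguments)` tuple), the function names, and the threshold of 3 are all assumptions.

```python
def longest_repeat_run(steps):
    """Return (length, step) for the longest run of identical consecutive steps."""
    best_len, best_step = 0, None
    run_len, prev = 0, None
    for step in steps:
        # Extend the current run if this step equals the previous one.
        run_len = run_len + 1 if step == prev else 1
        prev = step
        if run_len > best_len:
            best_len, best_step = run_len, step
    return best_len, best_step


def looks_like_loop(steps, threshold=3):
    """Flag a trace whose longest identical run reaches the threshold."""
    length, _ = longest_repeat_run(steps)
    return length >= threshold


# Hypothetical trace: each step is a (tool, arguments) tuple.
trace = [
    ("search", "q=foo"),
    ("search", "q=foo"),
    ("search", "q=foo"),
    ("summarize", "doc=1"),
]
print(longest_repeat_run(trace))
print(looks_like_loop(trace))
```

A check like this runs in a single pass over the trace, which is one reason purely heuristic detection can stay cheap and local.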

Benchmark Results

TRAIL (trace-level failure detection, 148 traces):

| Method | Joint Accuracy | Precision |
|---|---|---|
| Gemini 2.5 Pro | 11.0% | -- |
| OpenAI o3 | 9.2% | -- |
| Pisama | 60.1% | 100% |
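For readers less familiar with the metric, precision is the share of flagged items that are real failures, so 100% precision is equivalent to zero false positives. The sketch below shows the standard calculation. The counts are made up for illustration and are not taken from the TRAIL results.

```python
def precision(true_positives, false_positives):
    """Return TP / (TP + FP), the fraction of flagged items that are correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when nothing was flagged")
    return true_positives / flagged


# Made-up counts: no false positives gives a precision of exactly 1.0.
print(precision(true_positives=40, false_positives=0))
print(precision(true_positives=30, false_positives=10))
```

Precision says nothing about failures that were missed, which is why it is reported alongside joint accuracy.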

Who&When (ICML 2025, multi-agent attribution):

| Method | Agent Accuracy | Step Accuracy |
|---|---|---|
| o1 | 53.5% | 14.2% |
| Pisama + Sonnet 4 | 60.3% | 24.1% |

License

MIT
