
Behavioral monitoring and directive control for AI agents


SOMA

System of Oversight and Monitoring for Agents
The nervous system for AI agents.
Real-time behavioral monitoring. Predictive guidance. Autonomous safety control.


Claude Code Layer · Research Paper · Technical Reference · User Guide · API Reference · Hook Reference · Roadmap


Your AI agent just edited 5 files without reading any of them. It's retrying the same failing command for the 8th time. It wandered from your auth module into unrelated config files. And you have no idea until it's too late.

SOMA sees all of this in real-time — and steers the agent back on track.

pip install soma-ai

What SOMA Does

SOMA is not a dashboard. It's not a logger. It's a closed-loop behavioral guidance system that watches every action an AI agent takes, detects problems as they develop, and injects corrective feedback directly into the agent's context.

Watch → Guide → Warn → Block (only destructive ops)

| | What | How |
|---|---|---|
| Watch | 5 behavioral signals per action | Uncertainty, drift, error rate, cost, token usage |
| Guide | Injects specific advice into agent context | "3 writes without a Read — Read the target file first" |
| Warn | Escalating warnings as pressure rises | Insistent guidance with increasing urgency |
| Block | Blocks ONLY destructive operations | rm -rf, git push --force, .env writes — never blocks normal tools |
| Learn | Adapts thresholds to each agent | Tracks intervention outcomes, tunes over time |
| Predict | Warns ~5 actions before escalation | Linear trend + pattern detection (error streaks, thrashing, blind writes) |

What SOMA Catches

These are real messages SOMA injects into the agent's context:

[do] Read main.py and config.py before editing — 3 writes without a Read
[do] STOP retrying, try a different approach — 4 consecutive Bash failures
[do] Read the file, plan ALL changes, then make ONE edit — edited app.py 5x
[do] Start writing code — 7 reads, 0 writes in last 10 actions
[do] Verify you're still on track — 15 mutations with no user check-in
[predict] escalation in ~5 actions (error_streak) — stop retrying the failing approach
[scope]   scope expanded to tests/, config/ — is this intentional? If not, refocus
[quality] grade=D (2 syntax errors, 3/8 bash commands failed)
[✓] good — read before writing, clean edits

The agent reads these and changes its behavior. That's the feedback loop — not a human reading logs after the fact.


Quick Start

Claude Code (zero code)

uv tool install soma-ai
soma setup-claude

That's it. Phase-aware status line appears immediately:

SOMA: #42 [implement] ctx=73% focused

Python SDK (any agent)

import anthropic, soma

client = soma.wrap(anthropic.Anthropic())
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    messages=[...],
)
# Every API call is monitored

Why SOMA?

AI agents are powerful but fragile. They loop. They edit files blind. They retry failing commands endlessly. They drift from the task. And in multi-agent pipelines, one confused agent cascades failures across the entire system.

Existing solutions don't close the loop:

| Approach | Observes behavior? | Tells the agent? | Guides actions? | Adapts? | Multi-agent? |
|---|---|---|---|---|---|
| Guardrails (NeMo, Lakera) | Prompt-level only | No | Content filter | No | No |
| Observability (LangSmith, Helicone) | Yes | No | No | No | Partial |
| Rate limiters | No | No | Token cap | No | No |
| SOMA | 5 signals | 7 pattern warnings | 4-mode guidance | Self-learning | Trust graph |

The Guidance System

SOMA doesn't just alert. It guides — progressively increasing urgency as pressure rises, but never blocking your normal workflow. SOMA is always present — even at low pressure it provides actionable metrics and positive feedback, not silence.

  0%          25%         50%           75%          budget=0
  │           │           │             │               │
  ▼           ▼           ▼             ▼               ▼
OBSERVE      GUIDE       WARN         BLOCK          SAFE_MODE
metrics    suggestions  insistent   destructive ops   budget gone
+ [✓]      never blocks never blocks only             read-only
| Mode | Pressure | What SOMA Does |
|---|---|---|
| OBSERVE | 0-24% | All tools allowed. Status line shows vitals. Actionable metrics: ctx=73% focus=focused. Positive feedback: [✓] good — read before writing. |
| GUIDE | 25-49% | Soft suggestions injected into context. "Read before every Write/Edit." Never blocks anything. Workflow-aware — severity suppressed during planning phases. |
| WARN | 50-74% | Insistent warnings with increasing urgency. "Pressure is high — slow down and verify." Still never blocks normal tools. |
| BLOCK | 75%+ | Blocks ONLY destructive operations: rm -rf, git push --force, .env file writes. Write, Edit, Bash, Agent — all still work. |
| SAFE_MODE | Budget gone | Nothing runs until budget restored. |

The key insight: agents respond to guidance. You don't need to block Edit to stop blind writes — you tell the agent to read first, and it does. Blocking normal tools just makes the agent less capable without making it safer.
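The mode ladder can be sketched in a few lines. This is an illustrative sketch, not SOMA's actual API: `Mode`, `select_mode`, `allows`, and the `DESTRUCTIVE` set are assumed names, while the thresholds are the defaults from soma.toml.

```python
from enum import Enum


class Mode(Enum):
    OBSERVE = "observe"
    GUIDE = "guide"
    WARN = "warn"
    BLOCK = "block"


# Illustrative stand-in for SOMA's destructive-operation list.
DESTRUCTIVE = {"rm -rf", "git push --force"}


def select_mode(pressure: float) -> Mode:
    """Map aggregate pressure in [0, 1] to a guidance mode (default thresholds)."""
    if pressure >= 0.75:
        return Mode.BLOCK
    if pressure >= 0.50:
        return Mode.WARN
    if pressure >= 0.25:
        return Mode.GUIDE
    return Mode.OBSERVE


def allows(mode: Mode, command: str) -> bool:
    """Even in BLOCK, only destructive operations are refused."""
    return not (mode is Mode.BLOCK and command in DESTRUCTIVE)
```

Note that `allows` refuses nothing below BLOCK: the ladder escalates language, not restrictions.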


Predictive Intervention

SOMA warns ~5 actions before problems happen:

| Pattern | Boost | What It Tells the Agent |
|---|---|---|
| error_streak | +15% | "stop retrying the failing approach, try something different" |
| retry_storm | +12% | "investigate the root cause instead of retrying" |
| blind_writes | +10% | "Read the target files before editing" |
| thrashing | +8% | "plan the complete change first, then make one clean edit" |

Read-context awareness eliminates false positives — edits after reads are not flagged as blind writes.

Linear trend extrapolation + pattern detection. Confidence-weighted — only warns when the data justifies it.
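A minimal sketch of the P̂ = P + slope·h + boost rule, assuming an ordinary least-squares slope over recent pressure samples. The fitting method and the `predict_pressure` name are assumptions; the 5-action horizon and the pattern boosts come from the table above.

```python
def predict_pressure(history: list[float], horizon: int = 5,
                     boost: float = 0.0) -> float:
    """Extrapolate pressure `horizon` actions ahead: last value plus a
    linear trend, plus any pattern boost, clipped to [0, 1]."""
    if not history:
        return min(1.0, max(0.0, boost))
    if len(history) == 1:
        return min(1.0, max(0.0, history[0] + boost))
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    # Ordinary least-squares slope of pressure over action index.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    return min(1.0, max(0.0, history[-1] + slope * horizon + boost))
```

A steady climb of 0.05 per action plus an error_streak boost of +15% projects to 0.85 five actions out, crossing the WARN band before the agent actually gets there.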


Self-Learning

Static thresholds produce false positives. SOMA adapts its thresholds per agent to eliminate them:

Escalation → wait 5 actions → pressure dropped?
                                 │
                    ┌────────────┴────────────┐
                    ▼                         ▼
               YES (helped)             NO (false positive)
            lower threshold             raise threshold
           (catch earlier)             (fewer false alarms)

Adaptive step size. Bounded +/-0.10 max shift. After ~15 interventions, SOMA converges to agent-specific thresholds.
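The adaptation loop above can be sketched as follows. The step size and the `adapt_threshold` name are assumptions; the ±0.10 bound comes from the text.

```python
def adapt_threshold(threshold: float, base: float, helped: bool,
                    step: float = 0.02, max_shift: float = 0.10) -> float:
    """Lower the threshold after a helpful escalation (catch problems
    earlier), raise it after a false positive (fewer false alarms),
    never drifting more than max_shift from the configured base."""
    delta = -step if helped else step
    lo, hi = base - max_shift, base + max_shift
    return min(hi, max(lo, threshold + delta))
```

After roughly 15 outcomes the value settles wherever this particular agent's behavior puts it, which is the convergence described above.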


Enterprise: Multi-Agent Systems

Running 5, 10, 50 agents? SOMA was built for this. Here's what it gives you that nothing else does:

The Problem at Scale

When a planning agent hallucinates requirements, the coding agent implements them faithfully, the testing agent burns cycles on hallucinated features, and the deployment agent ships it. By the time a human notices, you've burned hours and dollars. No one is watching the agents watch each other.

What SOMA Gives Enterprise Teams

Multi-Agent Pressure Propagation — when one agent spirals, downstream agents get warned before they inherit the chaos
from soma import SOMAEngine

engine = SOMAEngine()
engine.register_agent("planner")
engine.register_agent("coder")
engine.register_agent("reviewer")

# Trust graph: problems propagate downstream
engine.graph.add_edge("planner", "coder", trust=0.8)
engine.graph.add_edge("coder", "reviewer", trust=0.6)
  • Pressure flows along trust-weighted edges (damping: 0.60)
  • Trust decays 2.5x faster than it recovers — trust is easy to lose, hard to earn
  • When your planner spirals, the coder gets warned before the bad outputs arrive
  • No manual intervention needed — the graph handles it automatically

Without SOMA: planner hallucinates → coder implements garbage → reviewer wastes time → you find out an hour later. With SOMA: planner's pressure rises → coder's effective pressure rises → coder gets guided → pipeline self-corrects.
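A minimal sketch of trust-weighted propagation under these numbers. The 0.60 damping factor comes from the text; everything else (the function name, the additive combination, the cap at 1.0) is an assumption.

```python
DAMPING = 0.60  # from the text: pressure flows along trust-weighted edges


def effective_pressure(own: float,
                       upstream: list[tuple[float, float]]) -> float:
    """Own pressure plus damped, trust-weighted pressure inherited from
    upstream agents; `upstream` is a list of (pressure, trust) pairs."""
    inherited = sum(p * trust * DAMPING for p, trust in upstream)
    return min(1.0, own + inherited)
```

With the edge above (planner → coder, trust 0.8), a planner at 0.8 pressure pushes a calm coder from 0.1 to about 0.48, well into GUIDE territory before any bad output arrives.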

Agent Fingerprinting — catches behavioral shifts that simple monitoring misses

Persistent behavioral signature per agent:

  • Tool distribution (Read 45%, Edit 30%, Bash 15%, ...)
  • Error rate baseline
  • Read/write ratios
  • Session length norms

Jensen-Shannon divergence catches subtle distribution shifts. Your code-review agent suddenly doing 80% Bash? SOMA flags it instantly.

Use cases: prompt injection detection, model regression, unintended behavioral drift after config changes.
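Jensen-Shannon divergence itself is standard and easy to sketch. The tool distributions below are illustrative; SOMA's actual smoothing and alert threshold are not shown.

```python
import math


def jsd(p: dict[str, float], q: dict[str, float],
        eps: float = 1e-12) -> float:
    """Jensen-Shannon divergence (base 2, so bounded by 1.0) between two
    tool-usage distributions given as {tool_name: probability} dicts."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2.
    m = {k: (p.get(k, 0.0) + q.get(k, 0.0)) / 2 for k in keys}

    def kl(a: dict[str, float]) -> float:
        # KL(a || m); eps guards against log of zero.
        return sum(a[k] * math.log2((a[k] + eps) / (m[k] + eps))
                   for k in keys if a.get(k, 0.0) > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)
```

A review agent whose Bash share jumps from 15% to 80% scores well above zero here, which is exactly the kind of distribution shift the fingerprint flags.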

Root Cause Analysis — plain English diagnostics that agents can act on
"stuck in Edit→Bash→Edit loop on config.py (3 cycles)"
"error cascade: 4 consecutive Bash failures (error_rate=40%)"
"blind mutation: 5 writes without reading (foo.py, bar.py)"
"behavioral drift=0.25 driven by uncertainty=0.30"

5 detectors ranked by severity. These go directly into the agent's context — the agent self-corrects without human involvement.

Task Phase Detection — detects when agents wander off-task

SOMA infers current phase (research → implement → test → debug) and tracks file focus:

[scope] scope expanded to tests/, config/ — is this intentional? If not, refocus
[phase] switched from implement to debug — unexpected shift

Workflow-aware severity: warnings are suppressed during planning phases to avoid false positives when broad exploration is expected.

For enterprise: ensures each agent stays in its lane. A coding agent that starts "researching" unrelated files gets flagged.

Budget Management — per-agent limits with automatic SAFE_MODE
client = soma.wrap(client, budget={"tokens": 500_000, "cost_usd": 25.00})
  • Automatic SAFE_MODE when any budget dimension exhausted
  • Burn rate projection detects overspend trajectory early
  • Per-agent and per-pipeline tracking

A runaway agent hits its budget limit → SAFE_MODE → pipeline continues with other agents.
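Burn-rate projection reduces to a one-line rate check. This sketch covers only the cost dimension; the function name and parameters are assumptions, since SOMA's real tracker is multi-dimensional.

```python
def projected_overspend(spent_usd: float, elapsed_s: float,
                        budget_usd: float, run_length_s: float) -> bool:
    """True when the current burn rate would exhaust the budget before
    the planned run length elapses."""
    if elapsed_s <= 0:
        return False
    burn_rate = spent_usd / elapsed_s          # dollars per second
    return burn_rate * run_length_s > budget_usd
```

Five dollars spent in the first 10 minutes of a planned hour projects to $30, so a $25 budget trips the warning long before SAFE_MODE is forced.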

Why This Matters for Enterprise

| Without SOMA | With SOMA |
|---|---|
| Agent loops for 30 minutes before anyone notices | Loop detected at iteration 3, agent guided to change approach |
| $500 API bill from a retry storm overnight | Budget SAFE_MODE after $25, agent stops automatically |
| Planner hallucinates → entire pipeline builds garbage | Planner's pressure propagates, coder gets warned before bad outputs arrive |
| Post-mortem: "the agent edited 47 files it shouldn't have" | Real-time: "scope expanded to unrelated dirs — is this intentional?" |
| "Which agent caused the cascade failure?" | RCA: "error cascade: 4 consecutive failures in coder (error_rate=40%)" |

Claude Code Integration

SOMA is a native Claude Code extension — 4 lifecycle hooks, status line, and slash commands.

uv tool install soma-ai && soma setup-claude

Lifecycle Hooks

| Hook | When | What It Does |
|---|---|---|
| PreToolUse | Before tool execution | Blocks destructive operations under high pressure |
| PostToolUse | After tool completes | Records action, validates code (py_compile + ruff), computes vitals |
| UserPromptSubmit | Before agent reasons | Injects pressure, predictions, RCA, and quality diagnostics |
| Stop | Session ends | Saves state, updates fingerprint, prints session summary |

Status Line (always visible)

SOMA: #42 [implement] ctx=73% focused

Phase-aware header with actionable metrics — shows current task, phase, context usage, and focus state at a glance.

Slash Commands

| Command | Description |
|---|---|
| /soma:status | Live pressure, quality, vitals, budget, tips |
| /soma:config | View/change settings in-session |
| /soma:config mode strict | Low thresholds, verbose, human-in-loop |
| /soma:config mode relaxed | Balanced monitoring (default) |
| /soma:config mode autonomous | Minimal monitoring for trusted runs |
| /soma:control reset | Reset behavioral baseline |
| /soma:help | Full command reference |

CLI Commands

soma setup-claude    # Install hooks + slash commands into Claude Code
soma status          # Show current pressure, mode, quality
soma doctor          # Diagnose installation and configuration issues
soma reset           # Reset baselines to defaults
soma start           # Start SOMA monitoring
soma stop            # Stop SOMA monitoring
soma uninstall-claude # Remove SOMA hooks from Claude Code

Operating Modes

| Mode | Block At | Approval Model | Best For |
|---|---|---|---|
| strict | 60% | Human-in-the-loop | Production, sensitive codebases |
| relaxed | 80% | Human-on-the-loop | Daily development (default) |
| autonomous | 95% | No approvals | Trusted CI/CD pipelines |

Full details: Claude Code Layer deep-dive · Hook Reference


Dogfooding

SOMA monitors the agent that builds it. This README, the test suite, the banner, every commit — all produced by Claude Code under SOMA's watch.

Real observations from development sessions:

  • Blind writes caught: SOMA flagged when the agent edited files without reading them first — the agent stopped and read the file
  • Scope drift detected: Working on docs, the agent started touching CLI code — SOMA flagged it, agent refocused
  • Bash loops prevented: Agent retried a failing command — SOMA warned at attempt 2, the agent changed approach
  • Positive feedback works: Agent gets [✓] good — clean edits and maintains good habits through the session
  • Read-context aware: No false positives for edits after reads — SOMA knows the agent already read the file

The feedback loop works. The agent is measurably more careful when SOMA is watching.


Configuration

soma.toml in your project root — everything is tunable:

[hooks]
verbosity = "normal"      # minimal | normal | verbose
validate_python = true    # syntax check written Python files
lint_python = true        # ruff check after writes
predict = true            # predictive warnings
quality = true            # A-F quality grading

[budget]
tokens = 1_000_000
cost_usd = 50.0

[thresholds]              # pressure levels for mode transitions
guide = 0.25
warn = 0.50
block = 0.75

[weights]                 # signal importance in pressure
uncertainty = 2.0
drift = 1.8
error_rate = 1.5
cost = 1.0
token_usage = 0.8

The Math

No neural networks. No black boxes. Every formula is documented and tested.

| Formula | What It Does |
|---|---|
| P = 0.7·mean(wᵢpᵢ) + 0.3·max(pᵢ) | Aggregate pressure — catches both gradual and acute failures |
| z = (x − μ) / max(σ, 0.05), p = sigmoid(z) | Signal normalization — adapts to each agent's baseline |
| μₜ = 0.15·x + 0.85·μₜ₋₁ | EMA baseline — half-life of ~4.3 observations |
| P̂ = P + slope·h + boost | Prediction — linear trend + pattern boosts |
| Q = (w·Qw + b·Qb) · penalty | Quality — write/bash success with syntax penalty |

Complete derivations in Technical Reference. Theoretical foundations in Research Paper.
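As a worked example of the aggregate pressure formula, using the default weights from the [weights] section of soma.toml. One assumption: whether SOMA divides the weighted mean by the signal count or the weight sum is not stated here, so this sketch divides by the count.

```python
import math

WEIGHTS = {"uncertainty": 2.0, "drift": 1.8, "error_rate": 1.5,
           "cost": 1.0, "token_usage": 0.8}


def aggregate_pressure(signals: dict[str, float]) -> float:
    """P = 0.7 * mean(w_i * p_i) + 0.3 * max(p_i), clipped to [0, 1]."""
    weighted = [WEIGHTS[name] * value for name, value in signals.items()]
    mean_term = sum(weighted) / len(weighted)
    max_term = max(signals.values())
    return min(1.0, 0.7 * mean_term + 0.3 * max_term)


# One acute signal dominates via the max term even when the mean is modest:
signals = {"uncertainty": 0.2, "drift": 0.2, "error_rate": 0.9,
           "cost": 0.1, "token_usage": 0.1}
# 0.7 * 0.458 + 0.3 * 0.9 = 0.5906, i.e. WARN territory

# Sanity check on the EMA row: alpha = 0.15 gives a half-life of
# ln(0.5) / ln(0.85) ≈ 4.27 observations, matching the "~4.3" above.
HALF_LIFE = math.log(0.5) / math.log(0.85)
```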


Test Results

568 tests. 0 failures. 0.70 seconds.

Every formula, threshold, edge case, and integration path is covered.

16 stress scenarios validate behavior under extreme conditions: rapid action sequences, budget exhaustion, pressure spikes, loop detection, and multi-agent propagation.

72KB of Claude Code integration tests simulate complete hook workflows end-to-end.

test_engine.py         ✓ Core pipeline
test_pressure.py       ✓ Z-score, sigmoid, aggregation
test_vitals.py         ✓ All 5 signals
test_baseline.py       ✓ EMA, cold-start
test_guidance.py       ✓ Mode transitions, blocking
test_learning.py       ✓ Threshold adaptation
test_predictor.py      ✓ Trend, patterns
test_quality.py        ✓ A-F grading
test_rca.py            ✓ Root cause analysis
test_fingerprint.py    ✓ JSD, divergence
test_graph.py          ✓ Multi-agent propagation
test_budget.py         ✓ Budget, SAFE_MODE
test_wrap.py           ✓ Anthropic + OpenAI
test_stress.py         ✓ 16 stress scenarios
test_claude_code_*.py  ✓ Full integration
test_hooks_*.py        ✓ All 4 hooks
test_cli.py            ✓ CLI + TUI
test_modes.py          ✓ Operating modes

Architecture

soma/
├── engine.py          Core pipeline — the brain
├── pressure.py        Pressure aggregation (weighted mean + max)
├── vitals.py          5 behavioral signal computations
├── baseline.py        EMA baselines with cold-start blending
├── guidance.py        4-mode guidance system (OBSERVE → GUIDE → WARN → BLOCK)
├── patterns.py        Behavioral pattern detection and [do] directive injection
├── findings.py        Structured findings with severity and context
├── context.py         Read-context tracking and phase-aware awareness
├── learning.py        Self-tuning threshold adaptation
├── predictor.py       5-action-ahead pressure prediction
├── quality.py         A-F code quality grading
├── rca.py             Root cause analysis (plain English)
├── task_tracker.py    Task phase and scope drift detection
├── fingerprint.py     Agent behavioral signatures (JSD)
├── graph.py           Multi-agent pressure propagation
├── budget.py          Multi-dimensional budget tracking
├── wrap.py            Universal client wrapper
├── hooks/             Claude Code lifecycle hooks
└── cli/               Terminal UI and commands

3 dependencies: rich + tomli-w + textual. Everything else is stdlib.


Documentation

| Document | What's Inside |
|---|---|
| :mortar_board: Research Paper | Problem statement, biological/control-theory inspiration, formal models, evaluation, related work, 8 references |
| :triangular_ruler: Technical Reference | Every formula with source file:line references, all constants, formal properties (boundedness, monotonicity, convergence) |
| :book: User Guide | Setup, pressure model explained, baselines, learning, configuration, CLI commands, file paths |
| :wrench: API Reference | Every class and method with code examples — SOMAEngine, Action, Mode, Budget, Predictor, Quality, Fingerprint |
| :robot: Claude Code Layer | How SOMA integrates with Claude Code — what the agent sees, 7 patterns, code validation, operating modes, Claude's own perspective |
| :electric_plug: Hook Reference | All 4 Claude Code hooks — input/output format, configurable features, silence conditions, examples |
| :world_map: Roadmap | 6 milestones through 2027 — Foundation (done), Agent Intelligence (done), Real-World Ready, Ecosystem, Intelligence, Platform |

Requirements

  • Python >= 3.11
  • Claude Code (for hook integration) — optional
  • ruff (for lint validation) — optional

No API keys. No accounts. No telemetry. No network requests.

License

MIT


Stop watching your agents fail. Start guiding them.

pip install soma-ai

Built for Claude Code by tr00x
