Coordination intelligence for AI. Detection, prevention, governance, and FinOps across single agents, agents calling tools, MCP servers, multi-agent orchestrators, RAG pipelines, and custom buses. CrewAI, LangGraph, custom-orchestrator, and Claude Code adapters.

AgentSonar

The coordination intelligence layer for AI. Detect, prevent, and optimize how AI agents work together — across any framework, in real time. The SDK ships eight detectors today (cyclic delegation, repetitive delegation, redundant tool calls, subagent explosion, stuck tool calls, failed-tool retry storms, context-window cliffs, runaway throughput), opt-in Prevent Mode that auto-blocks the host when a tracked failure trips, and adapters for CrewAI, LangGraph, custom Python orchestrators, and Claude Code. Two lines of code to integrate.

Install

pip install agentsonar               # custom orchestrators — no extras needed
pip install "agentsonar[crewai]"     # for CrewAI (quotes keep zsh from globbing the brackets)
pip install "agentsonar[langgraph]"  # for LangGraph / LangChain
pip install "agentsonar[all]"        # crewai + langgraph

Framework integrations are optional extras — install only what you need. The base install already supports custom orchestrators (hand-rolled Python loops, subprocess pipelines, Celery task DAGs, anything in between).

Try it in 5 seconds

Two equivalent ways — use whichever works for your setup:

agentsonar demo
# or
python -m agentsonar demo

Runs a bundled hello-world: three agents (Researcher → Writer → Reviewer) loop forever, Prevent Mode trips at rotation 5, the demo catches the PreventError and writes a self-contained HTML report you can open in your browser. No config, no API keys, no external dependencies. Useful both as a smoke test (confirms the install works) and as a 30-second walkthrough of what AgentSonar does.

If agentsonar demo says "command not found", use python -m agentsonar demo instead. The two are identical; the bare agentsonar form needs Python's Scripts/ (Windows) or bin/ (Linux/macOS) directory on your PATH, which isn't automatic when you're using a virtual environment you haven't activated, on Windows Store Python, or in some sandboxed installs. The python -m form sidesteps PATH entirely and works everywhere.

Usage

CrewAI

from agentsonar import AgentSonarListener

sonar = AgentSonarListener()
# ...run your crew normally. Detection happens automatically.

→ Full runnable example: CrewAI minimal setup.

LangGraph / LangChain

Two equivalent patterns — pick whichever fits your existing code better.

Pattern 1: monitor() wrapper (recommended if you already have callbacks)

from agentsonar import monitor

graph = monitor(graph)
result = graph.invoke(input)

monitor() wraps the compiled graph so that every invoke/stream/ainvoke/astream call auto-injects AgentSonar's callback into your config, without overriding any callbacks you pass. If you already have your own callbacks:

graph = monitor(graph)
result = graph.invoke(input, config={"callbacks": [my_cb]})
# Both callbacks run: [my_cb, AgentSonarCallback()]
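
The "append, never override" behaviour can be pictured with a small stand-alone sketch. This is not AgentSonar's implementation: merge_callbacks is a hypothetical helper, the callbacks are plain strings standing in for real callback objects, and the sketch assumes callbacks are passed as a list.

```python
from typing import Any, Optional


def merge_callbacks(config: Optional[dict], extra: Any) -> dict:
    """Return a copy of `config` with `extra` appended to its callbacks.

    Hypothetical helper: it only illustrates the merge idea described above.
    """
    merged = dict(config or {})
    # Copy the list so the caller's own config object is never mutated.
    callbacks = list(merged.get("callbacks") or [])
    callbacks.append(extra)
    merged["callbacks"] = callbacks
    return merged


user_config = {"callbacks": ["my_cb"], "recursion_limit": 50}
merged = merge_callbacks(user_config, "sonar_cb")
print(merged["callbacks"])       # ['my_cb', 'sonar_cb']
print(user_config["callbacks"])  # ['my_cb']
```

Appending last matches the documented order, in which your own callback comes first and the monitoring callback second.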

Pattern 2: direct callback injection

from agentsonar import AgentSonarCallback

result = graph.invoke(
    input,
    config={"callbacks": [AgentSonarCallback()]},
)

Use this pattern when you want explicit control over callback order. You're responsible for merging AgentSonar with any existing callbacks yourself — the monitor() wrapper handles that automatically.

→ Full runnable example: LangGraph minimal setup.

Caveats (LangGraph adapter only):

  • Self-loops are captured as delegation edges. Since 0.4.3, a graph node that conditionally routes back to itself (node A → A → A …) produces A→A events that fire Layer 1 (rate limiter) and Layer 2 (edge anomaly) — the textbook "agent stuck in a retry loop" pattern. Layer 3 (cycle detector) doesn't fire on self-loops because a one-node loop isn't a multi-agent coordination cycle. Pre-0.4.3 these self-edges were filtered out entirely, hiding retry storms; if you relied on the silent behavior, filter at the user-code layer before graph.invoke().
  • Tool calls inside a node are captured (since 0.6.3). The LangGraph adapter hooks both on_chain_start (per-node) and on_tool_start / on_tool_end / on_tool_error (per-call). A node that internally calls 5 different tools produces 5 AgentSonar tool-edge events PLUS the node-level event, and the redundant-tool-call / stuck-tool-call / retry-storm detectors all fire on the per-call signal. Pass track_tool_calls=False to the constructor to restore pre-0.6.3 node-level-only behavior. on_llm_start is NOT hooked; sub-LLM-call detail is out of scope.

Custom orchestrator (hand-rolled loops, subprocess, Celery, etc.)

from agentsonar import monitor_orchestrator

sonar = monitor_orchestrator()

# Tell AgentSonar about each agent-to-agent handoff:
sonar.delegation(source="planner", target="researcher")
# ...run your agents normally...
sonar.delegation(source="researcher", target="reviewer")
# ...

sonar.shutdown()   # writes report.json + report.html

Use this when your orchestrator doesn't fit CrewAI's event bus or LangGraph's callback manager — a hand-rolled iteration loop, a Bash/subprocess pipeline, a Celery/FastAPI coordinator, or any custom glue. One .delegation() call per handoff; detection and reports are identical to the framework adapters.

→ Full runnable example: Custom orchestrator minimal setup.


That's the whole API. Zero config required to get started — no API keys, no accounts. Alerts stream to stderr as they fire and land in per-run log files on disk. See Output for everything AgentSonar writes and Configuration for the knobs you can tune.

Output

Every run creates its own session directory under agentsonar_logs/ in your working directory. All artifacts for a single run live together:

your_project/
└── agentsonar_logs/
    ├── .gitignore                                       # auto-written, contains *
    ├── latest                                            # plain text: current run dir name
    ├── run-2026-04-11_07-03-23-ancient-ember/            # most recent run
    │   ├── timeline.jsonl                                # every event, JSONL
    │   ├── alerts.log                                    # human-readable signal-only
    │   ├── report.json                                   # structured summary report
    │   └── report.html                                   # standalone HTML report
    └── run-2026-04-11_07-03-13-quiet-blossom/            # previous run
        └── ...

Live alerts also stream to stderr as they fire, prefixed with [SONAR HH:MM:SS.mmm].

Calling shutdown()

The JSON and HTML reports are generated on shutdown() — so how you end your run determines whether those two files get written.

LangGraph / LangChain: call shutdown() yourself when the run completes. Both patterns expose it:

# Pattern 1: monitor() wrapper — shutdown lives on the wrapper
graph = monitor(graph)
graph.invoke(input)
graph.shutdown()   # ← writes report.json + report.html, closes log files

# Pattern 2: direct AgentSonarCallback, shutdown lives on the callback
sonar = AgentSonarCallback()
graph.invoke(input, config={"callbacks": [sonar]})
sonar.shutdown()   # ← writes report.json + report.html, closes log files

If you forget to call it, timeline.jsonl and the stderr stream are still captured (they flush event-by-event), but the structured report.json / report.html summary files won't be generated for that run.
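
One way to make a forgotten shutdown() less likely is a try/finally wrapper. The sketch below assumes only that the monitored object has a shutdown() method, as documented above; run_then_shutdown and FakeMonitor are hypothetical names, and FakeMonitor stands in for a monitored graph or callback so the example runs without any framework installed.

```python
from typing import Any, Callable


def run_then_shutdown(monitored: Any, run: Callable[[], Any]) -> Any:
    """Call `run()` and always call `monitored.shutdown()` afterwards."""
    try:
        return run()
    finally:
        # Executes on success and on exceptions alike.
        monitored.shutdown()


class FakeMonitor:
    """Stand-in for a monitored graph or callback; only shutdown() matters."""

    def __init__(self) -> None:
        self.shutdown_calls = 0

    def shutdown(self) -> None:
        self.shutdown_calls += 1


def failing_run() -> None:
    raise RuntimeError("node failed")


ok_monitor = FakeMonitor()
result = run_then_shutdown(ok_monitor, lambda: "done")

failed_monitor = FakeMonitor()
try:
    run_then_shutdown(failed_monitor, failing_run)
except RuntimeError:
    pass  # the error still propagates; shutdown() ran first

print(result, ok_monitor.shutdown_calls, failed_monitor.shutdown_calls)
```

With a real graph you would pass the wrapper returned by monitor() and a lambda that calls graph.invoke(...).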

CrewAI — no teardown needed. AgentSonarListener hooks CrewKickoffCompletedEvent on the CrewAI event bus and runs shutdown() automatically when crew.kickoff() finishes. You get the full output set — including report.json and report.html — without any extra code.

What report.html looks like

The standalone HTML report (report.html) is a self-contained page — no external CSS or JavaScript, no network requests, safe to email, archive, or commit as a debugging artifact. Each coordination event renders as a card with its severity, failure class (with hover tooltip), summary, fingerprint, and expandable topology / thresholds blocks. Dark mode respects your system preference and persists across runs. Open it with your browser — or forward to a colleague and they'll see the same interactive view without any install steps.

Realtime tailing

timeline.jsonl is flushed on every event, so you can watch coordination problems land as they happen — useful when a long-running crew is behaving oddly and you want to catch the exact delegation that triggered an alert:

# Unix / macOS — tail the current run's timeline
tail -f "agentsonar_logs/$(cat agentsonar_logs/latest)/timeline.jsonl"

# Only the signal-only (alerts) view
tail -f "agentsonar_logs/$(cat agentsonar_logs/latest)/alerts.log"

# Windows PowerShell: -Wait is the tail-follow equivalent
Get-Content "agentsonar_logs/$(Get-Content agentsonar_logs/latest)/timeline.jsonl" -Wait

Each JSONL line is a self-contained record with ts, level, event, and a data payload — easy to pipe through jq or a custom parser if you want realtime dashboards during a run.
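
As a minimal sketch of such a custom parser, the snippet below reads JSONL lines and tallies them by level. Only the four top-level keys (ts, level, event, data) come from the description above; the sample records, the timestamp format, the INFO level, and the event names are made-up illustrations.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def parse_timeline(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample records, not real AgentSonar output.
sample = [
    '{"ts": "2026-04-11T07:03:23.101", "level": "INFO", "event": "delegation", '
    '"data": {"source": "planner", "target": "researcher"}}',
    '{"ts": "2026-04-11T07:03:24.250", "level": "WARNING", "event": "alert", '
    '"data": {"failure_class": "cyclic_delegation"}}',
    "",
    '{"ts": "2026-04-11T07:03:25.900", "level": "CRITICAL", "event": "alert", '
    '"data": {"failure_class": "cyclic_delegation"}}',
]

records = list(parse_timeline(sample))
level_counts = Counter(record["level"] for record in records)
alerts = [r for r in records if r["level"] in ("WARNING", "CRITICAL")]
print(len(records), len(alerts))  # 3 2
```

To follow a live run, the same function can be fed an open file handle on timeline.jsonl instead of the sample list.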

Clean-run signals

When the run completes without any WARNING or CRITICAL alerts, AgentSonar tells you explicitly in every channel instead of leaving you to wonder:

  • HTML report — a green "✓ No coordination failures detected" banner replaces the event cards.
  • timeline.jsonl — the final session_end record carries "clean_run": true, "warning_count": 0, "critical_count": 0, and a plain-English message field. Downstream parsers can gate on the single boolean instead of walking the whole alerts list.
  • stderr — a green ✓ No coordination failures detected — clean run. line prints right under the summary banner.

Non-clean runs flip clean_run to false and the message includes the severity breakdown (e.g. "2 CRITICAL, 5 WARNING coordination alert(s) detected during this run.").
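
A downstream gate on that boolean might look like the sketch below. The clean_run field and the session_end event name come from the description above; whether the field sits at the top level of the record or inside its data payload is an assumption, so the sketch accepts both. The function names and the exit-code convention are hypothetical.

```python
import json
from typing import Iterable, Optional


def final_session_end(lines: Iterable[str]) -> Optional[dict]:
    """Return the fields of the last session_end record, or None."""
    found = None
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("event") == "session_end":
            payload = record.get("data")
            # Assumption: clean_run may be nested under data or top-level.
            nested = isinstance(payload, dict) and "clean_run" in payload
            found = payload if nested else record
    return found


def exit_code(lines: Iterable[str]) -> int:
    """0 for a clean run, 1 for alerts, 2 if no session_end was found."""
    summary = final_session_end(lines)
    if summary is None:
        return 2
    return 0 if summary.get("clean_run") is True else 1


clean = [
    '{"ts": "t0", "level": "INFO", "event": "session_start", "data": {}}',
    '{"ts": "t1", "level": "INFO", "event": "session_end", '
    '"data": {"clean_run": true, "warning_count": 0, "critical_count": 0}}',
]
noisy = [
    '{"ts": "t1", "level": "INFO", "event": "session_end", '
    '"data": {"clean_run": false, "warning_count": 5, "critical_count": 2}}',
]
print(exit_code(clean), exit_code(noisy), exit_code([]))  # 0 1 2
```

Returning a distinct code when no session_end record exists keeps a truncated log from being mistaken for a clean run.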

Run naming, latest pointer, retention

Run directories are named run-<ISO date>_<time>-<adjective>-<noun> (e.g. run-2026-04-11_07-03-23-ancient-ember). The timestamp sorts chronologically; the slug is memorable enough to say out loud and is deterministic from the session id.
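
If you need the timestamp and slug separately, a regular expression over the directory name is enough. The pattern below is inferred from the single example name shown above and assumes a lowercase two-word slug; parse_run_dir is a hypothetical helper, and the parsed datetime carries no time zone because the docs do not state one.

```python
import re
from datetime import datetime
from typing import Optional, Tuple

# Pattern inferred from the documented example; treat it as an assumption.
_RUN_DIR = re.compile(
    r"^run-(\d{4}-\d{2}-\d{2})_(\d{2}-\d{2}-\d{2})-([a-z]+-[a-z]+)$"
)


def parse_run_dir(name: str) -> Optional[Tuple[datetime, str]]:
    """Split a run directory name into (start time, slug), or return None."""
    match = _RUN_DIR.match(name)
    if match is None:
        return None
    date_part, time_part, slug = match.groups()
    started = datetime.strptime(
        f"{date_part}_{time_part}", "%Y-%m-%d_%H-%M-%S"
    )
    return started, slug


parsed = parse_run_dir("run-2026-04-11_07-03-23-ancient-ember")
print(parsed)
```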

agentsonar_logs/latest is a plain text file pointing at the newest run directory name. Open the newest HTML report with:

# macOS (on most Linux desktops, use xdg-open instead of open)
open "agentsonar_logs/$(cat agentsonar_logs/latest)/report.html"

# Windows PowerShell
Invoke-Item "agentsonar_logs/$(Get-Content agentsonar_logs/latest)/report.html"
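
A platform-independent alternative is to resolve the pointer from Python. The sketch below relies only on the documented layout (a latest text file holding the run directory name, with report.html inside that directory); latest_report_path is a hypothetical helper, and the demo builds a throwaway directory so it runs without AgentSonar installed.

```python
import tempfile
from pathlib import Path


def latest_report_path(log_root: Path) -> Path:
    """Resolve report.html for the run named in the `latest` pointer file."""
    run_name = (log_root / "latest").read_text(encoding="utf-8").strip()
    return log_root / run_name / "report.html"


# Demo against a temporary directory laid out like the tree shown earlier.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "agentsonar_logs"
    run_dir = root / "run-2026-04-11_07-03-23-ancient-ember"
    run_dir.mkdir(parents=True)
    (run_dir / "report.html").write_text("<html></html>", encoding="utf-8")
    (root / "latest").write_text(run_dir.name + "\n", encoding="utf-8")

    report = latest_report_path(root)
    report_exists = report.exists()
    report_run = report.parent.name

print(report_run, report_exists)
```

For a real run, pass Path("agentsonar_logs") and open the result with the standard library: webbrowser.open(report.resolve().as_uri()).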

AgentSonar keeps the 20 most recent runs by default and prunes older ones on every new session. Configure via AGENTSONAR_KEEP_RUNS or config={"keep_runs": N}; set to 0 to disable pruning.
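
The retention rule can be sketched in a few lines. This is an illustration of the documented behaviour (keep the N newest, 0 disables pruning), not AgentSonar's code: runs_to_prune is a hypothetical helper, and it assumes that sorting directory names is equivalent to sorting by time, which holds for the timestamped naming scheme described above.

```python
from typing import List


def runs_to_prune(run_names: List[str], keep_runs: int = 20) -> List[str]:
    """Return the run directory names that fall outside the retention window."""
    if keep_runs <= 0:
        return []  # 0 disables pruning
    # Timestamped names sort chronologically, so reverse order is newest first.
    newest_first = sorted(run_names, reverse=True)
    return newest_first[keep_runs:]


runs = [
    "run-2026-04-11_07-03-13-quiet-blossom",
    "run-2026-04-11_07-03-23-ancient-ember",
    "run-2026-04-10_18-45-02-silent-harbor",  # made-up older run
]
print(runs_to_prune(runs, keep_runs=2))
# ['run-2026-04-10_18-45-02-silent-harbor']
```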

The auto-written .gitignore inside agentsonar_logs/ contains *, so every log and report is git-ignored by default.

Opt-out

If you don't want the JSON/HTML reports, disable them in the config:

sonar = AgentSonarCallback(config={"auto_export_on_shutdown": False})

Manual export

If you want to export at multiple checkpoints during a run, or want the full timeline view (dedupe=False), call the exporters explicitly:

from agentsonar._output.json_export import export_json
from agentsonar._output.html_report import export_html

events = sonar.engine.get_recent_events()

# Summary view (default): one line per root cause
export_json(events)
export_html(events, title="Checkpoint 1")

# Timeline view: every state transition preserved
export_json(events, dedupe=False)
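
To make the difference between the two views concrete, here is a rough sketch of what "one line per root cause" could mean: collapse events that share a coordination_fingerprint and keep the highest severity. It is not the exporter's actual logic. The events are plain dicts, and the severity key name is an assumption; only coordination_fingerprint and failure_class are names taken from this document.

```python
from typing import Dict, Iterable, List

_RANK = {"WARNING": 1, "CRITICAL": 2}


def summarize(events: Iterable[dict]) -> List[dict]:
    """Keep one event per fingerprint, preferring the highest severity."""
    best: Dict[str, dict] = {}
    for event in events:
        key = event["coordination_fingerprint"]
        current = best.get(key)
        if current is None or _RANK[event["severity"]] > _RANK[current["severity"]]:
            best[key] = event
    return list(best.values())


# Hypothetical events; the second fingerprint is invented for the example.
timeline = [
    {"coordination_fingerprint": "sha256:5c102a66e1104c47",
     "failure_class": "cyclic_delegation", "severity": "WARNING"},
    {"coordination_fingerprint": "sha256:0000000000000001",
     "failure_class": "agent_stall", "severity": "WARNING"},
    {"coordination_fingerprint": "sha256:5c102a66e1104c47",
     "failure_class": "cyclic_delegation", "severity": "CRITICAL"},
]

summary = summarize(timeline)
print([(e["failure_class"], e["severity"]) for e in summary])
# [('cyclic_delegation', 'CRITICAL'), ('agent_stall', 'WARNING')]
```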

Configuration reference

Every entry point — monitor(), AgentSonarCallback(), and AgentSonarListener() — accepts an optional config dict to override the defaults. Pass only the keys you want to change; everything else keeps its default.

graph = monitor(graph, config={
    # Rate limits — raise these when you're running at demo speeds
    # (much faster than any real LLM workload) and don't want the
    # circuit breaker to trip before detection has a chance to see
    # the full pattern.
    "per_edge_limit": 99999,    # default 10    events per edge per window
    "global_limit":   99999,    # default 200   events total per window
    "window_size":    180.0,    # default 180.0 seconds

    # Alert severity thresholds — the rotation/event count AT which
    # WARNING and CRITICAL fire. Inclusive `>=` comparison: rotation 5
    # IS the trigger for `warning_threshold=5`, not "after 5 rotations
    # the 6th trips." Same convention as LangGraph's recursion_limit.
    # Lower for tight tests, raise for noisy production workloads.
    #
    # GENERIC (apply to every alert pattern unless a per-pattern key
    # below is also set):
    "warning_threshold":  5,    # default 5  (rotation 5 fires WARNING)
    "critical_threshold": 15,   # default 15 (rotation 15 fires CRITICAL)

    # PER-PATTERN OVERRIDES (added in 0.4.3, optional). Use these when
    # you want different thresholds for different alert patterns —
    # e.g. silence cycle alerts in a noisy prod workload while keeping
    # edge_anomaly tight, or vice versa. Leave a key unset (or None) to
    # fall back to the generic above.
    #
    # Setting `warning_threshold=999` to "disable cycles" is a common
    # mistake — it disables EVERY pattern uniformly. Use the
    # `cycle_*` keys instead to scope the override.
    "cycle_warning_threshold":          None,  # default: use warning_threshold
    "cycle_critical_threshold":         None,  # default: use critical_threshold
    "edge_anomaly_warning_threshold":   None,  # default: use warning_threshold
    "edge_anomaly_critical_threshold":  None,  # default: use critical_threshold

    # Output location and retention
    "log_dir":        ".",      # parent directory for agentsonar_logs/
    "keep_runs":      20,       # most recent run dirs to keep (0 = no pruning)
    "console_output": True,     # stream colored alerts to stderr
    "file_output":    True,     # write timeline.jsonl + alerts.log
    "auto_export_on_shutdown": True,  # write report.json + report.html
})

The same dict works for every entry point:

AgentSonarCallback(config={"per_edge_limit": 99999})
AgentSonarListener(config={"warning_threshold": 3})
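
The threshold rules in the configuration comments (inclusive >= comparison, per-pattern keys falling back to the generic ones) can be sketched as follows. The key names and defaults are the documented ones; effective_threshold and severity_for are hypothetical helpers written for illustration, not part of the SDK.

```python
from typing import Optional

DEFAULTS = {"warning_threshold": 5, "critical_threshold": 15}


def effective_threshold(config: dict, pattern: str, severity: str) -> int:
    """Per-pattern key if set, else the generic key, else the default."""
    generic_key = f"{severity}_threshold"
    override = config.get(f"{pattern}_{generic_key}")
    if override is not None:
        return override
    return config.get(generic_key, DEFAULTS[generic_key])


def severity_for(count: int, config: dict, pattern: str) -> Optional[str]:
    """Inclusive comparison: a count equal to the threshold fires."""
    if count >= effective_threshold(config, pattern, "critical"):
        return "CRITICAL"
    if count >= effective_threshold(config, pattern, "warning"):
        return "WARNING"
    return None


# Silence cycle alerts only; edge_anomaly keeps the generic thresholds.
config = {"cycle_warning_threshold": 999, "cycle_critical_threshold": 999}
print(severity_for(5, config, "edge_anomaly"))  # WARNING
print(severity_for(5, config, "cycle"))         # None
print(severity_for(4, {}, "cycle"))             # None
print(severity_for(15, {}, "cycle"))            # CRITICAL
```

Scoping the override to the cycle_* keys leaves every other pattern at its normal sensitivity, which is the point of the per-pattern keys.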

The rate-limit knobs are the most common override. AgentSonar's default circuit breaker is tuned for real LLM workloads (a few hundred ms per call); scripted demos and unit tests fire events orders of magnitude faster, which trips the breaker before the downstream cycle/repetition detectors get a chance to fire. Bumping per_edge_limit and global_limit to a large value disables that short-circuit for local runs.
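
To see why fast scripted runs trip the breaker, consider a generic sliding-window limiter with the same three knobs. This is a textbook sketch under stated assumptions, not AgentSonar's circuit breaker: WindowedRateLimiter is a hypothetical class, and whether the real breaker trips when a limit is reached or only when it is exceeded is not documented here (the sketch trips when the count exceeds the limit).

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

Edge = Tuple[str, str]


class WindowedRateLimiter:
    """Count events per edge and in total over a sliding time window."""

    def __init__(self, per_edge_limit: int = 10, global_limit: int = 200,
                 window_size: float = 180.0) -> None:
        self.per_edge_limit = per_edge_limit
        self.global_limit = global_limit
        self.window_size = window_size
        self._edge_times: Dict[Edge, Deque[float]] = defaultdict(deque)
        self._all_times: Deque[float] = deque()

    def _evict(self, times: Deque[float], now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while times and now - times[0] >= self.window_size:
            times.popleft()

    def record(self, source: str, target: str, now: float) -> bool:
        """Record one event; return True if either limit is exceeded."""
        edge_times = self._edge_times[(source, target)]
        for times in (edge_times, self._all_times):
            self._evict(times, now)
            times.append(now)
        return (len(edge_times) > self.per_edge_limit
                or len(self._all_times) > self.global_limit)


limiter = WindowedRateLimiter(per_edge_limit=3, window_size=10.0)
tripped = [limiter.record("A", "B", t) for t in (0.0, 1.0, 2.0, 3.0, 20.0)]
print(tripped)  # [False, False, False, True, False]
```

Four events inside one window exceed the limit of three, while the event at t=20 arrives after the earlier ones have aged out. A scripted demo emits its events far closer together than a real LLM workload would, so it hits the limit almost immediately.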

What gets detected

Every detected event carries a failure_class string that names the kind of problem. The classes AgentSonar currently surfaces:

  • cyclic_delegation — Agents are stuck in a loop. A delegates to B, B to C, C back to A. Usually means an exit condition is missing — a reviewer that never approves, a planner that always says "revise".

  • repetitive_delegation — One agent keeps calling another without making progress. A → B fires many times in a short window with no B → A return. Usually means A can't make a decision without B and isn't getting what it needs.

  • redundant_work — The same tool fires repeatedly with no material change in state between calls. Classic example: re-reading the same file 8 times without an edit in between. Often happens after an autocompact strips the file content from context but keeps the reference, so the model re-fetches what it already had.

  • subagent_explosion — Too many subagents have been spawned — either alive at the same time (concurrent), in too short a window (burst), or across too many distinct specialist types (cross-type). Each spawn loads fresh context; uncapped fan-out is the largest single token-burn pattern in production agent sessions.

  • agent_stall — A tool call started but never completed within the timeout. Most common cause: an MCP server hung, a Bash subprocess stuck, or an SSE stream dropped silently. The agent is blocked waiting for a response that will never arrive.

  • cascade_failure — A failed-tool retry storm. A tool keeps failing and the agent keeps retrying with the same input; each failed retry still consumes tokens, so the cycle silently burns cost until the operator notices. Cross-adapter: Claude Code, LangGraph, and the custom-orchestrator adapter all feed it.

  • token_velocity_anomaly — Cumulative context use has crossed a configured fraction of the model's window (warning at 50%, critical at 75% by default). Fires preventatively before Anthropic's autocompact hard-error patch would trigger, so the operator can /clear or scope down. Claude-only in v1.

  • resource_exhaustion — The system is processing events faster than it can sustain. Either one edge is being hammered or total throughput is over budget. Indicates a runaway agent or throttling that's too loose.

Reserved (enum slot only, no detector behind them yet): authority_violation, deadlock.

The HTML report shows a hover tooltip on every failure class badge with the same plain-English description.
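
For readers who want a concrete picture of the cyclic_delegation shape (A to B, B to C, C back to A), the sketch below finds a cycle in a list of delegation edges with a depth-first search. It is a standard graph technique shown for illustration only; the document does not describe how AgentSonar's own cycle detector works, and find_cycle is a hypothetical function. Self-loops are skipped, mirroring the note that a one-node loop is not a multi-agent cycle.

```python
from typing import Dict, List, Optional, Tuple


def find_cycle(edges: List[Tuple[str, str]]) -> Optional[List[str]]:
    """Return one multi-agent cycle as a list of agents, or None."""
    graph: Dict[str, List[str]] = {}
    for source, target in edges:
        if source != target:  # ignore self-loops
            graph.setdefault(source, []).append(target)
            graph.setdefault(target, [])

    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = in_progress
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == in_progress:
                # Back edge: the cycle is the path from nxt to here.
                return path[path.index(nxt):]
            if state[nxt] == unvisited:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = done
        return None

    for node in graph:
        if state[node] == unvisited:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle([("A", "B"), ("B", "C"), ("C", "A")]))  # ['A', 'B', 'C']
print(find_cycle([("A", "B"), ("B", "C")]))              # None
```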

Coordination fingerprint

Every detected failure carries a coordination_fingerprint like sha256:5c102a66e1104c47. It's a stable ID for the failure pattern: the same failure (same agents in the same shape) always produces the same fingerprint, regardless of when it's detected or how many times it re-escalates. You use it for:

  • Dedup. WARNING → CRITICAL escalations share a fingerprint, so the summary view collapses to one row at the highest severity.
  • Root-cause grouping. When a cycle is firing, the repetitive-edge alerts on each cycle edge are suppressed in the summary view.
  • Cross-run correlation. Grep any log file for the fingerprint to find every time the same pattern fired, across sessions.

You never compute fingerprints yourself — they're just stable IDs you can use to talk about "the same failure" across time.
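
Purely to make "same agents in the same shape" concrete, here is one way a stable ID of this kind could be derived. It is not how AgentSonar computes its fingerprints, which this document does not specify: cycle_fingerprint is a hypothetical function, the hashed payload format is invented, and only the output shape (sha256: followed by 16 hex characters) mirrors the example above. The sketch assumes each agent appears once in the cycle.

```python
import hashlib
from typing import List


def cycle_fingerprint(failure_class: str, agents: List[str]) -> str:
    """Hash a cycle so that every rotation of it yields the same ID."""
    # Rotate so the lexicographically smallest agent comes first.
    start = agents.index(min(agents))
    canonical = agents[start:] + agents[:start]
    payload = failure_class + "|" + "->".join(canonical)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return "sha256:" + digest[:16]


first = cycle_fingerprint("cyclic_delegation", ["A", "B", "C"])
rotated = cycle_fingerprint("cyclic_delegation", ["B", "C", "A"])
reversed_cycle = cycle_fingerprint("cyclic_delegation", ["A", "C", "B"])
print(first == rotated, first == reversed_cycle)  # True False
```

Because the input is canonicalized before hashing, the ID does not depend on which agent in the loop happened to be observed first. A cycle running in the opposite direction is a different shape and gets a different ID.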

Host safety

AgentSonar never crashes your app. If anything inside the SDK fails, detection degrades silently to a no-op and your crew / graph / API keeps running. The kill switch is AGENTSONAR_DISABLED=1 (also accepts true, yes, on, enabled, case-insensitive) — set it in the environment to disable AgentSonar without editing code. When running in degraded mode, get_summary() returns {"degraded": True, ...} so you can alert on it from your own dashboards.
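
If your own tooling needs to know whether the kill switch is set, the accepted values listed above can be checked with a few lines of standard-library code. sonar_disabled is a hypothetical helper, not an SDK function, and stripping surrounding whitespace is an assumption about how lenient the real parser is.

```python
import os
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes", "on", "enabled"}


def sonar_disabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if AGENTSONAR_DISABLED holds one of the documented values."""
    env = os.environ if environ is None else environ
    value = env.get("AGENTSONAR_DISABLED", "")
    return value.strip().lower() in _TRUTHY


print(sonar_disabled({"AGENTSONAR_DISABLED": "1"}))     # True
print(sonar_disabled({"AGENTSONAR_DISABLED": "TRUE"}))  # True
print(sonar_disabled({"AGENTSONAR_DISABLED": "0"}))     # False
print(sonar_disabled({}))                               # False
```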

Full minimal setup

Copy-pasteable starting points. The AgentSonar lines are marked; the rest is plain framework code. All three examples are self-contained and runnable as-is (the CrewAI one also needs an OPENAI_API_KEY, as noted in its header comment).

LangGraph minimal setup

# pip install agentsonar[langgraph]
import operator
from typing import Literal
from langchain_core.messages import AIMessage, AnyMessage, HumanMessage
from langgraph.graph import END, START, StateGraph
from typing_extensions import Annotated, TypedDict

from agentsonar import monitor                                    # ← AgentSonar

class State(TypedDict):
    messages: Annotated[list[AnyMessage], operator.add]
    iteration: int

def planner(s):    return {"messages": [AIMessage(content="plan")]}
def researcher(s): return {"messages": [AIMessage(content="research")]}
def reviewer(s):   return {"messages": [AIMessage(content="revise")],
                           "iteration": s.get("iteration", 0) + 1}
def loop(s) -> Literal["planner", "__end__"]:
    return END if s.get("iteration", 0) >= 8 else "planner"

b = StateGraph(State)
for name, fn in [("planner", planner), ("researcher", researcher), ("reviewer", reviewer)]:
    b.add_node(name, fn)
b.add_edge(START, "planner")
b.add_edge("planner", "researcher")
b.add_edge("researcher", "reviewer")
b.add_conditional_edges("reviewer", loop)

graph = monitor(b.compile())                                      # ← AgentSonar
graph.invoke({"messages": [HumanMessage(content="go")], "iteration": 0},
             config={"recursion_limit": 50})
graph.shutdown()                                                  # ← AgentSonar

CrewAI minimal setup

# pip install agentsonar[crewai]
# export OPENAI_API_KEY=sk-...
from crewai import Agent, Crew, Process, Task

from agentsonar import AgentSonarListener                         # ← AgentSonar
sonar = AgentSonarListener()                                      # ← AgentSonar

researcher = Agent(role="Researcher", goal="Gather info on the topic.",
                   backstory="Senior researcher.", allow_delegation=False)
writer     = Agent(role="Writer", goal="Write a short summary.",
                   backstory="Technical writer.", allow_delegation=False)
manager    = Agent(role="Manager", goal="Coordinate researcher and writer.",
                   backstory="Project manager.", allow_delegation=True)

task = Task(
    description="Write a 3-sentence summary of multi-agent coordination. "
                "Delegate research, then writing.",
    expected_output="A 3-sentence summary.",
    agent=manager,
)

# In hierarchical mode the manager goes ONLY in `manager_agent`,
# never in `agents`. Workers go in `agents`.
Crew(agents=[researcher, writer], tasks=[task],
     process=Process.hierarchical, manager_agent=manager).kickoff()
# AgentSonarListener auto-shuts down on CrewKickoffCompletedEvent.

Custom orchestrator minimal setup

# pip install agentsonar
from agentsonar import monitor_orchestrator                         # ← AgentSonar

sonar = monitor_orchestrator()                                      # ← AgentSonar

def run_agent(name, payload):
    # ...your agent logic (subprocess, API call, function, etc.)...
    return {"verdict": "continue"}

# A toy orchestrator loop — replace with your real glue.
pending = ["research"]
while pending:
    task = pending.pop(0)
    sonar.delegation(source="orchestrator", target="researcher")    # ← AgentSonar
    out = run_agent("researcher", task)
    if out["verdict"] == "continue":
        sonar.delegation(source="orchestrator", target="reviewer")  # ← AgentSonar
        verdict = run_agent("reviewer", out)
        if verdict.get("needs_revision"):
            pending.append(task)  # back to researcher; cycle risk

sonar.shutdown()                                                    # ← AgentSonar

monitor_orchestrator() returns a thin adapter with three methods: .delegation(source, target) records a handoff, .get_summary() gives you a dict of current alert counts, and .shutdown() writes the final reports. It's also a context manager (with monitor_orchestrator() as sonar: ...) if you prefer RAII-style cleanup.

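After any of these scripts finishes, open the HTML report (macOS command shown; on most Linux desktops use xdg-open, and on Windows use the PowerShell command from the Output section):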

open "agentsonar_logs/$(cat agentsonar_logs/latest)/report.html"

Status

Closed beta. Schema, public API, and output formats are stable for design partners.

License

Apache-2.0
