AgentSonar

AgentSonar detects coordination failures in multi-agent AI systems — cycles, repetitive delegation, and runaway throughput — before they burn through your token budget. Two lines of code to integrate.

Install

pip install agentsonar[crewai]      # for CrewAI
pip install agentsonar[langgraph]   # for LangGraph / LangChain
pip install agentsonar[all]         # both

Framework integrations are optional extras — install only what you need.

Usage

CrewAI

from agentsonar import AgentSonarListener

sonar = AgentSonarListener()
# ...run your crew normally. Detection happens automatically.

→ Full runnable example: CrewAI minimal setup.

LangGraph / LangChain

Two equivalent patterns — pick whichever fits your existing code better.

Pattern 1: monitor() wrapper (recommended if you already have callbacks)

from agentsonar import monitor

graph = monitor(graph)
result = graph.invoke(input)

monitor() wraps the compiled graph so every invoke / stream / ainvoke / astream call auto-injects AgentSonar's callback into your config — without overriding any callbacks you pass. If you already have your own callbacks:

graph = monitor(graph)
result = graph.invoke(input, config={"callbacks": [my_cb]})
# Both callbacks run: [my_cb, AgentSonarCallback()]

Pattern 2: direct callback injection

from agentsonar import AgentSonarCallback

result = graph.invoke(
    input,
    config={"callbacks": [AgentSonarCallback()]},
)

Use this pattern when you want explicit control over callback order. You're responsible for merging AgentSonar with any existing callbacks yourself — the monitor() wrapper handles that automatically.
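If you do merge by hand, the bookkeeping is small. Here is a self-contained sketch of that merge step; the with_sonar helper is illustrative and not part of AgentSonar's API:

```python
# Illustrative helper (not part of AgentSonar): append a monitoring
# callback to an existing callback list without duplicating it.
def with_sonar(callbacks, sonar_cb):
    """Return a new callback list with sonar_cb appended exactly once."""
    callbacks = list(callbacks or [])           # never mutate the caller's list
    if not any(cb is sonar_cb for cb in callbacks):
        callbacks.append(sonar_cb)              # sonar runs after your callbacks
    return callbacks
```

In real code you would pass the merged list as config={"callbacks": with_sonar([my_cb], AgentSonarCallback())}; the monitor() wrapper performs an equivalent merge for you.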

→ Full runnable example: LangGraph minimal setup.


That's the whole API. Zero config required to get started — no API keys, no accounts. Alerts stream to stderr as they fire and land in per-run log files on disk. See Output for everything AgentSonar writes and Configuration for the knobs you can tune.

Output

Every run creates its own session directory under agentsonar_logs/ in your working directory. All artifacts for a single run live together:

your_project/
└── agentsonar_logs/
    ├── .gitignore                                       # auto-written, contains *
    ├── latest                                            # plain text: current run dir name
    ├── run-2026-04-11_07-03-23-ancient-ember/            # most recent run
    │   ├── timeline.jsonl                                # every event, JSONL
    │   ├── alerts.log                                    # human-readable signal-only
    │   ├── report.json                                   # structured summary report
    │   └── report.html                                   # standalone HTML report
    └── run-2026-04-11_07-03-13-quiet-blossom/            # previous run
        └── ...

Live alerts also stream to stderr as they fire, prefixed with [SONAR HH:MM:SS.mmm].

Calling shutdown()

The JSON and HTML reports are generated on shutdown() — so how you end your run determines whether those two files get written.

LangGraph / LangChain — call shutdown() yourself when the run completes. Both wrapper patterns expose it:

# Pattern 1: monitor() wrapper — shutdown lives on the wrapper
graph = monitor(graph)
graph.invoke(input)
graph.shutdown()   # ← writes report.json + report.html, closes log files

# Pattern 2: direct AgentSonarCallback — shutdown lives on the callback
sonar = AgentSonarCallback()
graph.invoke(input, config={"callbacks": [sonar]})
sonar.shutdown()   # ← writes report.json + report.html, closes log files

If you forget to call it, timeline.jsonl and the stderr stream are still captured (they flush event-by-event), but the structured report.json / report.html summary files won't be generated for that run.
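One defensive habit: wrap the run in try/finally so shutdown() runs even when the graph raises mid-run. The sketch below uses a stand-in object in place of a real monitored graph so it is runnable without the SDK; the shutdown semantics in the comments mirror the description above.

```python
class FakeMonitoredGraph:
    """Stand-in for a monitor()-wrapped graph, for illustration only."""
    def __init__(self):
        self.shut_down = False

    def invoke(self, state, config=None):
        raise RuntimeError("simulated mid-run failure")

    def shutdown(self):
        self.shut_down = True   # in AgentSonar: writes report.json / report.html

graph = FakeMonitoredGraph()
try:
    graph.invoke({"messages": []})
except RuntimeError:
    pass                        # the run died, but we still want the report
finally:
    graph.shutdown()            # reports get written either way
```

In real code, graph would be the monitor()-wrapped graph from Pattern 1, or the callback object from Pattern 2.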

CrewAI — no teardown needed. AgentSonarListener hooks CrewKickoffCompletedEvent on the CrewAI event bus and runs shutdown() automatically when crew.kickoff() finishes. You get the full output set — including report.json and report.html — without any extra code.

What report.html looks like

The standalone HTML report (report.html) is a self-contained page — no external CSS or JavaScript, no network requests, safe to email, archive, or commit as a debugging artifact. Each coordination event renders as a card with its severity, failure class (with hover tooltip), summary, fingerprint, and expandable topology / thresholds blocks. Dark mode respects your system preference and persists across runs. Open it with your browser — or forward to a colleague and they'll see the same interactive view without any install steps.

Realtime tailing

timeline.jsonl is flushed on every event, so you can watch coordination problems land as they happen — useful when a long-running crew is behaving oddly and you want to catch the exact delegation that triggered an alert:

# Unix / macOS — tail the current run's timeline
tail -f "agentsonar_logs/$(cat agentsonar_logs/latest)/timeline.jsonl"

# Only the alerts (signal-only) view
tail -f "agentsonar_logs/$(cat agentsonar_logs/latest)/alerts.log"

# Windows PowerShell — -Wait is the tail-follow equivalent
Get-Content "agentsonar_logs/$(Get-Content agentsonar_logs/latest)/timeline.jsonl" -Wait

Each JSONL line is a self-contained record with ts, level, event, and a data payload — easy to pipe through jq or a custom parser if you want realtime dashboards during a run.
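A few lines of Python are enough to consume that stream. The sketch below assumes the ts / level / event / data field names described above; the sample records themselves are fabricated for illustration, and real payload contents will vary by event type:

```python
import json

# Fabricated timeline.jsonl content, for illustration only
sample = """\
{"ts": "07:03:24.101", "level": "INFO", "event": "delegation", "data": {"src": "planner", "dst": "reviewer"}}
{"ts": "07:03:25.412", "level": "WARNING", "event": "repetitive_delegation", "data": {"edge": "planner->reviewer"}}
{"ts": "07:03:26.003", "level": "CRITICAL", "event": "cyclic_delegation", "data": {"cycle": ["a", "b", "a"]}}
"""

def alerts(lines, levels=("WARNING", "CRITICAL")):
    """Yield only the alert-level records from a timeline.jsonl stream."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)   # each JSONL line is a complete JSON object
        if record.get("level") in levels:
            yield record

hits = list(alerts(sample.splitlines()))
```

The same generator works unchanged over an open file handle that you poll, which is all a realtime dashboard needs.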

Clean-run signals

When the run completes without any WARNING or CRITICAL alerts, AgentSonar tells you explicitly in every channel instead of leaving you to wonder:

  • HTML report — a green "✓ No coordination failures detected" banner replaces the event cards.
  • timeline.jsonl — the final session_end record carries "clean_run": true, "warning_count": 0, "critical_count": 0, and a plain-English message field. Downstream parsers can gate on the single boolean instead of walking the whole alerts list.
  • stderr — a green "✓ No coordination failures detected — clean run." line prints right under the summary banner.

Non-clean runs flip clean_run to false and the message includes the severity breakdown (e.g. "2 CRITICAL, 5 WARNING coordination alert(s) detected during this run.").
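Downstream gating can then stay a one-liner. A hedged sketch of a CI-style check that reads the final session_end record; the field names follow the bullets above, while the helper itself and whether clean_run sits top-level or in the data payload are assumptions, so the sketch checks both:

```python
import json
import os
import tempfile

def run_was_clean(timeline_path):
    """Return the clean_run flag from the last session_end record (False if absent)."""
    clean = False
    with open(timeline_path) as fh:
        for line in fh:
            record = json.loads(line)
            if record.get("event") == "session_end":
                payload = record.get("data") or {}
                clean = bool(record.get("clean_run", payload.get("clean_run", False)))
    return clean

# Fabricated timeline for illustration
rows = [
    {"event": "delegation", "level": "INFO"},
    {"event": "session_end", "clean_run": True, "warning_count": 0, "critical_count": 0},
]
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as fh:
    fh.write("\n".join(json.dumps(r) for r in rows))
clean = run_was_clean(fh.name)
os.unlink(fh.name)
```

A CI job can fail the build with `sys.exit(0 if clean else 1)`.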

Run naming, latest pointer, retention

Run directories are named run-<ISO date>_<time>-<adjective>-<noun> (e.g. run-2026-04-11_07-03-23-ancient-ember). The timestamp sorts chronologically; the slug is memorable enough to say out loud and is deterministic from the session id.

agentsonar_logs/latest is a plain text file pointing at the newest run directory name. Open the newest HTML report with:

# Unix / macOS
open "agentsonar_logs/$(cat agentsonar_logs/latest)/report.html"

# Windows PowerShell
Invoke-Item "agentsonar_logs/$(Get-Content agentsonar_logs/latest)/report.html"

AgentSonar keeps the 20 most recent runs by default and prunes older ones on every new session. Configure via AGENTSONAR_KEEP_RUNS or config={"keep_runs": N}; set to 0 to disable pruning.
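The retention rule can be reasoned about from the directory names alone: the run-<timestamp>-<slug> prefix sorts chronologically, so keeping the newest N is a sort-and-slice. An illustrative sketch of that policy (not AgentSonar's code; the third run name below is fabricated):

```python
def dirs_to_prune(run_dirs, keep_runs=20):
    """Given run-directory names, return the ones an N-most-recent policy would delete."""
    if keep_runs <= 0:                                   # 0 disables pruning, per the docs
        return []
    newest_first = sorted(run_dirs, reverse=True)        # timestamp prefix sorts chronologically
    return newest_first[keep_runs:]

runs = [
    "run-2026-04-11_07-03-23-ancient-ember",
    "run-2026-04-11_07-03-13-quiet-blossom",
    "run-2026-04-10_22-10-00-amber-falcon",
]
pruned = dirs_to_prune(runs, keep_runs=2)
```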

The auto-written .gitignore inside agentsonar_logs/ contains *, so every log and report is git-ignored by default.

Opt-out

If you don't want the JSON/HTML reports, disable them in the config:

sonar = AgentSonarCallback(config={"auto_export_on_shutdown": False})

Manual export

If you want to export at multiple checkpoints during a run, or want the full timeline view (dedupe=False), call the exporters explicitly:

from agentsonar._output.json_export import export_json
from agentsonar._output.html_report import export_html

events = sonar.engine.get_recent_events()

# Summary view (default): one line per root cause
export_json(events)
export_html(events, title="Checkpoint 1")

# Timeline view: every state transition preserved
export_json(events, dedupe=False)

Configuration reference

Every entry point — monitor(), AgentSonarCallback(), and AgentSonarListener() — accepts an optional config dict to override the defaults. Pass only the keys you want to change; everything else keeps its default.

graph = monitor(graph, config={
    # Rate limits — raise these when you're running at demo speeds
    # (much faster than any real LLM workload) and don't want the
    # circuit breaker to trip before detection has a chance to see
    # the full pattern.
    "per_edge_limit": 99999,    # default 10    events per edge per window
    "global_limit":   99999,    # default 200   events total per window
    "window_size":    180.0,    # default 180.0 seconds

    # Alert severity thresholds — how many rotations/events before
    # WARNING and CRITICAL fire. Lower them for tight tests, raise
    # them for noisy production workloads.
    "warning_threshold":  5,    # default 5
    "critical_threshold": 15,   # default 15

    # Output location and retention
    "log_dir":        ".",      # parent directory for agentsonar_logs/
    "keep_runs":      20,       # most recent run dirs to keep (0 = no pruning)
    "console_output": True,     # stream colored alerts to stderr
    "file_output":    True,     # write timeline.jsonl + alerts.log
    "auto_export_on_shutdown": True,  # write report.json + report.html
})

The same dict works for every entry point:

AgentSonarCallback(config={"per_edge_limit": 99999})
AgentSonarListener(config={"warning_threshold": 3})

The rate-limit knobs are the most common override. AgentSonar's default circuit breaker is tuned for real LLM workloads (a few hundred ms per call); scripted demos and unit tests fire events orders of magnitude faster, which trips the breaker before the downstream cycle/repetition detectors get a chance to fire. Bumping per_edge_limit and global_limit to a large value disables that short-circuit for local runs.

What gets detected

Every detected event carries a failure_class string that names the kind of problem. The classes AgentSonar currently surfaces:

  • cyclic_delegation — Agents are stuck in a loop. A delegates to B, B to C, C back to A. Usually means an exit condition is missing — a reviewer that never approves, a planner that always says "revise".

  • repetitive_delegation — One agent keeps calling another without making progress. A → B fires many times in a short window with no B → A return. Usually means A can't make a decision without B and isn't getting what it needs.

  • resource_exhaustion — The system is processing events faster than it can sustain. Either one edge is being hammered or total throughput is over budget. Indicates a runaway agent or throttling that's too loose.

Coming soon (reserved class names; no-op today): cascade_failure, authority_violation, deadlock, agent_stall, token_velocity_anomaly.

The HTML report shows a hover tooltip on every failure class badge with the same plain-English description.
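To make the repetitive_delegation definition concrete, here is a tiny sliding-window edge counter. This is a conceptual sketch only, not AgentSonar's detector; the parameter names merely echo the window_size and warning_threshold knobs described in the configuration reference.

```python
from collections import deque

class EdgeWindow:
    """Count A->B delegations inside a sliding time window (illustration only)."""
    def __init__(self, window_size=180.0, warning_threshold=5):
        self.window_size = window_size
        self.warning_threshold = warning_threshold
        self.events = {}                 # (src, dst) edge -> deque of timestamps

    def record(self, src, dst, ts):
        """Record one delegation; return True when the edge crosses the threshold."""
        q = self.events.setdefault((src, dst), deque())
        q.append(ts)
        while q and ts - q[0] > self.window_size:
            q.popleft()                  # drop events that fell out of the window
        return len(q) >= self.warning_threshold

w = EdgeWindow(window_size=10.0, warning_threshold=3)
fired = [w.record("planner", "reviewer", t) for t in (0.0, 1.0, 2.0, 30.0)]
```

Note how the fourth event does not fire: the earlier three have aged out of the window, which is exactly why a slow, healthy back-and-forth never trips the alert.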

Coordination fingerprint

Every detected failure carries a coordination_fingerprint like sha256:5c102a66e1104c47. It's a stable ID for the failure pattern: the same failure (same agents in the same shape) always produces the same fingerprint, regardless of when it's detected or how many times it re-escalates. You use it for:

  • Dedup. WARNING → CRITICAL escalations share a fingerprint, so the summary view collapses to one row at the highest severity.
  • Root-cause grouping. When a cycle is firing, the repetitive-edge alerts on each cycle edge are suppressed in the summary view.
  • Cross-run correlation. Grep any log file for the fingerprint to find every time the same pattern fired, across sessions.

You never compute fingerprints yourself — they're just stable IDs you can use to talk about "the same failure" across time.
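For intuition about why the same shape always yields the same ID, here is one way such a fingerprint could be computed. This is purely illustrative, not AgentSonar's actual scheme: canonicalize the edge set so discovery order does not matter, then hash and truncate.

```python
import hashlib

def fingerprint(failure_class, edges):
    """Hypothetical stable fingerprint: hash the failure class plus a canonical edge set."""
    canonical = failure_class + "|" + ";".join(sorted(f"{a}->{b}" for a, b in edges))
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return "sha256:" + digest

# Same cycle, different discovery order: identical fingerprint
fp1 = fingerprint("cyclic_delegation", [("a", "b"), ("b", "c"), ("c", "a")])
fp2 = fingerprint("cyclic_delegation", [("c", "a"), ("a", "b"), ("b", "c")])
```

Any scheme with this property (canonical form in, truncated digest out) gives you the dedup and cross-run grep behavior described above.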

Host safety

AgentSonar never crashes your app. If anything inside the SDK fails, detection degrades silently to a no-op and your crew / graph / API keeps running. The kill switch is AGENTSONAR_DISABLED=1 (also accepts true, yes, on, enabled, case-insensitive) — set it in the environment to disable AgentSonar without editing code. When running in degraded mode, get_summary() returns {"degraded": True, ...} so you can alert on it from your own dashboards.
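If your own tooling needs to honor the same kill switch (say, skipping sonar-dependent assertions in CI), the documented value set is easy to mirror. A sketch; treating unset or any other value as "enabled" follows the doc's wording, but the exact parsing is AgentSonar-internal:

```python
import os

# Accepted disable values per the docs: 1, true, yes, on, enabled (case-insensitive)
_TRUTHY = {"1", "true", "yes", "on", "enabled"}

def sonar_disabled(environ=None):
    """Mirror the documented AGENTSONAR_DISABLED parsing (illustrative re-implementation)."""
    environ = os.environ if environ is None else environ
    return environ.get("AGENTSONAR_DISABLED", "").lower() in _TRUTHY
```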

Full minimal setup

Copy-pasteable starting points. The AgentSonar lines are marked — the rest is plain framework code. Both examples are self-contained and runnable as-is.

LangGraph minimal setup

# pip install agentsonar[langgraph]
import operator
from typing import Literal
from langchain_core.messages import AIMessage, AnyMessage, HumanMessage
from langgraph.graph import END, START, StateGraph
from typing_extensions import Annotated, TypedDict

from agentsonar import monitor                                    # ← AgentSonar

class State(TypedDict):
    messages: Annotated[list[AnyMessage], operator.add]
    iteration: int

def planner(s):    return {"messages": [AIMessage(content="plan")]}
def researcher(s): return {"messages": [AIMessage(content="research")]}
def reviewer(s):   return {"messages": [AIMessage(content="revise")],
                           "iteration": s.get("iteration", 0) + 1}
def loop(s) -> Literal["planner", "__end__"]:
    return END if s.get("iteration", 0) >= 8 else "planner"

b = StateGraph(State)
for name, fn in [("planner", planner), ("researcher", researcher), ("reviewer", reviewer)]:
    b.add_node(name, fn)
b.add_edge(START, "planner")
b.add_edge("planner", "researcher")
b.add_edge("researcher", "reviewer")
b.add_conditional_edges("reviewer", loop)

graph = monitor(b.compile())                                      # ← AgentSonar
graph.invoke({"messages": [HumanMessage(content="go")], "iteration": 0},
             config={"recursion_limit": 50})
graph.shutdown()                                                  # ← AgentSonar

CrewAI minimal setup

# pip install agentsonar[crewai]
# export OPENAI_API_KEY=sk-...
from crewai import Agent, Crew, Process, Task

from agentsonar import AgentSonarListener                         # ← AgentSonar
sonar = AgentSonarListener()                                      # ← AgentSonar

researcher = Agent(role="Researcher", goal="Gather info on the topic.",
                   backstory="Senior researcher.", allow_delegation=False)
writer     = Agent(role="Writer", goal="Write a short summary.",
                   backstory="Technical writer.", allow_delegation=False)
manager    = Agent(role="Manager", goal="Coordinate researcher and writer.",
                   backstory="Project manager.", allow_delegation=True)

task = Task(
    description="Write a 3-sentence summary of multi-agent coordination. "
                "Delegate research, then writing.",
    expected_output="A 3-sentence summary.",
    agent=manager,
)

# In hierarchical mode the manager goes ONLY in `manager_agent`,
# never in `agents`. Workers go in `agents`.
Crew(agents=[researcher, writer], tasks=[task],
     process=Process.hierarchical, manager_agent=manager).kickoff()
# AgentSonarListener auto-shuts down on CrewKickoffCompletedEvent.

After either script finishes, open the HTML report:

open "agentsonar_logs/$(cat agentsonar_logs/latest)/report.html"

Status

Closed beta. Schema, public API, and output formats are stable for design partners.

License

Apache-2.0
