Deterministic, framework-agnostic detector of multi-agent coordination pathologies that trips at iteration 2
looptrip
Deterministic, framework-agnostic detection of multi-agent coordination pathologies — caught at iteration 2, not on the invoice.
looptrip watches a multi-agent run as a stream of normalized events and flags the coordination pathologies that make agent systems burn money and spin: duplicate-work loops, ping-pong / livelock, deadlock, and non-termination. It is detection-first — it works over data you already have (OpenTelemetry GenAI spans, or a CAST cast.db) — and deterministic / zero-LLM: the same event stream always yields the same verdict. looptrip is an observer, never a gate; it reports, it never blocks.
This release (0.1.1) ships full pathology coverage (duplicate-work, ping-pong / livelock, deadlock, non-termination), configurable sensitivity controls, counterfactual-replay attribution (via the `attribute` subcommand), and the `cast.db` adapter with reproducible proof on real data. It also ships OpenTelemetry support (Phase 4): an offline adapter (`OTelSpanAdapter`, ingesting flat span dicts and OTLP/JSON exports) and a live `LooptripSpanProcessor` for in-flight detection, both available in the `looptrip[otel]` extra. (0.1.0 was published before the Phase 4 code merged and shipped without the OTel modules; 0.1.1 is the first artifact to actually include them.)
The headline
On two real recorded multi-agent runaway sessions, a single workflow-subagent dispatch recurred 54 and 49 times with no progress between repeats. Tripping at the second dispatch — the first repeat — instead of letting the loop run to exhaustion would have saved:
| session | runaway loop | dispatches | trip point | saved |
|---|---|---|---|---|
| 2e6c0288 | workflow-subagent | 54 | dispatch #2 | $320.16 |
| da27b414 | workflow-subagent | 49 | dispatch #2 | $472.80 |
| total | | | | $792.96 |
Reproduce it yourself — no database required, the data is a committed fixture:
pip install -e .
looptrip proof
Why "iteration 2"
Native runaway guards are blunt total-step counters that trip at N=10–25, after the waste has compounded. looptrip's trip is a safety predicate keyed on the pathology signature: no signature (agent, tool, args_hash) may recur without an intervening progress delta. The instant a signature is seen a second time (within a configurable input-token tolerance, with no progress marker between), the detector fires, before the third wasted turn and before the O(N²) context-cost compounding. "2" is the default threshold, not a magic number. What matters is the approach: signature-keyed detection with configurable thresholds. The detector itself is not the moat; the durable asset is standards engagement, meaning adopting the upstream OpenTelemetry GenAI gen_ai.agent.handoff.* convention and contributing the agent-loop pathology layer (pending-wait and loop-termination semantics) that looptrip uniquely detects.
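The predicate described above can be sketched in a few lines of stdlib Python. This is an illustrative sketch, not looptrip's actual code: the `Event` class, the `first_trips` function, the `progress` flag, and the way the token tolerance is applied (a repeat only counts when its input-token count is within the tolerance of the previous occurrence) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Hashable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical normalized event; field names are illustrative only."""
    agent: str
    tool: str
    args_hash: Optional[str]
    input_tokens: int
    progress: bool = False  # True when this event marks a progress delta


def first_trips(
    events: Iterable[Event],
    token_tolerance: int = 0,
    threshold: int = 2,
) -> Iterator[tuple[int, Event]]:
    """Yield (index, event) when a signature reaches `threshold` repeats with no progress."""
    # signature -> (repeat count, input tokens of the last occurrence)
    seen: dict[Hashable, tuple[int, int]] = {}
    for i, ev in enumerate(events):
        if ev.progress:
            seen.clear()  # assumption: a progress delta resets every pending signature
            continue
        sig = (ev.agent, ev.tool, ev.args_hash)
        count, last_tokens = seen.get(sig, (0, ev.input_tokens))
        if abs(ev.input_tokens - last_tokens) <= token_tolerance:
            count += 1
        else:
            count = 1  # token drift outside the tolerance is treated as a fresh attempt
        seen[sig] = (count, ev.input_tokens)
        if count == threshold:  # fire once, exactly when the threshold is reached
            yield i, ev


looping = [
    Event("planner", "search", "h1", 1000),
    Event("planner", "search", "h1", 1002),
    Event("planner", "search", "h1", 1003),
]
print([i for i, _ in first_trips(looping, token_tolerance=5)])  # -> [1]
```

The sketch fires at index 1, the second dispatch, and stays silent afterwards, which mirrors the "trip at the first repeat" behaviour. Raising `threshold` gives a more conservative detector.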
The worst real runaways are the hardest to catch: a workflow-subagent loop emits no structured handoff contract at all. So looptrip detects from the (agent, ts) repeat signal plus input-token variance alone; any handoff metadata only enriches the signal — it is never required.
Usage
looptrip proof # reproduce the $792.96 headline on the committed fixture
looptrip scan fixture:<session_id> # scan a session from the packaged fixture
looptrip scan cast-db:<session_id> # scan a live cast.db session (CAST hosts only)
looptrip scan --all fixture:<session_id> # run all four detectors (adds a 'kind' column)
looptrip attribute fixture:<session_id> # attribute pathologies to decisive handoffs (overdetermined = no single one)
looptrip --version
How it works
- One normalized event: `(agent, tool, args_hash, ts, handoff_state)` plus optional cost/token metadata. An adapter maps each source's fields onto this schema, so detection logic never touches source-specific span-attribute renames.
- Detection-first: Phase 1 ships a `cast.db` adapter. Phase 4 (now shipped) adds an offline OTel adapter (`OTelSpanAdapter` ingesting flat span dicts and OTLP/JSON/JSONL exports) and a live `LooptripSpanProcessor` for in-flight pathology detection in the `looptrip[otel]` extra. Because `agent_runs` carries no per-dispatch args, the adapter sets `args_hash=None` and detection leans on the token-variance signal.
- Stdlib state machine: the detector groups events by signature and trips on the 2nd same-signature occurrence with no progress delta. The core is stdlib-only; OpenTelemetry is an optional `[otel]` extra, never imported by the detector.
- False-positive control is first-class: a configurable input-token tolerance, a progress-delta marker, and an `idempotent_agents` allowlist keep legitimately-repeatable work (commits, reviews) from tripping. looptrip is meant to be run detect-then-print and dogfooded before any signal is trusted.
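To make the adapter and false-positive controls concrete, here is a minimal sketch of the normalized event, an adapter for an `agent_runs`-style row, and a repeat check that honours the allowlist. The class and function names (`NormalizedEvent`, `from_agent_run`, `is_suspect_repeat`) and the row column names (`started_at`, `input_tokens`, `cost_usd`) are hypothetical and chosen for illustration; they are not looptrip's real API or the real `cast.db` schema.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class NormalizedEvent:
    """The five core fields plus optional cost/token metadata."""
    agent: str
    tool: Optional[str]
    args_hash: Optional[str]
    ts: float
    handoff_state: Optional[str] = None
    input_tokens: Optional[int] = None
    cost_usd: Optional[float] = None


def from_agent_run(row: Mapping[str, Any]) -> NormalizedEvent:
    """Map one agent_runs-style row onto the schema (column names are assumptions)."""
    return NormalizedEvent(
        agent=row["agent"],
        tool=row.get("tool"),
        args_hash=None,  # the source carries no per-dispatch args
        ts=float(row["started_at"]),
        input_tokens=row.get("input_tokens"),
        cost_usd=row.get("cost_usd"),
    )


def is_suspect_repeat(
    prev: NormalizedEvent,
    cur: NormalizedEvent,
    token_tolerance: int = 0,
    idempotent_agents: frozenset = frozenset(),
) -> bool:
    """True when `cur` looks like a no-progress repeat of `prev`."""
    if cur.agent in idempotent_agents:
        return False  # allowlisted agents may legitimately repeat
    if (prev.agent, prev.tool, prev.args_hash) != (cur.agent, cur.tool, cur.args_hash):
        return False
    if prev.input_tokens is None or cur.input_tokens is None:
        return True  # sketch choice: with no token data, the signature alone decides
    # With args_hash=None the token-variance signal carries the decision.
    return abs(cur.input_tokens - prev.input_tokens) <= token_tolerance


rows = [
    {"agent": "workflow-subagent", "started_at": 1.0, "input_tokens": 5000},
    {"agent": "workflow-subagent", "started_at": 2.0, "input_tokens": 5010},
]
first, second = (from_agent_run(r) for r in rows)
print(is_suspect_repeat(first, second, token_tolerance=50))  # -> True
```

Keeping the adapter as a pure function from a source row to `NormalizedEvent` is what lets the repeat check stay ignorant of where the data came from.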
Honest framing
This project tries hard not to oversell:
- Attribution numbers. Published LLM-prompting baselines for "which handoff broke the run" sit around ~14% — but that is the prompting baseline; structured / deterministic methods reach 29–52%. Adding structure is the lever, and looptrip's deterministic replay (Phase 3) is the limit case of that frontier — not a fix for a permanent ceiling. We don't anchor to "14%."
- Cost numbers. The $792.96 here is verifiable from the committed fixture. Larger figures circulate — e.g. a widely-reported "$47K" agent-loop bill — but those are unverified, and we label them as such.
- Prior art. The market gap is real, but the durable asset is the standard, not the ~200-line detector. A direct competitor exists: Watchtower (MIT, LangGraph-only, trips at 3+ repeats, no handoff contract, no attribution). looptrip differentiates on framework-agnosticism, speed, and standards engagement with the OpenTelemetry GenAI agent-observability conventions: adopting the upstream `gen_ai.agent.handoff.*` handoff identity and contributing the agent-loop pathology layer it uniquely detects.
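One way to picture deterministic replay attribution is a leave-one-out replay: drop each event in turn, re-run the detector, and call an event decisive if the pathology disappears without it. The sketch below assumes that reading; the function names `attribute` and `has_repeat` are hypothetical, and looptrip's real `attribute` subcommand may work differently.

```python
from collections import Counter
from typing import Callable, Sequence, TypeVar, Union

E = TypeVar("E")


def attribute(
    events: Sequence[E],
    detects: Callable[[Sequence[E]], bool],
) -> Union[None, str, list[int]]:
    """Leave-one-out replay over a deterministic detector.

    Returns None when no pathology is detected, a list of decisive indices
    when removing a single event clears it, or "overdetermined" when no
    single removal is enough.
    """
    if not detects(events):
        return None
    decisive = [
        i
        for i in range(len(events))
        if not detects([*events[:i], *events[i + 1:]])  # replay without event i
    ]
    return decisive if decisive else "overdetermined"


def has_repeat(events: Sequence[str]) -> bool:
    """Toy detector: any signature that occurs at least twice."""
    return any(n >= 2 for n in Counter(events).values())


print(attribute(["a", "b", "b"], has_repeat))  # -> [1, 2]
print(attribute(["b", "b", "b"], has_repeat))  # -> overdetermined
```

Because the detector is deterministic, each replay gives the same answer every time, so the attribution is reproducible in a way that prompting-based attribution is not.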
Roadmap
- Phase 1 (SHIPPED): `cast.db` adapter + duplicate-work / iteration-2 detector + reproducible proof.
- Phase 2 (SHIPPED): full pathology coverage (ping-pong / livelock, deadlock, non-termination) + sensitivity controls.
- Phase 3 (SHIPPED): counterfactual replay attribution ("which handoff was decisive").
- Phase 4 (SHIPPED): OpenTelemetry support, with an offline adapter (`OTelSpanAdapter` ingesting flat span dicts, OTLP/JSON, JSONL exports) and a live `LooptripSpanProcessor` (in-flight pathology detection via `on_start` hooks, thread-safe, deduped) in the `looptrip[otel]` extra. Unit and synthetic testing complete; production multi-agent validation pending.
- Phase 5 (SHIPPED): packaging (Claude Code plugin, Homebrew).
- Phase 6 (SHIPPED): documentation (reference deep-dives, examples, architecture notes).
- Phase 7 (repo work merged; upstream engagement in progress): OpenTelemetry GenAI agent-observability semantic-convention engagement. Adopt the upstream `gen_ai.agent.handoff.*` handoff identity (semantic-conventions-genai) and contribute the pathology layer, namely a pending/blocking wait-for state and loop-termination (`gen_ai.agent.finish_reason`) semantics, with looptrip as the deterministic reference implementation.
- Phase 8 (in progress): launch.
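The pending/blocking wait-for state named in Phase 7 lends itself to a textbook wait-for-graph check: a deadlock shows up as a cycle of agents each waiting on the next. The sketch below assumes each agent waits on at most one other agent; `find_wait_cycle` is a hypothetical name, and this is a standard technique shown for illustration, not a description of looptrip's actual deadlock detector.

```python
from typing import Mapping, Optional


def find_wait_cycle(waits_for: Mapping[str, str]) -> Optional[list[str]]:
    """Return the agents forming a wait-for cycle, or None if there is none.

    `waits_for[a] == b` means agent `a` is blocked until agent `b` responds.
    """
    done: set[str] = set()  # agents already proven not to lead into a cycle
    for start in waits_for:
        path: list[str] = []
        index: dict[str, int] = {}  # position of each agent on the current path
        node = start
        while node in waits_for and node not in done:
            if node in index:
                return path[index[node]:]  # walked back onto the path: a cycle
            index[node] = len(path)
            path.append(node)
            node = waits_for[node]
        done.update(path)
    return None


blocked = {"planner": "coder", "coder": "reviewer", "reviewer": "planner"}
print(find_wait_cycle(blocked))  # -> ['planner', 'coder', 'reviewer']
```

Iteration follows dict insertion order, so the same wait-for state always yields the same cycle, in keeping with the deterministic, zero-LLM framing.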
Documentation
- Proof: Reproduce the $792.96 headline. Evidence that the fixture is real and reproducible.
- Usage: CLI and library API reference, adapters, and configuration.
- Architecture: Detector design, event normalization, signature matching, and phase-by-phase roadmap.
- Adapters: Implementing a custom adapter for your event source (OTel spans, custom JSON, etc.).
- Testing: Test structure, mutation sanity, fixture integrity, and independent re-derivation.
- Framing: Attribution, cost baselines, related work (Watchtower), and the role of standards.
- Case Studies: Real runaways, including `workflow-subagent` loops, deadlock scenarios, and non-termination traces.
- Contributing: How to contribute, issue triage, and development setup.
License
Apache-2.0. See LICENSE.