Token optimization layer for multi-agent LangGraph systems — cut shared-artifact token costs via MESI cache coherence, one import change

Maintainers

MrVlad

Project description

agent-coherence

When two agents share state, one of them is usually reading a stale copy. agent-coherence makes that visible — and serves the fresh version on the next read instead of rebroadcasting the full artifact every turn.

pip install "agent-coherence[langgraph]"

# Before
from langgraph.store.memory import InMemoryStore
store = InMemoryStore()

# After — one import change, no other code changes
from ccs.adapters import CCSStore
store = CCSStore(strategy="lazy")

That's it. Node code stays identical; store.get(), store.put(), store.search() still work the same. The savings show up immediately on any workload where multiple agents read the same artifact more often than they write it.

$ python -m examples.shared_codebase.main

Example: 4-agent shared-codebase code review

  style_reviewer: 8 files scanned, 4 re-read, findings written
  security_reviewer: 8 files scanned, 4 re-read, findings written
  architecture_reviewer: 8 files scanned, 4 re-read, findings written
  synthesizer: 3 findings read, context re-read (12 issues total)

  CCSStore Benchmark Summary
  ──────────────────────────────────────
  Baseline tokens (no cache):     44702
  CCSStore tokens:                27882
  Tokens saved:                   16820
  Token reduction:                37.6%
  Cache hit rate:                35.3%  (51 get ops)

Saving 16,820 tokens at $3/MTok = $0.050 per run. At 1,000 runs/day: $18K/year on one codebase-review workload.

Baseline: tokens you would pay if every agent re-read every shared artifact from scratch — equivalent to a graph without cross-agent caching. This is what InMemoryStore effectively does.
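The cost figures above follow from simple arithmetic. A minimal sketch that reproduces them from the benchmark summary, assuming the flat $3 per million tokens and 1,000 runs per day stated above; the helper names are illustrative and not part of agent-coherence:

```python
# Illustrative arithmetic only; these helpers are not part of agent-coherence.

def token_reduction(baseline_tokens: int, cached_tokens: int) -> float:
    """Fraction of baseline tokens avoided."""
    return (baseline_tokens - cached_tokens) / baseline_tokens


def dollars_saved(tokens_saved: int, usd_per_mtok: float) -> float:
    """Dollar value of saved tokens at a flat per-million-token price."""
    return tokens_saved * usd_per_mtok / 1_000_000


baseline, cached = 44_702, 27_882
saved = baseline - cached                       # 16,820 tokens
per_run = dollars_saved(saved, usd_per_mtok=3.0)
per_year = per_run * 1_000 * 365                # 1,000 runs/day

print(f"reduction: {token_reduction(baseline, cached):.1%}")  # reduction: 37.6%
print(f"per run:   ${per_run:.3f}")                           # per run:   $0.050
print(f"per year:  ${per_year:,.0f}")                         # per year:  $18,418
```

Swapping in your own model price and run volume gives a rough estimate for a different workload, provided the token counts come from a benchmark of that workload.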

🔧 User guide — installation, strategies, observability, telemetry, examples, full API reference
📊 Real benchmarks — measured on actual LangGraph graphs
🔍 Why coherence matters — the gap across LangGraph, CrewAI, AutoGen, and Claude Agent SDK, with citations
📄 Paper on arXiv (2603.15183) — formal protocol, TLA+ verification, simulation results

How it works

Each agent caches shared artifacts locally, and reads are served from that cache while the copy is fresh. Writes commit to a coordinator, which sends lightweight invalidation signals (~12 tokens) to peers so the next read fetches the new version instead of rebroadcasting the full artifact. Consistency is single-writer-multiple-reader per artifact with bounded staleness: an invalidated peer re-fetches on its next read.
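To make the read and write paths concrete, here is a toy model of invalidate-on-write caching. It is a sketch under stated assumptions, not the ccs implementation: the Coordinator and Agent classes, their methods, and the hit/fetch counters are all hypothetical, and strategies, MESI states, and token accounting are left out.

```python
# Toy model of invalidate-on-write caching. Every class and method name here
# is hypothetical; this is not the ccs implementation.
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Authoritative store that tracks versions and notifies peers."""
    artifacts: dict = field(default_factory=dict)   # key -> (version, value)
    agents: list = field(default_factory=list)

    def commit(self, writer: "Agent", key: str, value: object) -> int:
        version = self.artifacts.get(key, (0, None))[0] + 1
        self.artifacts[key] = (version, value)
        for agent in self.agents:
            if agent is not writer:
                agent.invalidate(key)        # small signal, no payload
        return version

    def fetch(self, key: str) -> tuple:
        return self.artifacts[key]


@dataclass
class Agent:
    name: str
    coordinator: Coordinator
    cache: dict = field(default_factory=dict)       # key -> (version, value)
    hits: int = 0
    fetches: int = 0

    def __post_init__(self) -> None:
        self.coordinator.agents.append(self)

    def invalidate(self, key: str) -> None:
        self.cache.pop(key, None)

    def write(self, key: str, value: object) -> None:
        version = self.coordinator.commit(self, key, value)
        self.cache[key] = (version, value)   # the writer keeps a fresh copy

    def read(self, key: str) -> object:
        if key in self.cache:                # local copy is still fresh
            self.hits += 1
        else:                                # invalidated or never seen
            self.cache[key] = self.coordinator.fetch(key)
            self.fetches += 1
        return self.cache[key][1]


coordinator = Coordinator()
planner = Agent("planner", coordinator)
reviewer = Agent("reviewer", coordinator)

planner.write("plan", {"step": 1})
reviewer.read("plan")                        # first read: fetch
reviewer.read("plan")                        # second read: local hit
planner.write("plan", {"step": 2})           # invalidates reviewer's copy
latest = reviewer.read("plan")               # re-fetch on next read
print(latest, reviewer.hits, reviewer.fetches)   # {'step': 2} 1 2
```

The point of the sketch is that the write sends peers only an invalidation, and the full artifact travels again only when a peer actually reads it.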

Five synchronization strategies ship out of the box: lazy (default), eager, lease (TTL-based), access_count, and broadcast. Pick the one that matches your workload's read/write ratio and freshness needs; see the strategies table for guidance.

Quick start

Namespace convention. namespace[0] is the agent identity; namespace[1:] is the artifact scope. Two agents that use ("planner", "shared") and ("reviewer", "shared") with the same key therefore address the same artifact.

from ccs.adapters import CCSStore

store = CCSStore(strategy="lazy")

# planner writes
store.put(("planner", "shared"), "plan", {"step": 1})

# reviewer reads — same artifact, version 1
store.get(("reviewer", "shared"), "plan")
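The namespace convention can be pictured as splitting each namespace tuple into an agent part and a scope part. A minimal sketch, assuming the artifact is identified by scope plus key; split_namespace and artifact_id are hypothetical helpers, and CCSStore's internal key derivation may differ:

```python
# Hypothetical helpers; CCSStore's internal key derivation may differ.

def split_namespace(namespace: tuple[str, ...]) -> tuple[str, tuple[str, ...]]:
    """Split a namespace into (agent identity, artifact scope)."""
    if not namespace:
        raise ValueError("namespace must contain at least the agent identity")
    return namespace[0], namespace[1:]


def artifact_id(namespace: tuple[str, ...], key: str) -> tuple[tuple[str, ...], str]:
    """Identify an artifact by scope and key, ignoring the agent identity."""
    _agent, scope = split_namespace(namespace)
    return scope, key


planner_view = artifact_id(("planner", "shared"), "plan")
reviewer_view = artifact_id(("reviewer", "shared"), "plan")
print(planner_view == reviewer_view)   # True
```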

Token-savings telemetry. Pass benchmark=True to measure savings on your own graph, or on_metric=callback for per-operation events. Pass telemetry="opentelemetry" or "langsmith" to forward into your existing observability stack.

store = CCSStore(strategy="lazy", benchmark=True)
# ... run your graph ...
store.print_benchmark_summary()

Crash recovery. When an agent crashes (OOM-kill, segfault) or livelocks holding a write grant, the coordinator reclaims it on a heartbeat-based sweep so other agents can proceed:

from ccs.adapters import CCSStore
from ccs.coordinator.service import CrashRecoveryConfig

store = CCSStore(
    strategy="lazy",
    crash_recovery=CrashRecoveryConfig(
        enabled=True,
        heartbeat_timeout_ticks=10,
        max_hold_ticks=1000,
    ),
)

# Heartbeats piggyback on every read/write/batch automatically.
# After a process restart, call recover() to flush the stale cache.
# current_tick is the current tick value, supplied by the caller.
store.recover(agent_name="planner", now_tick=current_tick)

The same crash_recovery= kwarg works on LangGraphAdapter, CrewAIAdapter, AutoGenAdapter, and CoherenceAdapterCore. Crash recovery is opt-in for now: the default is enabled=False.
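To illustrate how the two thresholds interact, here is a toy heartbeat sweep. It is a sketch, not the ccs coordinator: the Grant record, the sweep function, and the strict ">" comparisons are assumptions, while the parameter names and the two reason strings (reclaim_heartbeat, reclaim_max_hold) are taken from this page.

```python
# Toy sweep, not the ccs coordinator. Grant, sweep, and the strict ">"
# comparisons are assumptions made for illustration.
from dataclasses import dataclass


@dataclass
class Grant:
    holder: str
    acquired_tick: int
    last_heartbeat_tick: int


def sweep(grants: dict[str, Grant], now_tick: int,
          heartbeat_timeout_ticks: int, max_hold_ticks: int) -> list[tuple[str, str, str]]:
    """Reclaim stale write grants; return (artifact, holder, reason) tuples."""
    reclaimed = []
    for artifact, grant in list(grants.items()):
        if now_tick - grant.last_heartbeat_tick > heartbeat_timeout_ticks:
            reason = "reclaim_heartbeat"     # holder went silent
        elif now_tick - grant.acquired_tick > max_hold_ticks:
            reason = "reclaim_max_hold"      # held too long, even if alive
        else:
            continue
        del grants[artifact]
        reclaimed.append((artifact, grant.holder, reason))
    return reclaimed


grants = {
    "plan": Grant("planner", acquired_tick=0, last_heartbeat_tick=5),       # silent since tick 5
    "report": Grant("writer", acquired_tick=0, last_heartbeat_tick=1199),   # alive but holding too long
    "notes": Grant("reviewer", acquired_tick=1195, last_heartbeat_tick=1199),
}
events = sweep(grants, now_tick=1200, heartbeat_timeout_ticks=10, max_hold_ticks=1000)
for event in events:
    print(event)
```

With these inputs the sweep reclaims "plan" for a missed heartbeat and "report" for exceeding the hold limit, and leaves "notes" with its holder.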

See docs/guide.md for the full guide: namespace convention, strategies, observability, state transitions log, content audit log, crash recovery, telemetry, graceful degradation, examples, and API reference.

Real-workload benchmarks

Measured on real LangGraph StateGraph executions using GenericFakeChatModel with no live LLM API calls, so the results are reproducible in CI. Run them yourself:

pip install "agent-coherence[langgraph,benchmark]"
make benchmark    # runs all three workloads, prints consolidated table

Or run individually:

python benchmarks/langgraph_real/bench_planner.py
python benchmarks/langgraph_real/bench_code_review.py
python benchmarks/langgraph_real/bench_high_churn.py

Savings scale with read/write ratio:

Workload	Agents	Reads:Writes	Hit rate	Baseline tokens	CCSStore tokens	Savings
Planning (read-heavy)	4	12:1	75%	4,160	1,301	69%
Code review (moderate)	3	8:3	60%	5,320	2,835	47%
High-churn (write-heavy)	4	8:4	50%	3,250	2,317	29%

For protocol-only simulation methodology, see REPRODUCE.md.

Benchmark your own workload

pip install "agent-coherence[langgraph,benchmark]"
ccs-benchmark --graph path/to/your_graph.py:build_graph

The factory must accept a single store argument and return a compiled LangGraph graph (builder.compile(store=store)). The CLI runs the graph once and prints a token savings summary. Use --initial-state '{"key": "value"}' to pass a custom input dict.

Architecture

Protocol (ccs.core, ccs.strategies) — coherence state machine and synchronization strategies; no framework dependencies.
Coordinator (ccs.coordinator) — authority service tracking directory state, publishing invalidations, and reclaiming stale grants (crash recovery).
Adapters (ccs.adapters) — framework integrations for LangGraph, CrewAI, and AutoGen; ~100 lines each. Each adapter exposes heartbeat() and recover() for crash-recovery liveness.
Simulation (ccs.simulation) — deterministic tick-driven engine for scenario benchmarks with failure injection (kill, busy, restore).
Event bus (ccs.bus) — pluggable transport for invalidation signals; in-memory by default, swap in Redis, Kafka, NATS, or gRPC streams for production.

Formal verification

Protocol safety properties (single-writer, monotonic versioning, crash-recovery sweep invariants) are model-checked with TLA+/TLC. The tla-check CI job runs TLC on every push and PR.

Status

v0.6 released. See releases for full history. Alpha — APIs may change before v1.0.

What's new in v0.6 — crash recovery for stale grants. When an agent crashes (OOM-kill, segfault) or livelocks, its MODIFIED or EXCLUSIVE grant blocks every other agent from writing the same artifact. v0.6 reclaims those grants automatically through three pieces: heartbeats piggybacked on every read/write, an enforce_stable_grant_timeouts sweep on the coordinator, and a recover() primitive on every adapter for post-restart cache invalidation. Two reclaim triggers, reclaim_heartbeat (holder went silent) and reclaim_max_hold (held too long regardless of liveness), surface in the state log so production incidents leave a trail. Composition is fail-fast: combining the lease strategy with crash recovery requires max_hold_ticks > lease_ttl_ticks, otherwise it raises at startup. The feature is behind a flag for now (CrashRecoveryConfig(enabled=False) is the default); flipping the default is planned as the next deliberate release, after dogfood validation. CCSStore and every framework adapter (LangGraph, CrewAI, AutoGen) accept crash_recovery=CrashRecoveryConfig(...) and expose heartbeat() / recover().
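The fail-fast composition rule can be expressed as a small startup check. This is a sketch of the stated constraint only; check_composition is a hypothetical helper, and the real check inside the library may use a different exception type or message:

```python
# Hypothetical validation helper; the real check lives inside the library and
# may use a different exception type or message.

def check_composition(strategy: str, crash_recovery_enabled: bool,
                      max_hold_ticks: int, lease_ttl_ticks: int) -> None:
    """Raise unless max_hold_ticks > lease_ttl_ticks for lease + crash recovery."""
    if strategy != "lease" or not crash_recovery_enabled:
        return                               # the rule only applies to this pairing
    if max_hold_ticks <= lease_ttl_ticks:
        raise ValueError(
            f"max_hold_ticks ({max_hold_ticks}) must exceed "
            f"lease_ttl_ticks ({lease_ttl_ticks})"
        )


check_composition("lease", True, max_hold_ticks=1000, lease_ttl_ticks=100)   # passes
```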

v0.5 — per-agent content audit log. Opt-in content_audit_log=callback records every content delivery (cache hit, fetch, broadcast, write, search) with SHA-256 hashes, gap-free sequence numbers, and instance_id cross-validated against the state log. Pairs with v0.4's state_log to give debuggers a complete picture: state transitions × content delivered.

v0.4 — sequence-numbered event stream. sequence_number, instance_id, schema_version on every state-log entry. ccs.validation.validate_log helper for gap and schema-drift detection.
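Two ideas from the v0.5 and v0.4 notes, content hashes and gap-free sequence numbers, can be illustrated with the standard library. This is a sketch, not ccs.validation.validate_log: the record shape, the canonical-JSON hashing, and the find_gaps helper are assumptions; only the field names sequence_number and instance_id come from this page.

```python
# Hypothetical record shape and checker; not ccs.validation.validate_log.
import hashlib
import json


def content_hash(value: object) -> str:
    """SHA-256 over a canonical JSON encoding of the delivered content."""
    encoded = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def find_gaps(entries: list[dict]) -> list[int]:
    """Return the sequence numbers missing between consecutive entries."""
    missing = []
    for previous, current in zip(entries, entries[1:]):
        missing.extend(range(previous["sequence_number"] + 1, current["sequence_number"]))
    return missing


log = [
    {"sequence_number": 1, "instance_id": "run-a", "sha256": content_hash({"step": 1})},
    {"sequence_number": 2, "instance_id": "run-a", "sha256": content_hash({"step": 1})},
    {"sequence_number": 5, "instance_id": "run-a", "sha256": content_hash({"step": 2})},
]
print(find_gaps(log))                          # [3, 4]
print(log[0]["sha256"] == log[1]["sha256"])    # True: same content delivered twice
```

Matching hashes show that two deliveries carried identical content, and a non-empty gap list shows that entries are missing from the stream.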

v0.3 — state transitions log + reproducible benchmark harness. Opt-in JSONL stream of every stable MESI state transition. make benchmark harness with committed baseline (benchmarks/expected.json).

v0.2 — inline benchmark + telemetry + degradation visibility. benchmark=True, print_benchmark_summary(), CoherenceDegradedWarning, OTel and LangSmith adapters, graceful degradation via on_error="degrade".

v0.1 — initial release. MESI-style cache coherence for shared artifacts in multi-agent LLM systems.

Paper

Token Coherence: Adapting MESI Cache Protocols to Minimize Synchronization Overhead in Multi-Agent LLM Systems. arXiv:2603.15183

BibTeX

@article{parakhin2026token,
  title   = {Token Coherence: Adapting MESI Cache Protocols to Minimize
             Synchronization Overhead in Multi-Agent LLM Systems},
  author  = {Parakhin, Vladyslav},
  journal = {arXiv preprint arXiv:2603.15183},
  year    = {2026}
}

Debugging multi-agent failures often comes down to which agent saw what state when. CCSStore(content_audit_log=my_callback) records every content delivery — cache hits, fetches, broadcasts, writes, and searches — with SHA-256 hashes and gap-free sequence numbers. The state log tracks MESI transitions; the audit log tracks what content each agent actually saw. If you've hit a stale-read bug in a multi-agent workflow, I'd like to hear about it — open an issue.

Community

Questions, war stories, and ideas welcome in Discussions.

License

Apache-2.0. See LICENSE.

Release history

Version	Release date
0.9.3	Jun 15, 2026
0.9.2	Jun 11, 2026
0.9.1	Jun 10, 2026
0.9.0	Jun 7, 2026
0.8.4.3	Jun 6, 2026
0.8.4.2	Jun 6, 2026
0.8.4.1	Jun 5, 2026
0.8.4	Jun 2, 2026
0.8.3	May 29, 2026
0.8.2	May 28, 2026
0.8.1	May 27, 2026
0.8.0	May 23, 2026
0.8.0a1 (pre-release)	May 18, 2026
0.7.1	May 13, 2026
0.7.0	May 11, 2026
0.6.0 (this version)	May 9, 2026
0.5.0	May 7, 2026
0.4.1	May 6, 2026
0.3.0	May 5, 2026
0.2.0	Apr 26, 2026
0.1.0	Mar 10, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agent_coherence-0.6.0.tar.gz (97.5 kB)

Uploaded May 9, 2026 Source

Built Distribution

agent_coherence-0.6.0-py3-none-any.whl (76.0 kB)

Uploaded May 9, 2026 Python 3

File details

Details for the file agent_coherence-0.6.0.tar.gz.

File metadata

Download URL: agent_coherence-0.6.0.tar.gz
Upload date: May 9, 2026
Size: 97.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agent_coherence-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`e5c074e484d025fdee8cf53511b5ac02881f6161adf66b938dbc96f7af53ac39`
MD5	`1add109502a082f0078c64609a220cdf`
BLAKE2b-256	`360db569cf1674afe9980dc93a2a364a0ac9663711ef145f06bceb44b87d65cf`

See more details on using hashes here.
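A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a generic sketch, not a PyPI or pip API; sha256_of_file and matches_published_digest are illustrative names:

```python
# Generic integrity check with the standard library; not a PyPI or pip API.
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling matches_published_digest(Path("agent_coherence-0.6.0.tar.gz"), "<SHA256 from the table>") on the downloaded archive returns True only when the file is byte-identical to the published one.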

Provenance

The following attestation bundles were made for agent_coherence-0.6.0.tar.gz:

Publisher: publish.yml on hipvlady/agent-coherence

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_coherence-0.6.0.tar.gz
- Subject digest: e5c074e484d025fdee8cf53511b5ac02881f6161adf66b938dbc96f7af53ac39
- Sigstore transparency entry: 1486266010
- Sigstore integration time: May 9, 2026
Source repository:
- Permalink: hipvlady/agent-coherence@76e26993fb3b89e4ced976a54763f4066bc365e0
- Branch / Tag: refs/tags/v0.6.0
- Owner: https://github.com/hipvlady
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@76e26993fb3b89e4ced976a54763f4066bc365e0
- Trigger Event: push

File details

Details for the file agent_coherence-0.6.0-py3-none-any.whl.

File metadata

Download URL: agent_coherence-0.6.0-py3-none-any.whl
Upload date: May 9, 2026
Size: 76.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agent_coherence-0.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d983c26725bbc0160035aab804496fc93a518c546e7847eeb5e7ea1b86755c13`
MD5	`77f8f97de35991ad817c4940f02ed498`
BLAKE2b-256	`d00ed3c8a33ee083e8206f203c9beb81d677714613597fe7054c66ed2e0417e8`

See more details on using hashes here.

Provenance

The following attestation bundles were made for agent_coherence-0.6.0-py3-none-any.whl:

Publisher: publish.yml on hipvlady/agent-coherence

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_coherence-0.6.0-py3-none-any.whl
- Subject digest: d983c26725bbc0160035aab804496fc93a518c546e7847eeb5e7ea1b86755c13
- Sigstore transparency entry: 1486266056
- Sigstore integration time: May 9, 2026
Source repository:
- Permalink: hipvlady/agent-coherence@76e26993fb3b89e4ced976a54763f4066bc365e0
- Branch / Tag: refs/tags/v0.6.0
- Owner: https://github.com/hipvlady
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@76e26993fb3b89e4ced976a54763f4066bc365e0
- Trigger Event: push

agent-coherence 0.6.0
