Agent memory with receipts: an MCP server over an append-only, hash-chained ledger for task state, memory, and verified handoffs across sessions and models.

Continuity

Continuity is the system of record for agent work: an agent continuity layer that captures task state, memory, decisions, provenance, and handoff context across models, sessions, and tools.

Current product premise: Continuity is not "better memory than a Markdown file" for simple, sequential work. A strong, maintained HANDOFF.md matched Continuity in the Stage A product-falsification benchmark. Continuity's sharper claim is that it is a trust layer for AI work: it turns messy agent history into resumable, inspectable, permissioned operational state, and can compile that state into a compact human-readable handoff when a plain file is the simplest interface.

The event log is the product. Task context, project memory, agent memory, workflow state, gates, timeline rows, and the Console are projections of one append-only, hash-chained ledger.

Every event can carry optional payload-level provenance and usage telemetry: task/session identity, agent/model identity, parent and consumed ledger sequence numbers, handoff source, token counts, usage source, and exact microdollar cost. This metadata stays in the payload layer so the immutable chain remains stable while provenance evolves.
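
The append-only, hash-chained structure can be illustrated with a minimal sketch. It is not Continuity's actual schema or hash format: the field names (`seq`, `prev_hash`, `event_hash`), the `GENESIS` placeholder, the event type names, and the canonical-JSON-plus-SHA-256 encoding are all assumptions chosen for illustration. It follows one reading of the paragraph above: provenance travels as optional keys inside the payload, so the chain envelope keeps the same shape as provenance fields evolve.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical prev_hash for the first event


def event_hash(seq, event_type, payload, prev_hash):
    """Hash the chained fields using a canonical JSON encoding (assumed format)."""
    body = json.dumps(
        {"seq": seq, "type": event_type, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger, event_type, payload):
    """Append one event; existing rows are never modified."""
    prev_hash = ledger[-1]["event_hash"] if ledger else GENESIS
    seq = len(ledger) + 1
    row = {
        "seq": seq,
        "type": event_type,
        "payload": payload,
        "prev_hash": prev_hash,
        "event_hash": event_hash(seq, event_type, payload, prev_hash),
    }
    ledger.append(row)
    return row


def first_broken_seq(ledger):
    """Return the seq of the first row that fails verification, or None."""
    prev_hash = GENESIS
    for row in ledger:
        expected = event_hash(row["seq"], row["type"], row["payload"], prev_hash)
        if row["prev_hash"] != prev_hash or row["event_hash"] != expected:
            return row["seq"]
        prev_hash = row["event_hash"]
    return None


ledger = []
append(ledger, "TASK_STARTED", {"summary": "triage review"})  # hypothetical event type
append(
    ledger,
    "TASK_UPDATED",  # hypothetical event type
    {"summary": "fix applied", "provenance": {"model": "example-model", "consumed_seqs": [1]}},
)
print(first_broken_seq(ledger))  # None while the chain is intact
```

Editing any stored row after the fact changes its recomputed hash, so `first_broken_seq` reports that row's sequence number. As the non-claims later in this document note, a chain of this kind shows order and immutability, not authorship.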

The primary continuation path is structured context compilation: a task-chain projection that returns current task state, ordered chronology, decisions, unresolved conflict signals, and provenance chain. It is available through the Python helper, FastAPI at /projects/{project_id}/tasks/{task_id}/continuation, and MCP as compile_task_context.
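
A minimal sketch of calling the continuation route over HTTP, using only the standard library. The route comes from the paragraph above and the base address from `make serve`; the GET method, the JSON response body, and the absence of authentication are assumptions, and `continuation_url` and `fetch_continuation` are hypothetical helper names rather than part of the package.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:8000"  # address served by `make serve`


def continuation_url(project_id, task_id, base_url=BASE_URL):
    """Build the continuation route, percent-encoding both path segments."""
    return (
        f"{base_url.rstrip('/')}/projects/{quote(project_id, safe='')}"
        f"/tasks/{quote(task_id, safe='')}/continuation"
    )


def fetch_continuation(project_id, task_id, base_url=BASE_URL, timeout=10.0):
    """GET the compiled task context and decode it as JSON (assumed response format)."""
    url = continuation_url(project_id, task_id, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(continuation_url("your-project", "your-task"))
```

`fetch_continuation` needs a running server, so the usage line above only prints the URL that would be requested.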

Operational conflicts are first-class ledger events. When two agents produce contradictory task assessments, Continuity records an OPERATIONAL_CONFLICT that links the exact event sequences in disagreement. The conflict remains in compiled context, FastAPI at /projects/{project_id}/tasks/{task_id}/conflicts, and MCP until a human records an OPERATIONAL_CONFLICT_RESOLVED event.
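
The "conflict stays visible until a human resolves it" rule can be sketched as a projection over ledger events. The two event type names come from the paragraph above; the payload keys `conflicting_seqs` and `conflict_seq`, the `ASSESSMENT` event type, and the function name are hypothetical and only illustrate the shape of the fold.

```python
def unresolved_conflicts(events):
    """Fold ledger events into the conflicts that still await a human resolution."""
    open_conflicts = {}
    for event in sorted(events, key=lambda e: e["seq"]):
        payload = event["payload"]
        if event["type"] == "OPERATIONAL_CONFLICT":
            # Record which event sequences disagree (hypothetical payload key).
            open_conflicts[event["seq"]] = {
                "conflict_seq": event["seq"],
                "conflicting_seqs": list(payload["conflicting_seqs"]),
            }
        elif event["type"] == "OPERATIONAL_CONFLICT_RESOLVED":
            # A resolution points back at the conflict it closes (hypothetical key).
            open_conflicts.pop(payload["conflict_seq"], None)
    return list(open_conflicts.values())


events = [
    {"seq": 1, "type": "ASSESSMENT", "payload": {"verdict": "ship"}},
    {"seq": 2, "type": "ASSESSMENT", "payload": {"verdict": "block"}},
    {"seq": 3, "type": "OPERATIONAL_CONFLICT", "payload": {"conflicting_seqs": [1, 2]}},
]
print(unresolved_conflicts(events))
```

Because the projection is recomputed from the ledger, the conflict drops out of the result only once a matching resolution event has been appended; nothing is deleted or edited in place.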

Product Falsification Checkpoint

The Phase 6 entry gate compared three continuation paths: no shared context, a strong manually maintained HANDOFF.md, and Continuity via MCP. The 30-run local matrix produced no-context 0/10, strong HANDOFF.md 10/10, and Continuity 10/10, so the result was a tie and durable runner work remains blocked. See docs/testing/2026-06-24-product-falsification-results.md.

The next product-validation target is not another runner feature. It is a Continuity-backed handoff workflow: use the ledger, memory, provenance, validation, and conflict model to generate or verify a compact HANDOFF.md, then test whether that beats a manual handoff under concurrency, stale-context recovery, permissioned context, or audit requirements.

The follow-up IDEA-002 benchmark tested that narrower claim:

CONTINUITY_BENCHMARK_LOCAL_MODEL=qwen3.5:9b make trust-layer-handoff-benchmark

The completed 27-run local matrix produced no-context 0/9, manual handoff 6/9, and Continuity-backed verified handoff 9/9 under the original scoring. That scoring could not credit the manual arm in provenance-audit scenarios even when it honestly reported unverified; scoring was revised on 2026-07-02 to score provenance honesty equally and track ledger-backed verification as a separate capability. The honest claim: a well-maintained manual handoff preserves task facts; Continuity uniquely provides ledger-backed source verification, which a manual handoff is structurally unable to provide. See docs/testing/2026-06-24-trust-layer-handoff-results.md.

Quick Start

As a user (installs the continuity-mcp MCP server command and continuity-handoff receipt exporter):

pip install "git+https://github.com/machinedigital-ai/Continuity.git"
claude mcp add continuity --env CONTINUITY_DB="$HOME/continuity.db" -- continuity-mcp

As a developer (from a checkout):

make setup
make test
make serve

With make serve running, open the read-only Console:

http://127.0.0.1:8000/console

New here? The tutorial walks through the full loop in ~15 minutes: connect an agent, record work, resume in a fresh session or a different model, export the verified receipt, and read the receipt fields.

Common Commands

make setup   # create/update continuity-core/.venv and install requirements
make test    # run the test suite
make serve   # run FastAPI at http://127.0.0.1:8000
make mcp     # run the MCP server
make demo    # run the recursive proof demo
make proof   # export and verify the multi-model proof artifact
make ollama-chain # run the local multi-model Ollama chain proof
make codex-persistence-proof  # run two isolated Codex sessions through Continuity
make claude-persistence-proof # run two isolated Claude sessions through Continuity
make cross-agent-persistence-proof # run the Codex-to-Claude handoff proof
make trust-layer-handoff-benchmark # run the IDEA-002 verified handoff benchmark
make list-verified-handoff-tasks   # list exportable task IDs from a ledger
make export-verified-handoff       # export verified HANDOFF.md from a task ledger

Proof Artifact

The completed Phase 4 Trust and Proof work exports a shareable Continuity proof artifact from real ledger events:

make proof

The command writes examples/multi_model_code_review.jsonl, a permissioned internal artifact showing a real multi-agent review sequence: external review, Codex triage, operational conflict, human resolution, gate approval, selected timeline, continuation context, provenance, and ledger integrity.

The artifact intentionally exports sanitized summary payloads. It retains original ledger hashes and verifies exported chain-entry metadata, but it is not yet a standalone public notary proof for omitted ledger events.

Local Model Proofs

The local Ollama chain proof tests continuity across installed local models:

make ollama-chain

The dated finding and corrected targeted rerun are recorded in docs/testing/2026-06-19-local-ollama-chain-proof.md. Across three corrected runs, Continuity achieved 18/18 context-fidelity assertions, 9/9 provenance assertions, 9/9 exact model outputs, and 3/3 valid ledgers using independent Qwen, Gemma, and Ministral model families.

Persistent Agent Proofs

The same-agent proof harness starts two fresh client processes connected only through a dedicated Continuity SQLite ledger. It creates a random challenge after Session A exits, gives Session B only stable project/task/agent IDs, then verifies the output, linked validation, completion turn, ledger integrity, and sanitized proof artifact from recorded events.

make codex-persistence-proof
make claude-persistence-proof
make cross-agent-persistence-proof

Codex runs ephemerally with native memories disabled. Claude runs without session persistence and with only the explicit Continuity MCP configuration. See the persistent-agent proof runbook for exact boundaries, authentication checks, and troubleshooting. The cross-agent mode stores its post-Codex challenge in shared project memory and requires Claude's output and completion turn to carry Codex handoff provenance.

Product Falsification Gate

Before Phase 6 runner work, Continuity is compared against no context and a strong structured HANDOFF.md. Stage A runs 30 fresh local targets:

CONTINUITY_BENCHMARK_LOCAL_MODEL=qwen3.5:9b make product-falsification-stage-a

All Stage A arms use the same direct Ollama transport. The harness reads the handoff or retrieves Continuity context through the real MCP stdio server before the fresh call, isolating context quality from client tool-use behavior.

Results are written to continuity-core/examples/product_falsification_results.jsonl and docs/testing/2026-06-24-product-falsification-results.md. Stage B runs with fresh Codex and Claude clients only when Stage A passes the pre-registered rules. A tie or loss keeps Phase 6 blocked and triggers simplification or repositioning; it is not treated as a Continuity win.

The completed Stage A result was a tie: no context 0/10, strong HANDOFF.md 10/10, and Continuity 10/10. Stage B was therefore skipped and Phase 6 runner work remains blocked. See the dated benchmark report and sanitized result rows.

The completed IDEA-002 trust-layer handoff benchmark validated the verified handoff wedge, not the durable runner: continuation quality was comparable to a maintained manual handoff, and only Continuity satisfied ledger-backed source verification (see the scoring revision note in the dated results doc).

The first productized verified handoff surface is now available:

Python: continuity.handoff.build_verified_handoff(store, project_id=..., task_id=...)
FastAPI: GET /projects/{project_id}/tasks/{task_id}/verified-handoff
MCP: export_verified_handoff(project_id, task_id)
Installed CLI:

continuity-handoff --db continuity-core/continuity.db --list-tasks

continuity-handoff \
  --db continuity-core/continuity.db \
  --project-id your-project \
  --task-id your-task \
  --out HANDOFF.md

The exporter renders human-readable Markdown with current task state, decisions, rejected/superseded signals, unresolved conflicts, and traceable ledger_seq / event_hash source rows. provenance_status is computed from a full ledger integrity check at export time: a valid chain renders verified with the tip hash and Merkle root; a tampered ledger renders integrity_failed with the broken sequence. The export leaves the ledger as the source of truth and does not unblock durable runner infrastructure.
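
How a status like this could be derived is sketched below. The status strings `verified` and `integrity_failed` and the `ledger_seq` / `event_hash` names come from the paragraph above; everything else is an assumption: the `prev_hash` field, the result keys `tip_hash`, `merkle_root`, and `broken_seq`, the caller-supplied `recompute_hash` function, and the Merkle construction (SHA-256 over concatenated pairs, duplicating the last node on odd levels). The real exporter may compute these values differently.

```python
import hashlib


def merkle_root(leaf_hashes):
    """Fold hex-encoded leaf hashes pairwise into a single root (assumed construction)."""
    if not leaf_hashes:
        return None
    level = [bytes.fromhex(h) for h in leaf_hashes]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def provenance_status(rows, recompute_hash):
    """Walk the whole chain; report the tip and root, or the first broken row."""
    prev = None
    for row in rows:
        link_broken = prev is not None and row["prev_hash"] != prev
        if link_broken or recompute_hash(row) != row["event_hash"]:
            return {"provenance_status": "integrity_failed", "broken_seq": row["ledger_seq"]}
        prev = row["event_hash"]
    return {
        "provenance_status": "verified",
        "tip_hash": prev,
        "merkle_root": merkle_root([row["event_hash"] for row in rows]),
    }
```

The check runs over every row rather than only the rows a handoff cites, which matches the "full ledger integrity check at export time" described above.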

Capability Backtest

The current evidence supports this scoped claim:

Continuity gives AI teams verified handoff, task verification, and persistent memory across agents, sessions, tools, and models.

The capability backtest in docs/testing/2026-07-01-continuity-capability-backtest.md distinguishes what Continuity already captures from what the verified handoff markdown currently renders:

Capability	Captured Today	Rendered In Verified Handoff Today
task state	yes	yes
agent identity	yes	partial
model identity	yes	no
session identity	yes	no
handoff source	yes	no
parent/consumed source chain	yes	no
model call/output source	yes	partial
validation source	yes	partial
memory source	yes	partial
tool/action source	partial	partial

The current gap is projection/rendering, not capture. Continuity should not add new capture logic until a test proves existing ledger data cannot answer the product question. The system does not claim endpoint observability, shadow-agent detection, or automatic capture of tools/actions outside Continuity.

Additional explicit non-claims:

The ledger proves event order and immutability, not authorship. Actor and agent identity are process/MCP/repo-local attribution, not cryptographic or authenticated identity.
consumed_seqs currently records the prior context available to an event, not a selective causal proof of what shaped it. Selective grounding exists only for memory (grounded_in_seqs) and linked validations.
Host support means MCP-compatible hosts. Claude Code and Codex paths are proven by isolated tests; other hosts are untested until a dated proof says otherwise.

Model Adapters

The core test suite does not call external model providers. Provider SDKs are optional and imported lazily by their adapters.

OpenAI/OpenAI-compatible endpoints use OPENAI_API_KEY and optional OPENAI_BASE_URL.
Anthropic uses ANTHROPIC_API_KEY.
Ollama uses local HTTP by default at http://localhost:11434, includes a request timeout, and raises AdapterError with contextual failures instead of hanging or leaking low-level urllib errors.
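
The Ollama behaviour described above (a request timeout plus contextual errors instead of raw urllib failures) could look roughly like the following sketch. It is not the project's adapter: the `AdapterError` class here is a local stand-in, `ollama_generate` and its `opener` parameter are hypothetical, and the 30-second default timeout is an assumption. The request targets Ollama's `/api/generate` endpoint with streaming disabled.

```python
import json
import socket
import urllib.error
import urllib.request


class AdapterError(RuntimeError):
    """Stand-in for the project's AdapterError; the real class may differ."""


def ollama_generate(model, prompt, base_url="http://localhost:11434",
                    timeout=30.0, opener=urllib.request.urlopen):
    """Send one non-streaming generate request and return the model's text."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))["response"]
    except urllib.error.HTTPError as exc:  # must precede URLError, its parent class
        raise AdapterError(f"Ollama returned HTTP {exc.code} for model {model!r}") from exc
    except (urllib.error.URLError, socket.timeout, TimeoutError) as exc:
        raise AdapterError(f"Ollama unreachable at {base_url} (model {model!r}): {exc}") from exc
    except (KeyError, ValueError) as exc:  # missing field or undecodable body
        raise AdapterError(f"Ollama returned an unexpected body for model {model!r}") from exc
```

The injectable `opener` keeps the sketch testable without a running Ollama instance, in the same spirit as a core test suite that does not call external model providers.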

Repository Layout

continuity-core/
  continuity/       # ledger, projections, API, MCP, Console, adapters
  scripts/          # demos and proof scripts
  tests/            # pytest suite
  README.md         # core package details

docs/
  README.md         # documentation authority map
  strategy/         # current strategic background; roadmap remains authoritative
  strategy/archive/ # older strategy/research source material
  superpowers/      # historical design specs and implementation plans
  testing/          # evidence notes from local and integration tests

examples/           # shareable proof artifacts generated from real ledger data

AGENTS.md
architecture_decisions.md
ROADMAP_AND_HANDOFF.md

Plans

Roadmap & Handoff is the canonical execution tracker and current Phase 0-7 sequence. Its Execution Checkpoint controls current implementation work.
Documentation Map explains which docs are authoritative, historical, or evidence-only.
Master Adversarial Review Prompt can be given to Claude/Kael, the GitHub agent, Codex, or another reviewer.
Idea Backlog captures ideas without promoting them into active execution.
Continuity Implementation Plan v3.4 preserves strategic background and the historical phase labels used during its adversarial review.

CI

GitHub Actions runs the Python test suite on pushes and pull requests to main. If Actions are disabled in repository settings, enable them once; no manual test initiation is otherwise required.

License

Apache-2.0. See LICENSE and NOTICE.

Notes

Runtime artifacts such as SQLite databases and exported ledgers are ignored under continuity-core/.
The known Starlette/FastAPI TestClient deprecation warning is third-party dependency churn and does not indicate a Continuity test failure.

Project details

Release history

This version

0.1.0

Jul 3, 2026

Download files

Source Distribution

continuity_mcp-0.1.0.tar.gz (48.9 kB)

Uploaded Jul 3, 2026 Source

Built Distribution

continuity_mcp-0.1.0-py3-none-any.whl (53.5 kB)

Uploaded Jul 3, 2026 Python 3

File details

Details for the file continuity_mcp-0.1.0.tar.gz.

File metadata

Download URL: continuity_mcp-0.1.0.tar.gz
Upload date: Jul 3, 2026
Size: 48.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for continuity_mcp-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`adccbc7a793802df41f01361b57787f1de26e8302d2b76f81f5a9b7baa2d9de4`
MD5	`734fdd516e88676638135e778a9940fc`
BLAKE2b-256	`e2a4ae5904f4e9635895242f7ab142db170ed0c829972cf6a35bf105b53b03c6`

File details

Details for the file continuity_mcp-0.1.0-py3-none-any.whl.

File metadata

Download URL: continuity_mcp-0.1.0-py3-none-any.whl
Upload date: Jul 3, 2026
Size: 53.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for continuity_mcp-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e80593d7819db4435caab7454f907c94fb947dd7a18200119f0ea927a93c687c`
MD5	`721b63c8653e301e3b5b24094af96502`
BLAKE2b-256	`d8470e65d631a8f44df18d762f531d7dfda09e5799a1de1411b545c0f474e477`
