Local-first reliability ledger for AI coding-agent work
Chimera Memory
Chimera Memory records what an agent tried, which command checked it, what happened, and what receipt proves it. It runs entirely on your machine.
What it records
Each wrapped verification command produces a claim with:
- session_id: which work session it belongs to
- agent_id / model_version / harness_id: who ran it and in what tool
- task_type: what kind of work (test, lint, type, docs, …)
- VALIDATED or CONTRADICTED: did reality agree with the prediction?
- stdout_excerpt / stderr_excerpt: bounded output witness when available
- git state at time of claim
Sessions, claims, and outcomes are stored in .chimera-memory/ as append-only JSONL files. Each new claim gets an integrity chain entry in .chimera-memory/integrity.jsonl.
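To make the append-only JSONL idea concrete, here is a minimal sketch of writing and reading claims one JSON object per line. It is not the package's actual storage code: the file name `claims.jsonl`, the `outcome` key, and the helper names `append_claim` and `read_claims` are assumptions made for illustration, and the demo writes to a temporary directory rather than a real store.

```python
import json
import tempfile
from pathlib import Path


def append_claim(store_dir: Path, claim: dict) -> None:
    """Append one claim as a single JSON line; earlier lines are never rewritten."""
    store_dir.mkdir(parents=True, exist_ok=True)
    # "claims.jsonl" is a hypothetical file name used only for this sketch.
    with (store_dir / "claims.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(claim, sort_keys=True) + "\n")


def read_claims(store_dir: Path) -> list[dict]:
    """Read every claim back in the order it was appended."""
    with (store_dir / "claims.jsonl").open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo against a throwaway directory, not a real .chimera-memory/ store.
demo_store = Path(tempfile.mkdtemp()) / ".chimera-memory"
append_claim(demo_store, {
    "session_id": "s-001", "agent_id": "kiro",
    "task_type": "type", "outcome": "CONTRADICTED",
})
append_claim(demo_store, {
    "session_id": "s-001", "agent_id": "kiro",
    "task_type": "type", "outcome": "VALIDATED",
})
claims = read_claims(demo_store)
```

Opening the file in append mode means a correction is recorded as a new line after the original, which is what lets a later receipt show both the failing and the passing run.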
Getting started
See docs/strategy/chimera-memory-first-run-quickstart.md for a step-by-step guide.
Key limitations (v0.6):
- M2B drift scoring: not built
- Model ranking or routing: not built
- Hosted/cloud sync: not built
- Evidence write import: dry-run only
- Windows: not tested
Run from the repo root:
uv run chimera-memory --help
Standalone local install (v0.6+)
As of v0.6, local wheel builds work outside the monorepo. Build and install:
# From the repo root, build local wheels for the two public packages
uv build packages/chimera-memory-types --out-dir /tmp/cm-dist
uv build packages/chimera-memory --out-dir /tmp/cm-dist
# In any Python 3.12+ environment
pip install /tmp/cm-dist/*.whl
chimera-memory --help
Runtime dependencies installed automatically: pydantic, filelock.
Note: Public PyPI publishing has not happened yet. This is local packaging readiness only. Hosted/cloud sync, team SaaS, and remote substrate writes are not implemented. Reliability is not model routing. M2B is not implemented.
Quickstart
# 1. Start a session
uv run chimera-memory session start \
--branch feat/my-branch \
--task-label "fix-type-errors" \
--agent kiro \
--model claude-sonnet-4.6 \
--harness-id kiro-cli
# 2. Wrap real verification commands
uv run chimera-memory wrap --task-type type -- \
uv run mypy packages/chimera-memory/src --ignore-missing-imports
uv run chimera-memory wrap --task-type test -- \
uv run pytest packages/chimera-memory/tests -q
uv run chimera-memory wrap --task-type lint -- \
uv run ruff check packages/chimera-memory
# 3. End the session
uv run chimera-memory session end --status PASSED
# 4. Review
uv run chimera-memory failures # see what failed, with witness output
uv run chimera-memory status # dogfood gate progress
uv run chimera-memory receipt latest --markdown # share-ready proof artifact
The -- separator is required before non-pytest commands to tell the argument parser where the wrapped command begins.
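The role of the separator can be illustrated with a small sketch that splits an argument list at the first `--`. This is not the CLI's real parser; `split_wrapped_command` is a hypothetical helper, and the real tool may handle the split differently.

```python
def split_wrapped_command(argv: list[str]) -> tuple[list[str], list[str]]:
    """Split argv at the first "--" into (wrapper options, wrapped command)."""
    if "--" not in argv:
        return argv, []
    idx = argv.index("--")
    return argv[:idx], argv[idx + 1:]


wrapper_args, wrapped = split_wrapped_command([
    "--task-type", "type", "--",
    "uv", "run", "mypy", "packages/chimera-memory/src", "--ignore-missing-imports",
])
```

Everything after the separator is passed through untouched, so a flag such as `--ignore-missing-imports` reaches mypy instead of being interpreted as a wrapper option.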
Common commands
chimera-memory session start --branch ... --task-label ... --agent ... --model ... --harness-id ...
chimera-memory wrap --task-type <type> [-- <command>]
chimera-memory session end --status PASSED|FAILED|MIXED|INTERRUPTED
chimera-memory failures [--json]
chimera-memory status [--json]
chimera-memory verify [--json]
chimera-memory receipt latest [--json] [--markdown]
chimera-memory receipt show <session_id> [--json] [--markdown]
chimera-memory export --clean-only --output <path>
chimera-memory session list
chimera-memory report # raw reliability groups (use status for gate progress)
Demo: real failure-fix loop
During development, mypy found a real type error:
chimera_memory/cli.py:164: error: Value of type "object" is not indexable [index]
Chimera Memory recorded the mypy run as CONTRADICTED, with the error stored as stdout_excerpt. The code was fixed. The same mypy command ran again and settled as VALIDATED. A single session receipt showed both runs.
uv run mypy ... [type] → CONTRADICTED
uv run mypy ... [type] → VALIDATED
This is the core loop: reality contradicted the agent, the fix was applied, and the correction was verified — all in one session, with attribution.
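The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the package's implementation: it assumes the prediction is "the command passes", so exit code 0 settles as VALIDATED and anything else as CONTRADICTED. The helper `run_and_settle`, the `outcome` key, and the `EXCERPT_LIMIT` value are all hypothetical.

```python
import subprocess
import sys

EXCERPT_LIMIT = 2000  # hypothetical bound on stored witness output, in characters


def run_and_settle(command: list[str], task_type: str) -> dict:
    """Run a verification command and settle it from its exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "task_type": task_type,
        "command": command,
        # Assumption: the claim predicts success, so a non-zero exit contradicts it.
        "outcome": "VALIDATED" if result.returncode == 0 else "CONTRADICTED",
        "stdout_excerpt": result.stdout[:EXCERPT_LIMIT],
        "stderr_excerpt": result.stderr[:EXCERPT_LIMIT],
    }


# Stand-ins for the failing and the fixed mypy run, using the current interpreter.
failing = run_and_settle(
    [sys.executable, "-c", "print('error: value is not indexable'); raise SystemExit(1)"],
    "type",
)
passing = run_and_settle([sys.executable, "-c", "print('Success')"], "type")
```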
Reliability command
chimera-memory reliability
chimera-memory reliability --json
Read-only. Reports raw validation rates from settled clean claims, grouped by agent/model/task. Does not rank models, route work, or make autonomy decisions. Failure-quality classification (organic vs synthetic) is not yet stored in claim metadata — rates include all CONTRADICTED outcomes.
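As a rough sketch of what "raw validation rates grouped by agent/model/task" means, the function below counts VALIDATED outcomes per group and divides by the group total. The `outcome` key, the function name `validation_rates`, and the sample claims are assumptions for illustration; the sample data is invented and is not taken from any real store.

```python
from collections import defaultdict


def validation_rates(claims: list[dict]) -> dict[tuple[str, str, str], float]:
    """Return VALIDATED / total for each (agent, model, task type) group."""
    counts = defaultdict(lambda: [0, 0])  # group key -> [validated, total]
    for claim in claims:
        key = (claim["agent_id"], claim["model_version"], claim["task_type"])
        counts[key][1] += 1
        if claim["outcome"] == "VALIDATED":
            counts[key][0] += 1
    return {key: validated / total for key, (validated, total) in counts.items()}


# Invented sample claims, for illustration only.
sample_claims = [
    {"agent_id": "kiro", "model_version": "claude-sonnet-4.6", "task_type": "type", "outcome": "CONTRADICTED"},
    {"agent_id": "kiro", "model_version": "claude-sonnet-4.6", "task_type": "type", "outcome": "VALIDATED"},
    {"agent_id": "kiro", "model_version": "claude-sonnet-4.6", "task_type": "type", "outcome": "VALIDATED"},
    {"agent_id": "kiro", "model_version": "claude-sonnet-4.6", "task_type": "lint", "outcome": "VALIDATED"},
]
rates = validation_rates(sample_claims)
```

The result is a plain ratio per group. Nothing in it orders the groups or picks a winner, which matches the read-only, non-ranking scope described above.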
Bridge pipeline (dry-run only)
Export clean evidence events and preview engine ingestion without any writes:
# Export settled, attributed claims as JSONL
uv run chimera-memory export --clean-only --output /tmp/cm-clean-events.jsonl
# Validate the export schema
uv run python -m tools.chimera_memory_ingest_dry_run /tmp/cm-clean-events.jsonl
# Map to engine evidence candidates (no database/substrate writes)
uv run python -m tools.chimera_memory_engine_adapter_dry_run /tmp/cm-clean-events.jsonl
Both bridge tools are dry-run only. writes_performed is always false.
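A dry-run validator of this kind can be sketched as a function that parses each exported line, checks for required fields, and reports results without writing anything. The real export schema is not documented here, so `REQUIRED_FIELDS`, `dry_run_validate`, and the shape of the report (apart from `writes_performed`) are assumptions.

```python
import json

# Hypothetical required fields; the real export schema may differ.
REQUIRED_FIELDS = ("session_id", "agent_id", "task_type", "outcome")


def dry_run_validate(lines: list[str]) -> dict:
    """Check each JSONL line against REQUIRED_FIELDS without performing any writes."""
    valid, errors = 0, []
    for number, line in enumerate(lines, start=1):
        try:
            event = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(event, dict):
            errors.append(f"line {number}: expected a JSON object")
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in event]
        if missing:
            errors.append(f"line {number}: missing {', '.join(missing)}")
        else:
            valid += 1
    return {"valid": valid, "errors": errors, "writes_performed": False}


report = dry_run_validate([
    '{"session_id": "s-001", "agent_id": "kiro", "task_type": "lint", "outcome": "VALIDATED"}',
    '{"session_id": "s-001"}',
    "not json",
])
```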
Two-model evidence (local v0.2)
As of v0.2, the store contains evidence from two real AI coding agents:
| Agent | Model | Claims | Real failures |
|---|---|---|---|
| kiro | claude-sonnet-4.6 | 123 | 2 organic |
| codebuff | mimo-v2.5-pro | 12 | 2 real |
manual and planning-agent entries also exist for workflow/planning tasks.
M2 comparative reliability scoring is not yet built. The data is structurally ready for it once codebuff accumulates ≥25 claims with ≥5 organic failures.
Integrity
New claims are hash-chained into .chimera-memory/integrity.jsonl. Run:
uv run chimera-memory verify
uv run chimera-memory verify --json
Historical records created before the integrity layer was added are reported
as LEGACY_UNSIGNED. This is honest — they are not broken, just unchained.
verify reports BROKEN only if a chained record's hash doesn't match or a
claim was appended without a corresponding integrity entry.
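The verification rules above can be sketched with a simple SHA-256 hash chain. This is an illustration, not the package's actual integrity format: the `claim_id` field, the `GENESIS` seed, the `OK` status, and the rule "a claim with no entry is legacy only if it precedes the first chained claim" are all assumptions made so the example is self-contained.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical seed for the first chain entry


def entry_hash(prev_hash: str, claim: dict) -> str:
    """Hash a claim together with the previous entry's hash."""
    payload = prev_hash + json.dumps(claim, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(claims: list[dict]) -> list[dict]:
    """Create one integrity entry per claim, each linked to the one before it."""
    chain, prev = [], GENESIS
    for claim in claims:
        prev = entry_hash(prev, claim)
        chain.append({"claim_id": claim["claim_id"], "hash": prev})
    return chain


def verify_chain(claims: list[dict], chain: list[dict]) -> dict[str, str]:
    """Classify each claim as OK, LEGACY_UNSIGNED, or BROKEN."""
    recorded = {entry["claim_id"]: entry["hash"] for entry in chain}
    statuses, prev, chain_started = {}, GENESIS, False
    for claim in claims:
        claim_id = claim["claim_id"]
        stored = recorded.get(claim_id)
        if stored is None:
            # Assumption: unchained claims are legacy only before the chain begins.
            statuses[claim_id] = "BROKEN" if chain_started else "LEGACY_UNSIGNED"
            continue
        chain_started = True
        statuses[claim_id] = "OK" if stored == entry_hash(prev, claim) else "BROKEN"
        prev = stored
    return statuses


legacy = {"claim_id": "c0", "outcome": "VALIDATED"}
first = {"claim_id": "c1", "outcome": "CONTRADICTED"}
second = {"claim_id": "c2", "outcome": "VALIDATED"}
chain = build_chain([first, second])

clean = verify_chain([legacy, first, second], chain)
tampered = verify_chain([legacy, {**first, "outcome": "VALIDATED"}, second], chain)
unchained = verify_chain([legacy, first, second, {"claim_id": "c3"}], chain)
```

In this sketch, editing a chained claim after the fact changes its expected hash, and appending a claim without an entry leaves a gap, so both cases surface as BROKEN while the pre-chain record stays LEGACY_UNSIGNED.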
Single-process assumption: the store is designed for single-process local use. Truly concurrent writes from separate processes can race; this is not hardened against. Running CLI commands one at a time, as in the quickstart, stays within this assumption.
Known invocation gotcha
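When wrapping mypy (task type type), pass only the adapter tool (tools/chimera_memory_engine_adapter_dry_run.py) and not the importer (tools/chimera_memory_ingest_dry_run.py) alongside it. The adapter already imports the importer, so listing both triggers mypy's "source file found twice" error: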
# Correct — adapter transitively pulls in importer
uv run chimera-memory wrap --task-type type -- \
uv run mypy packages/chimera-memory/src \
tools/chimera_memory_engine_adapter_dry_run.py \
--ignore-missing-imports
# Incorrect — causes "source file found twice" mypy error
uv run chimera-memory wrap --task-type type -- \
uv run mypy packages/chimera-memory/src \
tools/chimera_memory_ingest_dry_run.py \
tools/chimera_memory_engine_adapter_dry_run.py \
--ignore-missing-imports
Release status
- Local wheel proof exists. The 2-package install (chimera-memory + chimera-memory-types) works in a fresh venv outside the monorepo.
- Public PyPI: not published. TestPyPI and private registry publishing have not been done.
- License decision pending. No open-source license has been formally assigned yet.
- Witness output is redacted by default. chimera-memory wrap redacts secrets (API keys, tokens, passwords, private keys) from captured stdout/stderr before storing. Review receipts before sharing.
- Command argument redaction. The command string in receipts is also redacted for common secret patterns. However, do not pass literal secret values as command arguments; use environment variables instead (e.g. MY_TOKEN=secret uv run mypy ...).
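To show why pattern-based redaction helps but still warrants a manual review, here is a minimal sketch of regex redaction over captured output. The patterns and the `redact` helper are assumptions for illustration and are not the patterns chimera-memory actually uses.

```python
import re

# Hypothetical patterns; real secret formats are more varied than this.
KEY_VALUE = re.compile(
    r"([A-Za-z0-9_]*(?:api[_-]?key|token|password|secret))(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)
PRIVATE_KEY = re.compile(
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
    re.DOTALL,
)


def redact(text: str) -> str:
    """Replace values that look like secrets with a placeholder."""
    text = PRIVATE_KEY.sub("[REDACTED PRIVATE KEY]", text)
    return KEY_VALUE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)


redacted_env = redact("MY_TOKEN=secret uv run mypy")
redacted_log = redact("password: hunter2")
untouched = redact("all checks passed")
```

A secret that does not follow a recognizable `name=value` shape would pass through a matcher like this unchanged, which is the reason to read a receipt before sharing it.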
Platform support
- macOS and Linux are the target platforms. Tested on macOS arm64.
- Windows is not supported. Write-path locking uses filelock (cross-platform), but Windows has not been tested end-to-end.
- Linux Docker smoke test is pending (Docker unavailable in current environment).
What is not built
The following are explicitly out of scope for v0.6:
- M2B statistical drift/trend analysis (advisory heuristic only)
- Routing, autonomy decisions, or model ranking
- Cloud/hosted sync or team sharing
- Dashboard, GitHub Action, or CI PR comments
- ORIAS, trading/finance verticals
- GraphSource/substrate writes
Current limitations
- M2 drift/model comparison is not fully built. An advisory drift heuristic exists (chimera-memory drift), but statistical M2B drift/trend analysis is not built. status shows segment counts, not trends.
- Useful reliability patterns require real failure variance. A store of only VALIDATED claims proves capture works, not that any agent is reliable.
- The workflow is CLI/manual, not always-on. You must start sessions and wrap commands explicitly.
- Local-first. Nothing leaves your machine. Data lives in .chimera-memory/ inside the repo.
- Synthetic failures should not be treated as product evidence. Only naturally occurring failures count.
- Single-process writes only. The integrity chain is not safe for concurrent multi-process writes.
More detail
See the full demo quickstart with the failure-fix loop walkthrough:
../../docs/strategy/chimera-memory-demo-quickstart-2026-06-04.md