Persistent working memory across agentic CLI sessions — CLI-agnostic MCP server for Claude Code/Desktop, Codex, Gemini, Copilot, VS Code.
thread-keeper
A local MCP server that holds persistent working memory across agentic CLI sessions — Claude Code, Claude Desktop, OpenAI Codex (CLI + desktop), Google Gemini, GitHub Copilot, and every MCP-aware VS Code extension share one SQLite store, one set of threads, one learning loop, one user model.
The brief format is dense — structural tags, opaque IDs, ~6 KB per session-start injection. Optimized for agent consumption, not human reading.
Why
Today every agent CLI starts cold. Context dies at session boundaries. Skills you taught Claude don't transfer to Codex. Threads you closed in yesterday's Gemini chat are invisible to today's Copilot.
thread-keeper is the substrate underneath:
- One memory store: threads, notes, verbatim quotes, dialectic claims about you. Survives session, restart, CLI swap.
- One learning loop (hermes-style): closed threads with rich content spawn a background reviewer that appends lessons to ~/.threadkeeper/lessons.md. Every CLI's per-user instructions file references this path, so the same procedural knowledge surfaces in Claude Code, Codex, Gemini, and Copilot. Claude-specific ~/.claude/skills/*/SKILL.md is an optional secondary output when frontmatter auto-triggering adds value.
- Cross-session signaling: broadcast / whisper / inbox / wait between concurrent sessions across different CLIs.
Quickstart
The shortest path — PyPI + pipx (recommended):
pipx install 'threadkeeper[semantic]' && thread-keeper-setup
thread-keeper-setup detects every CLI you have installed (Claude
Code / Claude Desktop / Codex CLI + desktop / Gemini / Copilot / VS
Code), registers the MCP server in each one's config, copies hooks to
~/.threadkeeper/hooks/, and writes a managed instructions block into
each CLI's per-user instructions file (CLAUDE.md / AGENTS.md /
GEMINI.md / copilot-instructions.md — Claude Desktop and VS Code
have no global instructions file, so that step is skipped for them).
Restart your CLI of choice. The SessionStart hook injects a brief on
first message; no manual brief() call required.
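The "managed instructions block" step can be pictured as an idempotent upsert between two marker lines. The sketch below is only an illustration under assumptions: the marker strings and the function name upsert_managed_block are hypothetical and are not taken from thread-keeper-setup itself.

```python
import re

# Hypothetical markers; the real installer may delimit its block differently.
BEGIN = "<!-- thread-keeper:begin -->"
END = "<!-- thread-keeper:end -->"


def upsert_managed_block(existing: str, body: str) -> str:
    """Return `existing` with the managed block replaced, or appended if absent."""
    block = f"{BEGIN}\n{body.strip()}\n{END}"
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(existing):
        # Replace in place so user-written text around the block is preserved.
        return pattern.sub(lambda _m: block, existing, count=1)
    if not existing or existing.endswith("\n\n"):
        sep = ""
    elif existing.endswith("\n"):
        sep = "\n"
    else:
        sep = "\n\n"
    return f"{existing}{sep}{block}\n"


original = "# My notes\n"
once = upsert_managed_block(original, "Read ~/.threadkeeper/lessons.md")
twice = upsert_managed_block(once, "Read ~/.threadkeeper/lessons.md")
```

Running the function twice with the same body yields the same text, which is the property that makes re-running a setup step safe.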
Alternative installs
If you don't have pipx and don't want to install it:
# uv (Rust-fast Python tool runner) — no clone, single binary on PATH
uv tool install 'threadkeeper[semantic]' && thread-keeper-setup
# Plain pip into a venv
python3 -m venv ~/.threadkeeper-venv
~/.threadkeeper-venv/bin/pip install 'threadkeeper[semantic]'
~/.threadkeeper-venv/bin/thread-keeper-setup
For development (editable install from a git checkout) or to track the bleeding edge:
# One-liner installer — clones to ~/thread-keeper, makes a venv,
# editable-installs, wires every detected CLI. Idempotent — re-run to
# update (it git-pulls + reinstalls).
curl -fsSL https://raw.githubusercontent.com/po4erk91/thread-keeper/main/install.sh | bash -s -- --semantic
# Or fully manual
git clone https://github.com/po4erk91/thread-keeper ~/thread-keeper
cd ~/thread-keeper && python3 -m venv .venv
.venv/bin/pip install -e '.[semantic]'
.venv/bin/thread-keeper-setup
To preview without writing anything:
thread-keeper-setup --dry-run
Multi-CLI integration
| CLI | MCP config | Instructions file | Hooks | Transcripts ingested |
|---|---|---|---|---|
| Claude Code | ~/.claude.json mcpServers | ~/.claude/CLAUDE.md | ~/.claude/settings.json hooks | ~/.claude/projects/**/*.jsonl |
| Claude Desktop | ~/Library/Application Support/Claude/claude_desktop_config.json mcpServers (macOS); %APPDATA%\Claude\… (Win); ~/.config/Claude/… (Linux) | none (GUI-only) | not supported by the app | none (chats live in Electron IndexedDB) |
| Codex (CLI + desktop) | ~/.codex/config.toml [mcp_servers] (shared between CLI and Codex.app) | ~/.codex/AGENTS.md | not supported | ~/.codex/sessions/**/rollout-*.jsonl |
| Gemini | ~/.gemini/settings.json mcpServers | ~/.gemini/GEMINI.md | ~/.gemini/settings.json hooks | ~/.gemini/tmp/<user>/chats/session-*.jsonl |
| Copilot | ~/.copilot/mcp-config.json mcpServers | ~/.copilot/copilot-instructions.md | ~/.copilot/hooks.json | ~/.copilot/session-store.db (sqlite) |
| VS Code | ~/Library/Application Support/Code/User/mcp.json servers (macOS); %APPDATA%\Code\User\mcp.json (Win); ~/.config/Code/User/mcp.json (Linux) | none (per-workspace only) | not supported | none (extensions own their history) |
Every CLI that produces parseable transcripts feeds the same
dialog_messages table with a source tag, so dialog_search() finds
matches regardless of where the conversation happened. Claude Desktop
and the VS Code adapter are the exceptions — MCP registration only;
their chats don't reach the table for now (Electron IndexedDB on the
Claude Desktop side; per-extension stores on the VS Code side).
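To make the "one table, one source tag" idea concrete, here is a minimal sketch using an in-memory SQLite database. The column set and the helper search_dialog are assumptions made for illustration; the real dialog_messages schema and the real dialog_search() tool may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real table likely has more columns.
conn.execute(
    "CREATE TABLE dialog_messages ("
    " id INTEGER PRIMARY KEY,"
    " source TEXT NOT NULL,"
    " session_id TEXT NOT NULL,"
    " role TEXT NOT NULL,"
    " content TEXT NOT NULL)"
)
rows = [
    ("claude_code", "s1", "user", "always run pytest before committing"),
    ("codex", "s2", "user", "rename the adapter module"),
    ("gemini", "s3", "assistant", "pytest finished with no failures"),
]
conn.executemany(
    "INSERT INTO dialog_messages (source, session_id, role, content)"
    " VALUES (?, ?, ?, ?)",
    rows,
)


def search_dialog(conn: sqlite3.Connection, needle: str) -> list:
    """Substring search over every source; returns (source, content) pairs."""
    cur = conn.execute(
        "SELECT source, content FROM dialog_messages"
        " WHERE content LIKE ? ORDER BY id",
        (f"%{needle}%",),
    )
    return cur.fetchall()


hits = search_dialog(conn, "pytest")
```

Because every adapter writes into the same table, a single query returns matches from Claude Code and Gemini side by side, each labeled with its source.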
VS Code's user-level mcp.json is the central host that every
MCP-aware VS Code extension consumes — GitHub Copilot Chat, the
Anthropic Claude IDE plugin, the OpenAI Codex IDE plugin, Continue,
Cline, … — so a single registration there reaches all of them at once.
Adding a new CLI = one file under threadkeeper/adapters/ implementing
the CLIAdapter contract. See CONTRIBUTING.md.
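As a rough picture of what a one-file adapter might involve, the sketch below defines a small protocol and a toy implementation. Everything here is hypothetical: the method names, the Message type, and the ".example_cli" directory are illustrative assumptions, and the actual CLIAdapter contract is the one documented in CONTRIBUTING.md.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, Iterator, Protocol


@dataclass(frozen=True)
class Message:
    source: str
    role: str
    content: str


class CLIAdapterSketch(Protocol):
    """Hypothetical shape of an adapter; not the project's real contract."""

    name: str

    def is_installed(self, home: Path) -> bool: ...
    def transcript_paths(self, home: Path) -> Iterable[Path]: ...
    def parse(self, path: Path) -> Iterator[Message]: ...


class JsonlAdapter:
    """Toy adapter for an imaginary CLI that logs one JSON object per line."""

    name = "example_cli"

    def is_installed(self, home: Path) -> bool:
        return (home / ".example_cli").is_dir()

    def transcript_paths(self, home: Path) -> Iterable[Path]:
        return sorted((home / ".example_cli").glob("sessions/*.jsonl"))

    def parse(self, path: Path) -> Iterator[Message]:
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip malformed lines instead of aborting the file
                if not isinstance(record, dict):
                    continue
                yield Message(
                    self.name,
                    record.get("role", "unknown"),
                    record.get("content", ""),
                )
```

Keeping detection, path discovery, and parsing as separate methods means an ingest loop can treat every CLI the same way and only the parsing details differ per file.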
Core systems
Spawn — primary parallelism primitive
spawn(prompt, slim=True, role=..., visible=False, ...) launches a child
Claude session via a claude -p subprocess. By default slim=True: the
child loads only the thread-keeper MCP, with no embeddings and no third-party
servers, and uses about 500 MB RSS versus about 1.3 GB for a full child.
Rule of thumb for the parent: spawn when the work splits into two or more
independent, modular units of at least 5 minutes each.
A daemon measures combined child RSS every 10 s; admission control
refuses a new spawn that would exceed THREADKEEPER_SPAWN_BUDGET_MB
(3 GB default). Slim children that need semantic search delegate to the
parent via search_via_parent — no per-child copy of sentence-transformers.
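The admission check described above can be sketched as a simple budget comparison. This is an assumption-laden illustration: the class SpawnBudget, its admits method, and the idea of projecting the new child's footprint from the approximate 500 MB / 1.3 GB figures are hypothetical, and the real daemon may account for memory differently.

```python
from dataclasses import dataclass

# Approximate per-child footprints quoted in this section.
SLIM_CHILD_MB = 500
FULL_CHILD_MB = 1300


@dataclass
class SpawnBudget:
    # Mirrors the THREADKEEPER_SPAWN_BUDGET_MB default; 0 disables the cap.
    budget_mb: int = 3072

    def admits(self, current_children_rss_mb: float, slim: bool = True) -> bool:
        """Return True if one more child fits under the combined RSS cap."""
        if self.budget_mb == 0:
            return True
        estimate = SLIM_CHILD_MB if slim else FULL_CHILD_MB
        return current_children_rss_mb + estimate <= self.budget_mb


budget = SpawnBudget()
can_add_slim = budget.admits(2500)              # 2500 + 500 fits under 3072
can_add_full = budget.admits(2000, slim=False)  # 2000 + 1300 does not
```

Under these numbers the default budget holds about six slim children but only two full ones, which is one reason slim is the default.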
Learning loop (hermes-style)
Four loops materialize knowledge into Anthropic-style Skill files
(SKILL.md under each detected CLI's skills directory — Claude's
~/.claude/skills/, Codex's ~/.codex/skills/, plus the canonical
~/.threadkeeper/skills/ mirror) with a CLI-agnostic
~/.threadkeeper/lessons.md fallback for CLIs that don't auto-trigger
on the Skill format (Gemini / Copilot / bare MCP clients):
- Auto-review on close_thread: when a closed thread is rich (≥5 notes, ≥2 insight/move), close_thread spawns a slim child with SKILL_REVIEW_PROMPT + the thread's notes. The prompt is rubric-form (Q1–Q5 yes/no) with explicit positive examples for incident-vs-rule classification. The fork also receives a "recently active skills" block so it prefers PATCHing existing umbrellas over creating new ones (Hermes Agent v0.12's active-update bias). Child appends a lesson via lesson_append, optionally mirrors to ~/.claude/skills/<name>/SKILL.md, then closes with mark_skill_materialized. Opt in with THREADKEEPER_AUTO_REVIEW=1.
- Shadow-review daemon: every THREADKEEPER_SHADOW_REVIEW_INTERVAL_S seconds (default off; 15 min recommended), scans the diff of dialog_messages since the last cursor across all CLIs. The window filters internal review-child sessions (no self-pollution) and strips adapter [tool_result] / [tool_call] noise, following Hermes v0.12's "clean context" rule. If ≥500 chars of meaningful signal remain, spawns a slim observer child that decides on class-level learning. Idempotent through events.kind='shadow_review_pass'.
- Extract daemon: every THREADKEEPER_EXTRACT_INTERVAL_S seconds (default off; 10 min recommended), scans recent dialog_messages with heuristic matchers (locale-aware "I want / next time / always" patterns, headers + insight markers, bullet regularities, paraphrase clusters via cosine ≥ 0.80) and enqueues candidates in extract_candidates.status='pending' for the agent to review via review_candidates() / accept_candidate(). The same self-pollution filter as shadow_review excludes internal review-child sessions. Where shadow extracts CLASS-LEVEL durable rules, extract harvests PER-INCIDENT decision-shaped utterances. This sidesteps the empirical problem that agents focused on their primary task don't call note() / verbatim_user() on their own.
- Autonomous Curator: every THREADKEEPER_CURATOR_INTERVAL_S seconds (default off; 7 days recommended), spawns a slim child that reviews the EXISTING lessons.md + skill_usage inventory and writes ~/.threadkeeper/curator/REPORT-<isodate>.md with KEEP / PATCH / CONSOLIDATE / PRUNE recommendations. Pinned and foreground-authored entries are marked [PROTECTED] in the inventory so the curator never proposes destructive changes against them. Phase 1 is advisory-only: the user reviews the REPORT and applies changes manually. Inspired by Hermes Agent v0.12's hermes curator cron agent.
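Two of the gates above are simple enough to sketch: the "rich thread" test for auto-review and the 500-character signal threshold for shadow review. The sketch rests on assumptions: "≥2 insight/move" is read as at least two notes whose kind is insight or move, tool noise is assumed to appear as whole lines starting with [tool_result] or [tool_call], and all function names are hypothetical.

```python
import re
from collections import Counter


def is_rich_thread(note_kinds: list) -> bool:
    """Assumed reading: at least 5 notes, at least 2 of kind insight or move."""
    counts = Counter(note_kinds)
    return len(note_kinds) >= 5 and counts["insight"] + counts["move"] >= 2


# Assumed noise format: a whole line beginning with one of the adapter tags.
TOOL_NOISE = re.compile(r"^\[(?:tool_result|tool_call)\].*$", re.MULTILINE)


def meaningful_chars(messages: list) -> int:
    """Count characters left after stripping tool-noise lines."""
    return sum(len(TOOL_NOISE.sub("", m).strip()) for m in messages)


def should_spawn_observer(messages: list, min_chars: int = 500) -> bool:
    return meaningful_chars(messages) >= min_chars


rich = is_rich_thread(["note", "note", "note", "insight", "move"])
noisy_window = ["[tool_call] grep foo", "hello"]
signal = meaningful_chars(noisy_window)
```

Stripping noise before measuring keeps a window full of tool output from triggering an observer child that would have little to learn from.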
Dialectic user model
A model of you, accumulated as you use the agent. Its operations are
dialectic_claim, dialectic_evidence (support / contradict / clarifying),
dialectic_synthesis, and dialectic_supersede. Confidence is a
Honcho-inspired smoothed ratio (s-c)/(s+c+3), where s and c count supporting
and contradicting evidence, mapped to low / medium / high / disputed.
Claims are grouped by domain (style, values, workflow, ...) in brief().
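A short worked example of the smoothed ratio may help, taking s as the number of supporting and c as the number of contradicting pieces of evidence. The formula is the one quoted above; the band thresholds (0.5, 0.25, and the "disputed" rule) are hypothetical values chosen only to make the example runnable, since the source does not state them.

```python
def smoothed_ratio(support: int, contradict: int) -> float:
    """(s - c) / (s + c + 3): the +3 damps claims backed by little evidence."""
    return (support - contradict) / (support + contradict + 3)


def confidence_band(support: int, contradict: int) -> str:
    """Map the ratio to a label. Thresholds are illustrative assumptions."""
    ratio = smoothed_ratio(support, contradict)
    if support > 0 and contradict > 0 and abs(ratio) < 0.1:
        return "disputed"
    if ratio >= 0.5:
        return "high"
    if ratio >= 0.25:
        return "medium"
    return "low"


one_support = smoothed_ratio(1, 0)   # 1 / 4
five_support = smoothed_ratio(5, 0)  # 5 / 8
balanced = confidence_band(3, 3)     # ratio is 0 with evidence on both sides
```

The smoothing term means a single supporting observation scores 0.25 rather than 1.0, so confidence grows only as evidence accumulates.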
i18n bundle
All multilingual regex and prompt fragments live in
threadkeeper/i18n.py — the rest of the codebase stays English-only.
Currently ships ten locales: English, Mandarin Chinese, Hindi,
Spanish, Portuguese, French, German, Arabic, Russian, Japanese
(~82 % of the world's speakers).
Adding a new language is a two-file PR — see CONTRIBUTING.md.
Configuration
The most-used env knobs (full list in threadkeeper/config.py):
| Knob | Default | Purpose |
|---|---|---|
| THREADKEEPER_DB | ~/.threadkeeper/db.sqlite | SQLite file |
| THREADKEEPER_AUTO_REVIEW | "" (off) | auto-review on close_thread |
| THREADKEEPER_SHADOW_REVIEW_INTERVAL_S | 0 (off) | shadow daemon tick (s) |
| THREADKEEPER_SHADOW_REVIEW_WINDOW_S | 900 | sliding window for shadow scan (s) |
| THREADKEEPER_EXTRACT_INTERVAL_S | 0 (off) | extract daemon tick (s); 600 = 10 min recommended |
| THREADKEEPER_EXTRACT_WINDOW_MIN | 30 | sliding dialog window per extract pass (min) |
| THREADKEEPER_CURATOR_INTERVAL_S | 0 (off) | curator daemon tick (s); 604800 = 7d recommended |
| THREADKEEPER_CURATOR_MIN_LESSONS | 3 | min lessons before curator engages |
| THREADKEEPER_CURATOR_DESTRUCTIVE | "" (advisory) | when "1": curator child applies its own PATCH/PRUNE/CONSOLIDATE directly instead of writing advisory REPORT only |
| THREADKEEPER_SPAWN_BUDGET_MB | 3072 | combined child RSS cap (MB); 0 disables |
| THREADKEEPER_INGEST_INTERVAL_S | 30 | transcript ingest tick (s) |
| THREADKEEPER_NO_EMBEDDINGS | "" | force-disable sentence-transformers |
| THREADKEEPER_SKILL_NUDGE_INTERVAL | 10 | events between skill_hint nudges |
Persist them via ~/.claude/settings.json's env block (Claude Code) or
the equivalent env section in each CLI's config. Hot-config reload is
not available yet; it is tracked as open work.
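For readers who want to see how env-driven defaults of this kind are typically resolved, here is a small sketch. The knob names and default values come from the table above; the Settings dataclass, the load_settings function, and the rule that a malformed integer falls back to the default are assumptions, not a description of threadkeeper/config.py.

```python
import os
from dataclasses import dataclass
from pathlib import Path


def _int_env(env, name: str, default: int) -> int:
    """Read an integer knob; fall back to the default if unset or malformed."""
    raw = env.get(name, "").strip()
    try:
        return int(raw) if raw else default
    except ValueError:
        return default


@dataclass(frozen=True)
class Settings:
    db_path: Path
    auto_review: bool
    shadow_review_interval_s: int
    spawn_budget_mb: int
    ingest_interval_s: int


def load_settings(env=None) -> Settings:
    env = os.environ if env is None else env
    db_raw = env.get("THREADKEEPER_DB") or "~/.threadkeeper/db.sqlite"
    return Settings(
        db_path=Path(os.path.expanduser(db_raw)),
        auto_review=env.get("THREADKEEPER_AUTO_REVIEW", "") == "1",
        shadow_review_interval_s=_int_env(
            env, "THREADKEEPER_SHADOW_REVIEW_INTERVAL_S", 0
        ),
        spawn_budget_mb=_int_env(env, "THREADKEEPER_SPAWN_BUDGET_MB", 3072),
        ingest_interval_s=_int_env(env, "THREADKEEPER_INGEST_INTERVAL_S", 30),
    )


defaults = load_settings({})
custom = load_settings(
    {"THREADKEEPER_AUTO_REVIEW": "1", "THREADKEEPER_SPAWN_BUDGET_MB": "abc"}
)
```

Passing a plain dict instead of os.environ keeps the loader easy to exercise in tests without touching the process environment.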
Storage
~/.threadkeeper/db.sqlite (overridable via THREADKEEPER_DB). WAL
mode lets several sessions read while one writes; SQLite still
serializes the writes themselves. Optional notes_vec / dialog_vec
HNSW indexes through sqlite-vec for sub-linear semantic search;
fallback to Python-side cosine when the extension is missing.
One file. Backup = cp. Wipe memory = rm.
Hooks and small runtime artifacts: ~/.threadkeeper/hooks/.
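The storage choices above can be illustrated with the standard library alone: opening a database in WAL mode and ranking vectors with a pure-Python cosine similarity, as a fallback path might. The function names open_store, cosine, and top_k, and the busy-timeout value, are assumptions for this sketch rather than the project's actual code.

```python
import math
import sqlite3


def open_store(path: str) -> sqlite3.Connection:
    """Open a SQLite file in WAL mode with a busy timeout for contended writes."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s for a lock
    return conn


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: list, candidates: dict, k: int = 3) -> list:
    """Linear scan over all candidates, returning the k most similar keys."""
    scored = sorted(
        ((cosine(query, vec), key) for key, vec in candidates.items()),
        reverse=True,
    )
    return [(key, score) for score, key in scored[:k]]


best = top_k([1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}, k=2)
```

The linear scan costs time proportional to the number of stored vectors, which is why an index is preferred once the store grows.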
Verifying ingest across CLIs
python scripts/tk_verify_ingest.py
Walks every installed CLI adapter, parses recent transcripts in an isolated tempdir DB, reports per-source message counts and any silent parse failures. Read-only with respect to live state.
Tests
pip install -e '.[semantic,dev]'
python -m pytest
412 tests passing on Python 3.11 / 3.12 / 3.13 (1 skipped). CI runs the suite on every push and PR.
Project layout
threadkeeper/
├── server.py # MCP entry: python -m threadkeeper.server
├── _setup.py # `thread-keeper-setup` installer
├── config.py # env-driven defaults
├── db.py # SQLite schema + sqlite-vec loader
├── identity.py # session, self-cid, daemon launchers
├── ingest.py # adapter-driven transcript ingest
├── brief.py # render_brief / render_context
├── shadow_review.py # autonomous learning observer
├── i18n.py # 10 locales of regex + prompt bundles
├── adapters/ # one file per supported CLI
│ ├── claude_code.py
│ ├── claude_desktop.py
│ ├── codex.py
│ ├── gemini.py
│ ├── copilot.py
│ └── vscode.py
└── tools/ # @mcp.tool entries — 83 of them
├── threads.py
├── peers.py
├── spawn.py
├── skills.py
└── ...
Detailed map in docs/ARCHITECTURE.md. Open work in docs/ROADMAP.md and the Issues tab.
Contributing
PRs welcome — see CONTRIBUTING.md for the project
map, test workflow, and recipes for adding a new CLI adapter or a new
locale. Look for the good-first-issue label.
License
MIT — see LICENSE.