claude-memory
Persistent hybrid memory for AI coding assistants.
AI coding assistants forget everything between sessions. You explain your architecture on Monday, and by Tuesday it's asking what framework you use. claude-memory fixes this by giving your assistant a persistent, searchable memory that loads the right context automatically.
It combines vector similarity search (ChromaDB) with keyword matching (BM25-style scoring) and a biological memory model -- five mechanisms inspired by how human memory actually works: temporal decay, evergreen exemptions, salience weighting, retrieval strengthening, and consolidation. A delta-sync indexer uses SHA-256 hashes to skip unchanged files, making re-indexing fast enough to run on every boot. Daily logs and periodic notes (weekly/monthly/quarterly/yearly) give each new session immediate context about what happened recently.
How It Works
          +------------------+
          |   Your Notes /   |
          |   Docs / Vault   |
          +--------+---------+
                   |
          claude-memory index
         (delta sync: SHA-256)
                   |
          +--------v---------+
          |     ChromaDB     |
          |  (vector store)  |
          +--------+---------+
                   |
          claude-memory search
        (hybrid: vector + BM25)
                   |
     +-------------+-------------+
     |                           |
Vector Similarity          Keyword Match
(semantic meaning)         (exact terms)
     |                           |
     +-------------+-------------+
                   |
            Temporal Decay
      (recent = higher rank,
       evergreen = no decay)
                   |
          +--------v---------+
          |  Ranked Results  |
          |  (JSON or human  |
          |    readable)     |
          +------------------+
The indexer watches your markdown files. When something changes, it re-chunks and re-embeds only the changed files. Search combines two signals -- vector similarity catches semantic matches ("deployment process" finds "how to ship to production") while keyword scoring catches exact terms that embeddings sometimes miss.
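The weighted blend can be sketched in a few lines. This is an illustration of the idea, not the library's internals; it uses the default weights from the configuration reference (0.70 vector, 0.30 keyword):

```python
# Illustrative sketch of hybrid scoring (not the library's internals).
# Both input scores are assumed normalized to [0, 1].
def hybrid_score(vector_sim: float, keyword_score: float,
                 vector_weight: float = 0.70,
                 keyword_weight: float = 0.30) -> float:
    """Weighted blend of semantic similarity and exact keyword match."""
    return vector_weight * vector_sim + keyword_weight * keyword_score

# A strong semantic match still ranks well with no exact keyword hit:
print(round(hybrid_score(0.9, 0.0), 2))  # 0.63
# An exact-term hit lifts a middling semantic match:
print(round(hybrid_score(0.5, 1.0), 2))  # 0.65
```

Tuning `vector_weight` and `keyword_weight` in config.json shifts this balance without changing the rest of the pipeline.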
The Biological Memory Model
The scoring pipeline applies five mechanisms modeled after biological memory:
- Temporal Decay (Ebbinghaus forgetting curve) -- Exponential penalty based on file age. Half-life defaults to 30 days: a month-old file scores 50% of a fresh one.
- Evergreen Exemptions (neocortical long-term storage) -- Architecture docs, decision records, and config files are exempt from decay. They always rank at full strength.
- Salience Weighting (amygdala tagging) -- Important files decay slower. A file with salience 5x effectively ages 5x slower. BRIEFING.md at salience 5.0 means a 150-day-old file decays like it's 30 days old.
- Retrieval Strengthening (spaced repetition) -- Every time a file appears in search results, it gets a 5% boost (capped at 30%). Files you keep coming back to stay accessible -- just like memories that get reinforced through recall.
- Consolidation Bonus (hippocampal replay) -- Files mentioned in periodic notes (weekly, monthly, quarterly, yearly) get a 15% boost per level, capped at 50%. If something makes it into your quarterly review, it's clearly important.

The combined formula: score = base * decay(effective_age) * (1 + retrieval_boost) * (1 + consolidation_bonus), where effective_age = raw_age / salience.
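Written out as code, the formula looks like this. The parameter names are illustrative and the defaults mirror the documented config values; this is a sketch of the math, not the package's implementation:

```python
import math

# Illustrative sketch of the combined biological-memory formula.
# Parameter names are assumptions; defaults mirror documented config values.
def biological_score_sketch(base: float, raw_age_days: float,
                            salience: float = 1.0,
                            retrieval_boost: float = 0.0,
                            consolidation_bonus: float = 0.0,
                            half_life_days: float = 30.0,
                            evergreen: bool = False) -> float:
    effective_age = raw_age_days / salience        # salience slows aging
    decay = 1.0 if evergreen else 0.5 ** (effective_age / half_life_days)
    return base * decay * (1 + retrieval_boost) * (1 + consolidation_bonus)

# A 150-day-old file at salience 5.0 decays like a 30-day-old one:
print(round(biological_score_sketch(1.0, 150, salience=5.0), 2))  # 0.5
# Evergreen files never decay:
print(biological_score_sketch(1.0, 365, evergreen=True))  # 1.0
```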
Quick Start
Install
pip install claude-memory
Or install from source:
git clone https://github.com/Haustorium12/claude-memory.git
cd claude-memory
pip install -e .
Initialize
claude-memory init
This creates ~/.claude-memory/config.json and the data directories.
Configure Your Paths
Edit ~/.claude-memory/config.json and add the directories you want indexed:
{
"index_paths": [
{"path": "/home/you/notes", "depth": 3},
{"path": "/home/you/projects/docs", "depth": 2}
]
}
Index
claude-memory index # Delta sync (fast -- only changed files)
claude-memory index --full # Full rebuild from scratch
claude-memory index --check # Dry run -- show what would change
claude-memory index --stats # Show index statistics
Search
claude-memory search "deployment process"
claude-memory search "auth bug" -c project
claude-memory search "meeting notes" -n 5
claude-memory search "api design" --json # For piping to other tools
claude-memory search "architecture" --no-decay # Ignore temporal decay
claude-memory search "database" --vector-only # Pure semantic search
claude-memory search "TODO" --keyword-only # Pure keyword search
claude-memory access-stats # Show retrieval & consolidation stats
Daily Logs
claude-memory daily --init # Create today's log
claude-memory daily --append "Fixed auth bug" # Log an event
claude-memory daily --load # Show today + yesterday
claude-memory daily --today # Show today only
Periodic Notes
Periodic notes synthesize daily logs into higher-level summaries: weekly, monthly, quarterly, and yearly. Each level pulls from the one below it -- weekly notes pull from daily logs, monthly from weekly, and so on.
# Show/create current period notes
claude-memory weekly # This week's note
claude-memory monthly # This month's note
claude-memory quarterly # This quarter's note
claude-memory yearly # This year's note
# Synthesize from lower-level notes
claude-memory weekly --synthesize # Pull daily logs into weekly template
claude-memory monthly --synthesize # Pull weekly notes into monthly template
claude-memory quarterly --synthesize # Pull monthly notes into quarterly template
claude-memory yearly --synthesize # Pull quarterly notes into yearly template
# Append notes to a period
claude-memory weekly --append "Shipped the new auth flow"
claude-memory monthly --append "Promoted to senior engineer"
# List all notes for a period
claude-memory weekly --list
claude-memory monthly --list
# Load current week + month context (for session boot)
claude-memory periodic --load
# Show all current period notes
claude-memory periodic --all
The synthesis workflow: run --synthesize to collect source material into the template, then edit the template sections to create a proper summary. The raw source material is appended below a separator for reference.
Wiring Into Claude Code
Add this to your project's CLAUDE.md (or ~/.claude/CLAUDE.md for global use):
# Memory System
On startup, load recent context:
claude-memory daily --load
Before answering questions about the project, search memory:
claude-memory search "relevant query" --json
Log significant events:
claude-memory daily --append "description of what happened"
See examples/claude_code_setup.md for a complete integration guide.
Configuration Reference
All settings live in ~/.claude-memory/config.json. Every value has a sensible default, so you only need to set what you want to change.
| Key | Default | Description |
|---|---|---|
| index_paths | [] | Directories to index: [{"path": "...", "depth": 3}] |
| index_extensions | [".md"] | File extensions to index |
| skip_dirs | ["chromadb", ".git", ...] | Directories to skip during indexing |
| chunk_size_words | 400 | Words per chunk (target) |
| chunk_overlap_words | 80 | Overlap between chunks |
| vector_weight | 0.70 | Weight for vector similarity in hybrid score |
| keyword_weight | 0.30 | Weight for keyword match in hybrid score |
| decay_half_life_days | 30 | Days until a file's score is halved by decay |
| evergreen_patterns | ["README.md", ...] | Path patterns exempt from decay |
| categories | ["project", ...] | Valid category names |
| category_rules | [{"pattern": ..., "category": ...}] | Path-to-category mapping rules |
| weekly_dir | data_dir/weekly | Weekly notes directory |
| monthly_dir | data_dir/monthly | Monthly notes directory |
| quarterly_dir | data_dir/quarterly | Quarterly notes directory |
| yearly_dir | data_dir/yearly | Yearly notes directory |
| collection_name | "claude_memory" | ChromaDB collection name |
| default_n_results | 8 | Default number of search results |
| salience_patterns | {"WELCOME.md": 5.0, ...} | Path pattern -> salience multiplier |
| retrieval_strengthening.boost_per_access | 0.05 | Score boost per recent search hit |
| retrieval_strengthening.window_days | 7 | Only count accesses within N days |
| retrieval_strengthening.max_boost | 0.30 | Maximum retrieval boost |
| consolidation.bonus_per_level | 0.15 | Bonus per periodic note mention level |
| consolidation.max_bonus | 0.50 | Maximum consolidation bonus |
Environment Variables
Any config key can be overridden with CLAUDE_MEMORY_<KEY> (uppercase). Key path overrides:
- CLAUDE_MEMORY_CONFIG -- Path to config file
- CLAUDE_MEMORY_DATA_DIR -- Data directory
- CLAUDE_MEMORY_CHROMA_DIR -- ChromaDB directory
- CLAUDE_MEMORY_DAILY_LOG_DIR -- Daily log directory
- CLAUDE_MEMORY_WEEKLY_DIR -- Weekly notes directory
- CLAUDE_MEMORY_MONTHLY_DIR -- Monthly notes directory
- CLAUDE_MEMORY_QUARTERLY_DIR -- Quarterly notes directory
- CLAUDE_MEMORY_YEARLY_DIR -- Yearly notes directory
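The precedence convention can be sketched as follows. This illustrates the environment-over-config-over-default ordering described above, not the library's actual loader:

```python
import os

# Illustrative sketch of the override precedence (not the actual loader):
# environment variable CLAUDE_MEMORY_<KEY> wins over config.json,
# which wins over the built-in default.
def resolve_setting(key: str, config: dict, default=None):
    env_value = os.environ.get(f"CLAUDE_MEMORY_{key.upper()}")
    if env_value is not None:
        return env_value
    return config.get(key, default)

os.environ["CLAUDE_MEMORY_DATA_DIR"] = "/tmp/memdata"
print(resolve_setting("data_dir", {"data_dir": "~/.claude-memory/data"}))
# -> /tmp/memdata (the environment variable wins)
```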
Python API
from claude_memory import (
hybrid_search, delta_index, load_recent, get_config,
init_period, synthesize_period, load_periodic_context,
biological_score, record_access, show_access_stats,
)
config = get_config()
# Search
results = hybrid_search("deployment process", config=config)
for r in results:
print(f"{r['score']:.3f} {r['metadata']['filename']}")
# Index
delta_index(config=config)
# Daily logs
context = load_recent(config=config)
# Periodic notes
init_period("weekly", config=config)
synthesize_period("weekly", config=config)
periodic_context = load_periodic_context(config=config)
# Biological model
score = biological_score(0.85, "/path/to/file.md", config)
record_access("/path/to/file.md", config) # Track retrieval
show_access_stats(config) # Display stats
Optional Modules
Three optional modules extend the core memory system with additional capabilities. They have no extra dependencies beyond what's already required.
Compaction Watcher
Claude Code periodically compresses its conversation context when the window fills up. This is the #1 cause of AI amnesia -- your assistant suddenly forgets everything you discussed. The compaction watcher monitors for these events so you can track when they happen and correlate them with lost context.
claude-memory compaction --watch # Monitor in real-time (blocking)
claude-memory compaction --scan # Scan past conversations for events
claude-memory compaction --stats # Show compaction statistics
claude-memory compaction --log # Show the event log
The watcher detects compaction by monitoring Claude Code's JSONL conversation files for significant size drops (>20%). Each event is logged with timestamp, project, and size delta.
from claude_memory.extras.compaction_watcher import (
watch_compactions, scan_for_compactions, get_compaction_stats,
)
# Scan for past events
events = scan_for_compactions()
# Watch in real-time with a callback
def on_compaction(event):
print(f"Compaction detected! {event['drop_percent']}% size drop")
watch_compactions(interval=30, callback=on_compaction)
Context Compression Language (CCL)
CCL is a shorthand notation system that compresses natural language instructions into a compact format that LLMs can still interpret. Useful for system prompts, CLAUDE.md files, and any context that gets loaded into every conversation.
Four compression tiers:
- T0 -- Full English (no compression)
- T1 -- Light: abbreviations, dropped articles (30-40% savings)
- T2 -- Medium: symbols, arrows, compressed structure (50-60% savings)
- T3 -- Heavy: maximum density, telegraphic (65-75% savings)
claude-memory ccl --analyze CLAUDE.md # Show savings per tier
claude-memory ccl --encode CLAUDE.md --tier 2 # Compress to T2
claude-memory ccl --encode CLAUDE.md -o out.md # Save compressed output
claude-memory ccl --decode compressed.md # Expand back to English
claude-memory ccl --dict # Show compression dictionary
The key insight: LLMs are trained on enough internet text that they can decode most common abbreviations. "wr->VLT" is unambiguous to Claude when "vault" is defined earlier in the context.
from claude_memory.extras.ccl import encode, decode, analyze
# Compress text
original = "Always write important decisions to the vault immediately."
compressed = encode(original, tier=2)
# Result: "@ wr [!] decisions->VLT NOW."
# Analyze savings
stats = analyze(original, tier=2)
print(f"Token savings: {stats['token_savings_percent']}%")
Vault Sync
Multi-machine memory synchronization via cloud folders (OneDrive, Dropbox, Google Drive, iCloud). Designate a cloud-synced folder as your "vault" -- the shared source of truth between machines.
claude-memory vault --init ~/OneDrive/claude-vault # Set vault path
claude-memory vault --status # Compare local vs vault
claude-memory vault --push # Push local to vault
claude-memory vault --pull # Pull vault to local
claude-memory vault --sync # Bidirectional (newer wins)
claude-memory vault --sync --dry-run # Preview without changes
The sync covers daily logs, weekly/monthly/quarterly/yearly notes, and metadata files. Conflict strategy: the newer file wins; files that exist on only one side are copied to the other.
from claude_memory.extras.vault_sync import (
init_vault, sync_to_vault, sync_from_vault,
bidirectional_sync, get_sync_status, append_to_changelog,
)
# Initialize vault
init_vault("~/OneDrive/claude-vault")
# Check status
status = get_sync_status()
print(f"Files to push: {len(status['local_only'])}")
print(f"Files to pull: {len(status['vault_only'])}")
# Sync
bidirectional_sync()
# Log to vault changelog
append_to_changelog("Shipped new auth flow")
The Backstory: Convergent Evolution
This project started as a personal filing system -- a way to organize notes, research, and project context so that each new Claude Code session could pick up where the last one left off. The architecture was simple: structured markdown files, a vector index for semantic search, and daily logs for session continuity.
Around the same time, the OpenClaw team was building their own memory system for AI agents, eventually formalized in their ACP (Agent Context Protocol) specification. When we discovered their work, the overlap was striking. Both systems independently arrived at the same core ideas:
- Chunked vector indexing with ~400-token windows and overlap
- Hybrid search combining embeddings with keyword matching
- Temporal decay to prioritize recent context
- Evergreen exemptions for foundational documents
- Delta sync to avoid re-indexing unchanged files
This kind of convergent evolution is a strong signal. When two independent efforts solve the same problem the same way, the solution is probably close to correct. We weren't copying OpenClaw -- we didn't know about their work until after building the first version. But seeing the convergence gave us confidence to formalize what we had into a reusable tool.
The key insight both systems share: AI assistants need memory that degrades gracefully. Not everything should persist equally. Yesterday's debugging session matters more than last month's. But your architecture decisions? Those should never fade. The combination of temporal decay with evergreen exemptions captures this naturally.
Design Decisions
Why ChromaDB? It's the simplest embedded vector database that actually works. No server to run, no infrastructure to manage. It stores everything in a local directory. For a memory system that needs to run on a developer's laptop alongside their code, embedded is the right call.
Why hybrid search instead of pure vector? Embeddings are great at semantic similarity but sometimes miss exact keyword matches. If you search for "PostgreSQL" and your notes say "PostgreSQL," pure vector search might rank a document about "database systems" higher. The keyword component (BM25-style term frequency with saturation) catches these exact matches and boosts them.
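The saturation idea can be sketched in one function (illustrative, not the library's exact scoring): each repeated occurrence of a term contributes less than the last, so a document can't dominate just by repeating a keyword. The k1 knob here is the conventional BM25 saturation parameter, assumed for this example:

```python
# BM25-style term-frequency saturation (illustrative sketch, not the
# library's exact scoring). Contribution grows with term frequency but
# flattens out: the bound is k1 + 1 as term_freq grows large.
def tf_saturation(term_freq: int, k1: float = 1.5) -> float:
    return term_freq * (k1 + 1) / (term_freq + k1)

print(round(tf_saturation(1), 2))   # 1.0
print(round(tf_saturation(10), 2))  # 2.17 -- ten mentions, nowhere near 10x
```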
Why temporal decay? Without it, your search results are dominated by large, old documents that have accumulated lots of indexed chunks. Decay naturally pushes recent, relevant context to the top while keeping old knowledge accessible if it's specifically queried.
Why delta sync? A full re-index of a large vault can take minutes. With SHA-256 hash comparison, a no-change run completes in under a second. This makes it practical to run the indexer on every boot or even on a file-watch trigger.
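The mechanism can be sketched as follows. The manifest filename and layout are assumptions for this example, not the package's actual storage:

```python
import hashlib
import json
from pathlib import Path

# Illustrative delta-sync sketch: re-index only files whose SHA-256
# changed since the last run. Manifest path/format are assumptions.
def changed_files(paths, manifest_path="manifest.json"):
    manifest_file = Path(manifest_path)
    manifest = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}
    changed = []
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if manifest.get(str(path)) != digest:  # new or modified file
            changed.append(path)
            manifest[str(path)] = digest
    manifest_file.write_text(json.dumps(manifest))
    return changed
```

On a no-change run, every hash matches the manifest and the function returns an empty list, so nothing is re-chunked or re-embedded.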
Why CP1252 safe_print? Windows consoles using CP1252 encoding will crash on Unicode characters that Python's print() tries to output. This is a real problem when indexing markdown files that contain emoji, special characters, or non-Latin text. The safe_print wrapper catches these encoding errors and replaces problematic characters instead of crashing. Every Windows developer has hit this bug; now you don't have to.
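A minimal sketch of such a wrapper (illustrative, not the package's actual implementation):

```python
import sys

# Illustrative safe_print sketch: if the console encoding (e.g. CP1252 on
# Windows) can't represent a character, degrade to replacement characters
# instead of crashing with UnicodeEncodeError.
def safe_print(*args, **kwargs):
    try:
        print(*args, **kwargs)
    except UnicodeEncodeError:
        encoding = sys.stdout.encoding or "ascii"
        safe_args = [str(a).encode(encoding, errors="replace").decode(encoding)
                     for a in args]
        print(*safe_args, **kwargs)

safe_print("indexing notes…")  # prints normally, or with '?' substitutions
```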
Contributing
This is a small, focused tool. If you're using it and find a bug or have an improvement, open an issue or PR. The codebase is intentionally simple -- four Python files, one dependency.
Areas where contributions would be welcome:
- Support for additional file formats beyond markdown
- Smarter chunking (heading-aware, code-block-aware)
- Watch mode for automatic re-indexing on file changes
- Integration examples for other AI assistants (Cursor, Copilot, etc.)
License
MIT -- do whatever you want with it.
Credits
Built by Sean (Haustorium12) and Claude (Anthropic).
Inspired by the convergent evolution between our filing cabinet architecture and the OpenClaw memory system, formalized in the ACP specification.
File details
Details for the file claude_memory_hx-0.3.0.tar.gz.
File metadata
- Download URL: claude_memory_hx-0.3.0.tar.gz
- Upload date:
- Size: 53.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 25f3186ffe58b235f69217a8345817786c2bfd9643d04cbb4ea949b214372d48 |
| MD5 | a28f89048936610a7c8e1153761579a8 |
| BLAKE2b-256 | 25045293df2555dc510237a5597f381c6e2d982f205a1db4535ee21e77fc9ffa |
File details
Details for the file claude_memory_hx-0.3.0-py3-none-any.whl.
File metadata
- Download URL: claude_memory_hx-0.3.0-py3-none-any.whl
- Upload date:
- Size: 54.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 12a3bcc21540dcfa71f4293162ea2035380fcb66df74b71806cc9a6e6c628285 |
| MD5 | ac81cdc14c74c8a21c873bac7716e4ee |
| BLAKE2b-256 | 46070c0c0fbaf96187092173b29eddb871daa192389c809c81ce94949571aaf4 |