
claude-memory

Persistent hybrid memory for AI coding assistants.

License: MIT | Python 3.9+

AI coding assistants forget everything between sessions. You explain your architecture on Monday, and by Tuesday it's asking what framework you use. claude-memory fixes this by giving your assistant a persistent, searchable memory that loads the right context automatically.

It combines vector similarity search (ChromaDB) with keyword matching (BM25-style scoring) and a biological memory model -- five mechanisms inspired by how human memory actually works: temporal decay, evergreen exemptions, salience weighting, retrieval strengthening, and consolidation. A delta-sync indexer uses SHA-256 hashes to skip unchanged files, making re-indexing fast enough to run on every boot. Daily logs and periodic notes (weekly/monthly/quarterly/yearly) give each new session immediate context about what happened recently.

How It Works

                         +------------------+
                         |   Your Notes /   |
                         |   Docs / Vault   |
                         +--------+---------+
                                  |
                          claude-memory index
                          (delta sync: SHA-256)
                                  |
                         +--------v---------+
                         |     ChromaDB     |
                         |  (vector store)  |
                         +--------+---------+
                                  |
                         claude-memory search
                         (hybrid: vector + BM25)
                                  |
                    +-------------+-------------+
                    |                           |
             Vector Similarity           Keyword Match
             (semantic meaning)          (exact terms)
                    |                           |
                    +-------------+-------------+
                                  |
                         Temporal Decay
                      (recent = higher rank,
                       evergreen = no decay)
                                  |
                         +--------v---------+
                         |  Ranked Results  |
                         |  (JSON or human  |
                         |   readable)      |
                         +------------------+

The indexer watches your markdown files. When something changes, it re-chunks and re-embeds only the changed files. Search combines two signals -- vector similarity catches semantic matches ("deployment process" finds "how to ship to production") while keyword scoring catches exact terms that embeddings sometimes miss.

The Biological Memory Model

The scoring pipeline applies five mechanisms modeled after biological memory:

  1. Temporal Decay (Ebbinghaus forgetting curve) -- Exponential penalty based on file age. Half-life defaults to 30 days: a month-old file scores 50% of a fresh one.

  2. Evergreen Exemptions (neocortical long-term storage) -- Architecture docs, decision records, and config files are exempt from decay. They always rank at full strength.

  3. Salience Weighting (amygdala tagging) -- Important files decay more slowly. A file with salience 5.0 effectively ages five times slower: at that salience, a 150-day-old BRIEFING.md decays as if it were 30 days old.

  4. Retrieval Strengthening (spaced repetition) -- Every time a file appears in search results, it gets a 5% boost (capped at 30%). Files you keep coming back to stay accessible -- just like memories that get reinforced through recall.

  5. Consolidation Bonus (hippocampal replay) -- Files mentioned in periodic notes (weekly, monthly, quarterly, yearly) get a 15% boost per level, capped at 50%. If something makes it into your quarterly review, it's clearly important.

The combined formula: score = base * decay(effective_age) * (1 + retrieval_boost) * (1 + consolidation_bonus) where effective_age = raw_age / salience.
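
A minimal sketch of that pipeline in Python (illustrative only -- the function and argument names here are assumptions, not the library's internals):

def biological_rank(base, age_days, salience=1.0, evergreen=False,
                    retrieval_boost=0.0, consolidation_bonus=0.0,
                    half_life_days=30):
    # Evergreen files never decay; everything else ages, slowed by salience.
    if evergreen:
        decay = 1.0
    else:
        effective_age = age_days / salience
        decay = 0.5 ** (effective_age / half_life_days)  # Ebbinghaus-style
    return base * decay * (1 + retrieval_boost) * (1 + consolidation_bonus)

# A 150-day-old file at salience 5.0 decays like a 30-day-old one:
print(biological_rank(0.85, age_days=150, salience=5.0))  # 0.425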

Quick Start

Install

pip install claude-memory

Or install from source:

git clone https://github.com/Haustorium12/claude-memory.git
cd claude-memory
pip install -e .

Initialize

claude-memory init

This creates ~/.claude-memory/config.json and the data directories.

Configure Your Paths

Edit ~/.claude-memory/config.json and add the directories you want indexed:

{
  "index_paths": [
    {"path": "/home/you/notes", "depth": 3},
    {"path": "/home/you/projects/docs", "depth": 2}
  ]
}

Index

claude-memory index          # Delta sync (fast -- only changed files)
claude-memory index --full   # Full rebuild from scratch
claude-memory index --check  # Dry run -- show what would change
claude-memory index --stats  # Show index statistics

Search

claude-memory search "deployment process"
claude-memory search "auth bug" -c project
claude-memory search "meeting notes" -n 5
claude-memory search "api design" --json          # For piping to other tools
claude-memory search "architecture" --no-decay    # Ignore temporal decay
claude-memory search "database" --vector-only     # Pure semantic search
claude-memory search "TODO" --keyword-only        # Pure keyword search
claude-memory access-stats                         # Show retrieval & consolidation stats

Daily Logs

claude-memory daily --init                     # Create today's log
claude-memory daily --append "Fixed auth bug"  # Log an event
claude-memory daily --load                     # Show today + yesterday
claude-memory daily --today                    # Show today only

Periodic Notes

Periodic notes synthesize daily logs into higher-level summaries: weekly, monthly, quarterly, and yearly. Each level pulls from the one below it -- weekly notes pull from daily logs, monthly from weekly, and so on.

# Show/create current period notes
claude-memory weekly                      # This week's note
claude-memory monthly                     # This month's note
claude-memory quarterly                   # This quarter's note
claude-memory yearly                      # This year's note

# Synthesize from lower-level notes
claude-memory weekly --synthesize         # Pull daily logs into weekly template
claude-memory monthly --synthesize        # Pull weekly notes into monthly template
claude-memory quarterly --synthesize      # Pull monthly notes into quarterly template
claude-memory yearly --synthesize         # Pull quarterly notes into yearly template

# Append notes to a period
claude-memory weekly --append "Shipped the new auth flow"
claude-memory monthly --append "Promoted to senior engineer"

# List all notes for a period
claude-memory weekly --list
claude-memory monthly --list

# Load current week + month context (for session boot)
claude-memory periodic --load

# Show all current period notes
claude-memory periodic --all

The synthesis workflow: run --synthesize to collect source material into the template, then edit the template sections to create a proper summary. The raw source material is appended below a separator for reference.

Wiring Into Claude Code

Add this to your project's CLAUDE.md (or ~/.claude/CLAUDE.md for global use):

# Memory System

On startup, load recent context:
  claude-memory daily --load

Before answering questions about the project, search memory:
  claude-memory search "relevant query" --json

Log significant events:
  claude-memory daily --append "description of what happened"

See examples/claude_code_setup.md for a complete integration guide.

Configuration Reference

All settings live in ~/.claude-memory/config.json. Every value has a sensible default, so you only need to set what you want to change.

  • index_paths (default: []) -- Directories to index: [{"path": "...", "depth": 3}]
  • index_extensions (default: [".md"]) -- File extensions to index
  • skip_dirs (default: ["chromadb", ".git", ...]) -- Directories to skip during indexing
  • chunk_size_words (default: 400) -- Words per chunk (target)
  • chunk_overlap_words (default: 80) -- Overlap between chunks
  • vector_weight (default: 0.70) -- Weight for vector similarity in hybrid score
  • keyword_weight (default: 0.30) -- Weight for keyword match in hybrid score
  • decay_half_life_days (default: 30) -- Days until a file's score is halved by decay
  • evergreen_patterns (default: ["README.md", ...]) -- Path patterns exempt from decay
  • categories (default: ["project", ...]) -- Valid category names
  • category_rules (default: [{"pattern": ..., "category": ...}]) -- Path-to-category mapping rules
  • weekly_dir (default: data_dir/weekly) -- Weekly notes directory
  • monthly_dir (default: data_dir/monthly) -- Monthly notes directory
  • quarterly_dir (default: data_dir/quarterly) -- Quarterly notes directory
  • yearly_dir (default: data_dir/yearly) -- Yearly notes directory
  • collection_name (default: "claude_memory") -- ChromaDB collection name
  • default_n_results (default: 8) -- Default number of search results
  • salience_patterns (default: {"WELCOME.md": 5.0, ...}) -- Path pattern -> salience multiplier
  • retrieval_strengthening.boost_per_access (default: 0.05) -- Score boost per recent search hit
  • retrieval_strengthening.window_days (default: 7) -- Only count accesses within N days
  • retrieval_strengthening.max_boost (default: 0.30) -- Maximum retrieval boost
  • consolidation.bonus_per_level (default: 0.15) -- Bonus per periodic note mention level
  • consolidation.max_bonus (default: 0.50) -- Maximum consolidation bonus

Environment Variables

Any config key can be overridden with an environment variable named CLAUDE_MEMORY_<KEY> (uppercase). The most common overrides are the path settings:

  • CLAUDE_MEMORY_CONFIG -- Path to config file
  • CLAUDE_MEMORY_DATA_DIR -- Data directory
  • CLAUDE_MEMORY_CHROMA_DIR -- ChromaDB directory
  • CLAUDE_MEMORY_DAILY_LOG_DIR -- Daily log directory
  • CLAUDE_MEMORY_WEEKLY_DIR -- Weekly notes directory
  • CLAUDE_MEMORY_MONTHLY_DIR -- Monthly notes directory
  • CLAUDE_MEMORY_QUARTERLY_DIR -- Quarterly notes directory
  • CLAUDE_MEMORY_YEARLY_DIR -- Yearly notes directory
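
For example (POSIX shell), you can preview an index run against a scratch data directory without touching your config file -- /tmp/memory-test is just an example path:

CLAUDE_MEMORY_DATA_DIR=/tmp/memory-test claude-memory index --check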

Python API

from claude_memory import (
    hybrid_search, delta_index, load_recent, get_config,
    init_period, synthesize_period, load_periodic_context,
    biological_score, record_access, show_access_stats,
)

config = get_config()

# Search
results = hybrid_search("deployment process", config=config)
for r in results:
    print(f"{r['score']:.3f} {r['metadata']['filename']}")

# Index
delta_index(config=config)

# Daily logs
context = load_recent(config=config)

# Periodic notes
init_period("weekly", config=config)
synthesize_period("weekly", config=config)
periodic_context = load_periodic_context(config=config)

# Biological model
score = biological_score(0.85, "/path/to/file.md", config)
record_access("/path/to/file.md", config)  # Track retrieval
show_access_stats(config)                   # Display stats

Optional Modules

Three optional modules extend the core memory system. None of them require dependencies beyond what the core already installs.

Compaction Watcher

Claude Code periodically compresses its conversation context when the window fills up. This is the #1 cause of AI amnesia -- your assistant suddenly forgets everything you discussed. The compaction watcher monitors for these events so you can track when they happen and correlate them with lost context.

claude-memory compaction --watch          # Monitor in real-time (blocking)
claude-memory compaction --scan           # Scan past conversations for events
claude-memory compaction --stats          # Show compaction statistics
claude-memory compaction --log            # Show the event log

The watcher detects compaction by monitoring Claude Code's JSONL conversation files for significant size drops (>20%). Each event is logged with timestamp, project, and size delta.
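
The check itself is simple; a minimal sketch of the idea (illustrative, not the module's actual code):

def looks_like_compaction(prev_bytes, new_bytes, threshold=0.20):
    # Flag a conversation file whose size dropped by more than 20%.
    if prev_bytes == 0:
        return False
    return (prev_bytes - new_bytes) / prev_bytes > threshold

The module also exposes a Python API: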

from claude_memory.extras.compaction_watcher import (
    watch_compactions, scan_for_compactions, get_compaction_stats,
)

# Scan for past events
events = scan_for_compactions()

# Watch in real-time with a callback
def on_compaction(event):
    print(f"Compaction detected! {event['drop_percent']}% size drop")

watch_compactions(interval=30, callback=on_compaction)

Context Compression Language (CCL)

CCL is a shorthand notation system that compresses natural language instructions into a compact format that LLMs can still interpret. Useful for system prompts, CLAUDE.md files, and any context that gets loaded into every conversation.

Four compression tiers:

  • T0 -- Full English (no compression)
  • T1 -- Light: abbreviations, dropped articles (30-40% savings)
  • T2 -- Medium: symbols, arrows, compressed structure (50-60% savings)
  • T3 -- Heavy: maximum density, telegraphic (65-75% savings)

claude-memory ccl --analyze CLAUDE.md             # Show savings per tier
claude-memory ccl --encode CLAUDE.md --tier 2     # Compress to T2
claude-memory ccl --encode CLAUDE.md -o out.md    # Save compressed output
claude-memory ccl --decode compressed.md          # Expand back to English
claude-memory ccl --dict                          # Show compression dictionary

The key insight: LLMs are trained on enough internet text that they can decode most common abbreviations. "wr->VLT" is unambiguous to Claude when "vault" is defined earlier in the context.

from claude_memory.extras.ccl import encode, decode, analyze

# Compress text
original = "Always write important decisions to the vault immediately."
compressed = encode(original, tier=2)
# Result: "@ wr [!] decisions->VLT NOW."

# Analyze savings
stats = analyze(original, tier=2)
print(f"Token savings: {stats['token_savings_percent']}%")

Vault Sync

Multi-machine memory synchronization via cloud folders (OneDrive, Dropbox, Google Drive, iCloud). Designate a cloud-synced folder as your "vault" -- the shared source of truth between machines.

claude-memory vault --init ~/OneDrive/claude-vault  # Set vault path
claude-memory vault --status                         # Compare local vs vault
claude-memory vault --push                           # Push local to vault
claude-memory vault --pull                           # Pull vault to local
claude-memory vault --sync                           # Bidirectional (newer wins)
claude-memory vault --sync --dry-run                 # Preview without changes

The sync covers daily logs, weekly/monthly/quarterly/yearly notes, and metadata files. Conflict strategy: the newer file wins; a file that exists on only one side is copied to the other.
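
A minimal sketch of the newer-wins rule (illustrative -- the shipped sync handles more cases):

import shutil
from pathlib import Path

def newer_wins(local: Path, vault: Path):
    # copy2 preserves mtimes, which this strategy depends on.
    if local.exists() and not vault.exists():
        shutil.copy2(local, vault)              # new local file: push
    elif vault.exists() and not local.exists():
        shutil.copy2(vault, local)              # new vault file: pull
    elif local.exists() and vault.exists():
        if local.stat().st_mtime > vault.stat().st_mtime:
            shutil.copy2(local, vault)          # local is newer
        elif vault.stat().st_mtime > local.stat().st_mtime:
            shutil.copy2(vault, local)          # vault is newer

The module's Python API: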

from claude_memory.extras.vault_sync import (
    init_vault, sync_to_vault, sync_from_vault,
    bidirectional_sync, get_sync_status, append_to_changelog,
)

# Initialize vault
init_vault("~/OneDrive/claude-vault")

# Check status
status = get_sync_status()
print(f"Files to push: {len(status['local_only'])}")
print(f"Files to pull: {len(status['vault_only'])}")

# Sync
bidirectional_sync()

# Log to vault changelog
append_to_changelog("Shipped new auth flow")

The Backstory: Convergent Evolution

This project started as a personal filing system -- a way to organize notes, research, and project context so that each new Claude Code session could pick up where the last one left off. The architecture was simple: structured markdown files, a vector index for semantic search, and daily logs for session continuity.

Around the same time, the OpenClaw team was building their own memory system for AI agents, eventually formalized in their ACP (Agent Context Protocol) specification. When we discovered their work, the overlap was striking. Both systems independently arrived at the same core ideas:

  • Chunked vector indexing with ~400-token windows and overlap
  • Hybrid search combining embeddings with keyword matching
  • Temporal decay to prioritize recent context
  • Evergreen exemptions for foundational documents
  • Delta sync to avoid re-indexing unchanged files

This kind of convergent evolution is a strong signal. When two independent efforts solve the same problem the same way, the solution is probably close to correct. We weren't copying OpenClaw -- we didn't know about their work until after building the first version. But seeing the convergence gave us confidence to formalize what we had into a reusable tool.

The key insight both systems share: AI assistants need memory that degrades gracefully. Not everything should persist equally. Yesterday's debugging session matters more than last month's. But your architecture decisions? Those should never fade. The combination of temporal decay with evergreen exemptions captures this naturally.

Design Decisions

Why ChromaDB? It's the simplest embedded vector database that actually works. No server to run, no infrastructure to manage. It stores everything in a local directory. For a memory system that needs to run on a developer's laptop alongside their code, embedded is the right call.

Why hybrid search instead of pure vector? Embeddings are great at semantic similarity but sometimes miss exact keyword matches. If you search for "PostgreSQL" and your notes say "PostgreSQL," pure vector search might rank a document about "database systems" higher. The keyword component (BM25-style term frequency with saturation) catches these exact matches and boosts them.
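
A sketch of the blend, using the vector_weight and keyword_weight defaults from the configuration reference (illustrative, not the actual scorer):

def hybrid_score(vector_sim, keyword_score,
                 vector_weight=0.70, keyword_weight=0.30):
    # Weighted sum of the two signals; both inputs normalized to [0, 1].
    return vector_weight * vector_sim + keyword_weight * keyword_score

def saturated_tf(tf, k=1.2):
    # BM25-style saturation: each repeat of a term adds less than the last.
    return tf * (k + 1) / (tf + k)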

Why temporal decay? Without it, your search results are dominated by large, old documents that have accumulated lots of indexed chunks. Decay naturally pushes recent, relevant context to the top while keeping old knowledge accessible if it's specifically queried.

Why delta sync? A full re-index of a large vault can take minutes. With SHA-256 hash comparison, a no-change run completes in under a second. This makes it practical to run the indexer on every boot or even on a file-watch trigger.
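
The mechanism is a stored hash manifest; a minimal sketch (the manifest path and layout here are assumptions):

import hashlib, json
from pathlib import Path

def changed_files(paths, manifest_path="manifest.json"):
    # Yield only files whose SHA-256 differs from the recorded hash.
    manifest = {}
    if Path(manifest_path).exists():
        manifest = json.loads(Path(manifest_path).read_text())
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        if manifest.get(str(p)) != digest:
            yield p, digest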

Why CP1252 safe_print? Windows consoles using CP1252 encoding will crash on Unicode characters that Python's print() tries to output. This is a real problem when indexing markdown files that contain emoji, special characters, or non-Latin text. The safe_print wrapper catches these encoding errors and replaces problematic characters instead of crashing. Every Windows developer has hit this bug; now you don't have to.
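
A sketch of the pattern (the shipped wrapper may differ in detail):

import sys

def safe_print(*args, **kwargs):
    # Fall back to replacing unencodable characters instead of crashing.
    try:
        print(*args, **kwargs)
    except UnicodeEncodeError:
        encoding = sys.stdout.encoding or "cp1252"
        cleaned = [str(a).encode(encoding, errors="replace").decode(encoding)
                   for a in args]
        print(*cleaned, **kwargs)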

Contributing

This is a small, focused tool. If you're using it and find a bug or have an improvement, open an issue or PR. The codebase is intentionally simple -- four Python files, one dependency.

Areas where contributions would be welcome:

  • Support for additional file formats beyond markdown
  • Smarter chunking (heading-aware, code-block-aware)
  • Watch mode for automatic re-indexing on file changes
  • Integration examples for other AI assistants (Cursor, Copilot, etc.)

License

MIT -- do whatever you want with it.

Credits

Built by Sean (Haustorium12) and Claude (Anthropic).

Inspired by the convergent evolution between our filing cabinet architecture and the OpenClaw memory system, formalized in the ACP specification.
