
consolidation-memory


Local-first persistent memory for AI agents. SQLite + FAISS, runs on a laptop, no cloud.

Agents store episodes (conversations, facts, solutions). A background thread periodically clusters related episodes and uses a local LLM to synthesize them into structured knowledge records. Old episodes get pruned. Knowledge compounds over time instead of degrading.

Install

```shell
pip install consolidation-memory[fastembed]
consolidation-memory init
consolidation-memory setup-claude  # Add memory instructions to CLAUDE.md
```

FastEmbed runs locally. No API keys needed. The setup-claude command adds instructions to your ~/.claude/CLAUDE.md so Claude Code proactively uses memory tools.

MCP Server

```json
{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}
```

Tools: memory_store, memory_store_batch, memory_recall, memory_search, memory_status, memory_forget, memory_export, memory_correct, memory_compact, memory_consolidate, memory_browse, memory_read_topic, memory_timeline, memory_decay_report, memory_protect

Python API

```python
from consolidation_memory import MemoryClient

with MemoryClient() as mem:
    mem.store("User prefers dark mode", content_type="preference", tags=["ui"])

    result = mem.recall("user interface preferences")
    for ep in result.episodes:
        print(ep["content"], ep["similarity"])
```

OpenAI Function Calling

Works with any OpenAI-compatible API (LM Studio, Ollama, OpenAI, Azure):

```python
from consolidation_memory import MemoryClient
from consolidation_memory.schemas import openai_tools, dispatch_tool_call

mem = MemoryClient()
# Pass openai_tools to your chat completion, dispatch results with dispatch_tool_call()
```

REST API

```shell
pip install consolidation-memory[rest]
consolidation-memory serve --rest --port 8080
```

POST /memory/store | POST /memory/store/batch | POST /memory/recall | POST /memory/search
POST /memory/consolidate | POST /memory/correct | POST /memory/export | POST /memory/compact
POST /memory/timeline | POST /memory/contradictions | POST /memory/protect
GET /memory/status | GET /memory/browse | GET /memory/topics/(unknown) | GET /memory/decay-report | GET /health
DELETE /memory/episodes/{id}
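A plausible request body for POST /memory/store, assuming the JSON fields mirror the Python API's store() arguments (content, content_type, and tags come from the Python example above; the exact REST field names may differ):

```json
{
  "content": "User prefers dark mode",
  "content_type": "preference",
  "tags": ["ui"]
}
```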

How Consolidation Works

```text
store episodes → SQLite + FAISS
                      ↓
        background thread (every 6h)
                      ↓
     hierarchical clustering by similarity
                      ↓
        LLM synthesizes knowledge records
        (facts, solutions, preferences, procedures)
                      ↓
     records feed back into recall, old episodes pruned
```

Episodes are grouped by semantic similarity using agglomerative clustering. Each cluster is matched against existing knowledge topics. The LLM either creates a new topic or merges into an existing one. Output is validated, versioned, and written as structured records with their own embeddings for independent search.
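The grouping step can be sketched as a greedy, complete-linkage-style pass over cosine similarity. This is an illustration of the idea, not the package's actual agglomerative implementation; the 0.78 threshold matches the documented `cluster_threshold` default:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster(embeddings, threshold=0.78):
    """Assign each embedding to the first cluster whose every member is
    within `threshold` cosine similarity; otherwise start a new cluster."""
    clusters = []  # list of lists of episode indices
    for i, emb in enumerate(embeddings):
        for members in clusters:
            if all(cosine(emb, embeddings[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

episodes = [
    [1.0, 0.0], [0.95, 0.31],  # two similar episodes
    [0.0, 1.0],                # an unrelated one
]
print(cluster(episodes))  # → [[0, 1], [2]]
```

Each resulting cluster would then be handed to the LLM as one synthesis unit.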

Three consecutive LLM failures trip a circuit breaker. Pruned episodes still count toward consolidation history.
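The failure handling can be pictured as a minimal circuit breaker (an illustrative sketch, not the package's internal code): after three consecutive failures the breaker opens and further LLM calls are skipped, while any success resets the count.

```python
class CircuitBreaker:
    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.max_failures

    def call(self, fn, *args):
        if self.open:
            raise RuntimeError("circuit open: skipping LLM call")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1  # consecutive failures accumulate
            raise
        self.failures = 0       # any success resets the count
        return result

breaker = CircuitBreaker()

def flaky():
    raise ValueError("LLM backend unreachable")

for _ in range(3):
    try:
        breaker.call(flaky)
    except ValueError:
        pass

print(breaker.open)  # → True
```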

Backends

Embedding

| Backend | Install | Model | Local |
| --- | --- | --- | --- |
| FastEmbed (default) | `pip install consolidation-memory[fastembed]` | bge-small-en-v1.5 | Yes |
| LM Studio | Built-in | nomic-embed-text-v1.5 | Yes |
| Ollama | Built-in | nomic-embed-text | Yes |
| OpenAI | `pip install consolidation-memory[openai]` | text-embedding-3-small | No |

LLM (for consolidation)

| Backend | Requirements |
| --- | --- |
| LM Studio (default) | LM Studio running with any chat model |
| Ollama | Ollama running with any chat model |
| OpenAI | API key |
| Disabled | None: no consolidation, pure vector search |

Configuration

```shell
consolidation-memory init
```

Manual config:

| Platform | Path |
| --- | --- |
| Linux/macOS | `~/.config/consolidation_memory/config.toml` |
| Windows | `%APPDATA%\consolidation_memory\config.toml` |
| Override | `CONSOLIDATION_MEMORY_CONFIG` env var |

```toml
[embedding]
backend = "fastembed"

[llm]
backend = "lmstudio"
api_base = "http://localhost:1234/v1"
model = "qwen2.5-7b-instruct"

[consolidation]
auto_run = true
interval_hours = 6
cluster_threshold = 0.72  # default: 0.78
prune_enabled = true
prune_after_days = 60  # default: 30
```
Environment variable overrides

Every setting can be overridden with CONSOLIDATION_MEMORY_<SECTION>_<FIELD>:

```shell
CONSOLIDATION_MEMORY_EMBEDDING_BACKEND=lmstudio
CONSOLIDATION_MEMORY_EMBEDDING_DIMENSION=768
CONSOLIDATION_MEMORY_LLM_BACKEND=openai
CONSOLIDATION_MEMORY_LLM_API_KEY=sk-...
CONSOLIDATION_MEMORY_CONSOLIDATION_INTERVAL_HOURS=12
CONSOLIDATION_MEMORY_CONSOLIDATION_AUTO_RUN=false
```

Priority: defaults < TOML < env vars < reset_config() (tests).
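The documented precedence (defaults < TOML < env vars) amounts to a layered merge. The sketch below illustrates that ordering; the type-coercion details are assumptions for the example, not the package's actual loader:

```python
DEFAULTS = {"consolidation": {"interval_hours": 6, "auto_run": True}}

def load_settings(toml_cfg, environ):
    cfg = {s: dict(v) for s, v in DEFAULTS.items()}
    for section, fields in toml_cfg.items():   # TOML overrides defaults
        cfg.setdefault(section, {}).update(fields)
    prefix = "CONSOLIDATION_MEMORY_"
    for key, raw in environ.items():           # env vars override TOML
        if not key.startswith(prefix):
            continue
        section, _, field = key[len(prefix):].lower().partition("_")
        if section in cfg and field in cfg[section]:
            old = cfg[section][field]
            if isinstance(old, bool):
                cfg[section][field] = raw.lower() in ("1", "true", "yes")
            else:
                cfg[section][field] = type(old)(raw)
    return cfg

cfg = load_settings(
    {"consolidation": {"interval_hours": 12}},
    {"CONSOLIDATION_MEMORY_CONSOLIDATION_AUTO_RUN": "false"},
)
print(cfg["consolidation"])  # → {'interval_hours': 12, 'auto_run': False}
```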

CLI

| Command | Description |
| --- | --- |
| `consolidation-memory serve` | Start MCP server (default) |
| `consolidation-memory serve --rest` | Start REST API |
| `consolidation-memory --project work serve` | MCP server for a specific project |
| `consolidation-memory init` | Interactive setup |
| `consolidation-memory status` | Show stats |
| `consolidation-memory consolidate` | Manual consolidation |
| `consolidation-memory export` | Export to JSON |
| `consolidation-memory import PATH` | Import from JSON |
| `consolidation-memory reindex` | Re-embed everything (after switching backends) |
| `consolidation-memory browse` | Browse knowledge topics |
| `consolidation-memory setup-claude` | Add memory instructions to CLAUDE.md |
| `consolidation-memory test` | Post-install verification |
| `consolidation-memory dashboard` | TUI dashboard |

Multi-Project

Isolate memories per project:

```shell
consolidation-memory --project work status
CONSOLIDATION_MEMORY_PROJECT=work consolidation-memory serve
```

MCP config for multiple projects:

```json
{
  "mcpServers": {
    "memory-work": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "work" }
    },
    "memory-personal": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "personal" }
    }
  }
}
```

Each project gets its own database, vector index, and knowledge files.

Cross-Client Memory

One consolidation-memory instance serves every MCP client on your machine. Claude Code, Cursor, Windsurf, VS Code + Continue — all share the same SQLite database and FAISS index. A fact stored from Cursor is recalled in Claude Code. No cloud sync needed.

This is the local-first alternative to cloud-based memory passports. Your data never leaves your machine.

Example configs for each client

Claude Code (claude_desktop_config.json):

```json
{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}
```

Cursor (.cursor/mcp.json):

```json
{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}
```

VS Code + Continue (.continue/config.json):

```json
{
  "mcpServers": [
    {
      "name": "consolidation_memory",
      "command": "consolidation-memory"
    }
  ]
}
```

Generic MCP client (any client supporting stdio transport):

```json
{
  "command": "consolidation-memory",
  "transport": "stdio"
}
```

All configs above point at the default data directory. To share memories across clients with a specific project:

```json
{
  "command": "consolidation-memory",
  "env": { "CONSOLIDATION_MEMORY_PROJECT": "my-project" }
}
```

Every client using the same project name reads and writes to the same database.

Data Storage

All data stays local.

| Platform | Path |
| --- | --- |
| Linux | `~/.local/share/consolidation_memory/projects/<name>/` |
| macOS | `~/Library/Application Support/consolidation_memory/projects/<name>/` |
| Windows | `%LOCALAPPDATA%\consolidation_memory\projects\<name>\` |
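Per-project path resolution on Linux can be sketched as below (a hypothetical helper based on the table above, not the package's own path logic; the `"default"` project name is an assumption):

```python
import os
from pathlib import Path

def project_dir(project: str = "default") -> Path:
    """Resolve the Linux data directory for a given project name,
    honoring XDG_DATA_HOME when set."""
    base = Path(os.environ.get("XDG_DATA_HOME", Path.home() / ".local/share"))
    return base / "consolidation_memory" / "projects" / project

p = project_dir("work")
print(p)  # e.g. /home/you/.local/share/consolidation_memory/projects/work
```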

After switching embedding backends, run consolidation-memory reindex to re-embed stored data.

Roadmap

  • Hybrid search (BM25 + semantic fusion)
  • Diff-aware merge validation for consolidation
  • Query expansion for short/ambiguous recalls
  • Recall result deduplication
  • Entity extraction and relationship graph
  • Entity-aware recall boosting
  • First-party plugins (git history, project context, Obsidian export)

Development

```shell
git clone https://github.com/charliee1w/consolidation-memory
cd consolidation-memory
pip install -e ".[all,dev]"
pytest tests/ -v
ruff check src/ tests/
```

License

MIT
