
consolidation-memory


Local-first persistent memory for AI agents. SQLite + FAISS, runs on a laptop, no cloud.

Agents store episodes (conversations, facts, solutions). A background thread periodically clusters related episodes and uses a local LLM to synthesize them into structured knowledge records. Old episodes get pruned. Knowledge compounds over time instead of degrading.

Install

pip install consolidation-memory[fastembed]
consolidation-memory init
consolidation-memory setup-claude  # Add memory instructions to CLAUDE.md

FastEmbed runs locally. No API keys needed. The setup-claude command adds instructions to your ~/.claude/CLAUDE.md so Claude Code proactively uses memory tools.

MCP Server

{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}

Tools: memory_store, memory_store_batch, memory_recall, memory_search, memory_status, memory_forget, memory_export, memory_correct, memory_compact, memory_consolidate, memory_browse, memory_read_topic, memory_timeline, memory_decay_report, memory_protect

Python API

from consolidation_memory import MemoryClient

with MemoryClient() as mem:
    mem.store("User prefers dark mode", content_type="preference", tags=["ui"])

    result = mem.recall("user interface preferences")
    for ep in result.episodes:
        print(ep["content"], ep["similarity"])

OpenAI Function Calling

Works with any OpenAI-compatible API (LM Studio, Ollama, OpenAI, Azure):

from consolidation_memory import MemoryClient
from consolidation_memory.schemas import openai_tools, dispatch_tool_call

mem = MemoryClient()
# Pass openai_tools to your chat completion, dispatch results with dispatch_tool_call()

REST API

pip install consolidation-memory[rest]
consolidation-memory serve --rest --port 8080

  • POST /memory/store
  • POST /memory/store/batch
  • POST /memory/recall
  • POST /memory/search
  • GET /memory/status
  • DELETE /memory/episodes/{id}
  • POST /memory/consolidate
  • POST /memory/correct
  • POST /memory/export
  • POST /memory/compact
  • GET /memory/browse
  • GET /memory/topics/(unknown)
  • POST /memory/timeline
  • POST /memory/contradictions
  • POST /memory/protect
  • GET /memory/decay-report
  • GET /health

How Consolidation Works

store episodes → SQLite + FAISS
                      ↓
        background thread (every 6h)
                      ↓
     hierarchical clustering by similarity
                      ↓
        LLM synthesizes knowledge records
        (facts, solutions, preferences, procedures)
                      ↓
     records feed back into recall, old episodes pruned

Episodes are grouped by semantic similarity using agglomerative clustering. Each cluster is matched against existing knowledge topics. The LLM either creates a new topic or merges into an existing one. Output is validated, versioned, and written as structured records with their own embeddings for independent search.
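The grouping step can be pictured as threshold clustering over cosine similarity. This is a minimal, greedy single-linkage sketch, not the library's implementation (real agglomerative clustering merges hierarchically); the 0.78 threshold mirrors the config default shown below:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_episodes(embeddings, threshold=0.78):
    """Greedy single-linkage clustering: an episode joins the first
    cluster containing a sufficiently similar member, else starts a new one."""
    clusters = []  # each cluster is a list of episode indices
    for i, emb in enumerate(embeddings):
        for cluster in clusters:
            if any(cosine(emb, embeddings[j]) >= threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

Each resulting cluster is what gets handed to the LLM for synthesis into a knowledge record.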

Three consecutive LLM failures trip a circuit breaker. Pruned episodes still count toward consolidation history.
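The failure handling amounts to a counter-based circuit breaker. A minimal sketch (names and structure here are assumptions for illustration, not the library's internals):

```python
class ConsolidationBreaker:
    """Opens after `max_failures` consecutive LLM errors; a success resets it."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.max_failures

    def record_success(self):
        self.failures = 0

    def record_failure(self):
        self.failures += 1

def consolidate_cluster(llm_call, breaker):
    """Run one synthesis call unless the breaker has tripped."""
    if breaker.open:
        return None  # skip LLM synthesis until the breaker is reset
    try:
        result = llm_call()
        breaker.record_success()
        return result
    except Exception:
        breaker.record_failure()
        return None
```

Skipped clusters are simply retried on a later consolidation pass once the breaker resets.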

Backends

Embedding

| Backend | Install | Model | Local |
|---|---|---|---|
| FastEmbed (default) | pip install consolidation-memory[fastembed] | bge-small-en-v1.5 | Y |
| LM Studio | Built-in | nomic-embed-text-v1.5 | Y |
| Ollama | Built-in | nomic-embed-text | Y |
| OpenAI | pip install consolidation-memory[openai] | text-embedding-3-small | N |

LLM (for consolidation)

| Backend | Requirements |
|---|---|
| LM Studio (default) | LM Studio running with any chat model |
| Ollama | Ollama running with any chat model |
| OpenAI | API key |
| Disabled | None (no consolidation, pure vector search) |

Configuration

Run consolidation-memory init for interactive setup, or edit the config file manually:

| Platform | Path |
|---|---|
| Linux/macOS | ~/.config/consolidation_memory/config.toml |
| Windows | %APPDATA%\consolidation_memory\config.toml |
| Override | CONSOLIDATION_MEMORY_CONFIG env var |

Example config.toml:
[embedding]
backend = "fastembed"

[llm]
backend = "lmstudio"
api_base = "http://localhost:1234/v1"
model = "qwen2.5-7b-instruct"

[consolidation]
auto_run = true
interval_hours = 6
cluster_threshold = 0.72  # default: 0.78
prune_enabled = true
prune_after_days = 60  # default: 30
Environment variable overrides

Every setting can be overridden with CONSOLIDATION_MEMORY_<FIELD_NAME>:

CONSOLIDATION_MEMORY_EMBEDDING_BACKEND=lmstudio
CONSOLIDATION_MEMORY_EMBEDDING_DIMENSION=768
CONSOLIDATION_MEMORY_LLM_BACKEND=openai
CONSOLIDATION_MEMORY_LLM_API_KEY=sk-...
CONSOLIDATION_MEMORY_CONSOLIDATION_INTERVAL_HOURS=12
CONSOLIDATION_MEMORY_CONSOLIDATION_AUTO_RUN=false

Priority: defaults < TOML < env vars < reset_config() (tests).
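That precedence order can be illustrated with a small resolver (a sketch under assumed field names; the real loader applies the same layering with full validation):

```python
import os

# Assumed defaults for illustration, mirroring the config example above.
DEFAULTS = {"consolidation.interval_hours": 6, "embedding.backend": "fastembed"}

def resolve(key, toml_values):
    """Layered lookup: defaults < TOML values < environment variables."""
    value = DEFAULTS.get(key)
    if key in toml_values:
        value = toml_values[key]
    # e.g. "consolidation.interval_hours" -> CONSOLIDATION_MEMORY_CONSOLIDATION_INTERVAL_HOURS
    env_key = "CONSOLIDATION_MEMORY_" + key.replace(".", "_").upper()
    if env_key in os.environ:
        value = type(value)(os.environ[env_key])  # coerce to the default's type
    return value
```

An env var always wins over the TOML file, which in turn wins over the built-in default.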

CLI

| Command | Description |
|---|---|
| consolidation-memory serve | Start MCP server (default) |
| consolidation-memory serve --rest | Start REST API |
| consolidation-memory --project work serve | MCP server for a specific project |
| consolidation-memory init | Interactive setup |
| consolidation-memory status | Show stats |
| consolidation-memory consolidate | Manual consolidation |
| consolidation-memory export | Export to JSON |
| consolidation-memory import PATH | Import from JSON |
| consolidation-memory reindex | Re-embed everything (after switching backends) |
| consolidation-memory browse | Browse knowledge topics |
| consolidation-memory setup-claude | Add memory instructions to CLAUDE.md |
| consolidation-memory test | Post-install verification |
| consolidation-memory dashboard | TUI dashboard |

Multi-Project

Isolate memories per project:

consolidation-memory --project work status
CONSOLIDATION_MEMORY_PROJECT=work consolidation-memory serve

MCP config for multiple projects:

{
  "mcpServers": {
    "memory-work": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "work" }
    },
    "memory-personal": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "personal" }
    }
  }
}

Each project gets its own database, vector index, and knowledge files.

Cross-Client Memory

One consolidation-memory instance serves every MCP client on your machine. Claude Code, Cursor, Windsurf, VS Code + Continue — all share the same SQLite database and FAISS index. A fact stored from Cursor is recalled in Claude Code. No cloud sync needed.

This is the local-first alternative to cloud-based memory passports. Your data never leaves your machine.

Example configs for each client

Claude Code (claude_desktop_config.json):

{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}

Cursor (.cursor/mcp.json):

{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}

VS Code + Continue (.continue/config.json):

{
  "mcpServers": [
    {
      "name": "consolidation_memory",
      "command": "consolidation-memory"
    }
  ]
}

Generic MCP client (any client supporting stdio transport):

{
  "command": "consolidation-memory",
  "transport": "stdio"
}

All configs above point at the default data directory. To share memories across clients with a specific project:

{
  "command": "consolidation-memory",
  "env": { "CONSOLIDATION_MEMORY_PROJECT": "my-project" }
}

Every client using the same project name reads and writes to the same database.

Data Storage

All data stays local.

| Platform | Path |
|---|---|
| Linux | ~/.local/share/consolidation_memory/projects/&lt;name&gt;/ |
| macOS | ~/Library/Application Support/consolidation_memory/projects/&lt;name&gt;/ |
| Windows | %LOCALAPPDATA%\consolidation_memory\projects\&lt;name&gt; |

Switching embedding backends? consolidation-memory reindex
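Resolving a project's data directory from the table above can be sketched like this (paths assumed from the table; the library may compute them differently, e.g. via a platform-dirs helper):

```python
import os
import sys
from pathlib import Path

def project_dir(project="default"):
    """Per-project data directory, following the platform paths documented above."""
    if sys.platform == "win32":
        base = Path(os.environ["LOCALAPPDATA"])
    elif sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:  # Linux and other POSIX
        base = Path.home() / ".local" / "share"
    return base / "consolidation_memory" / "projects" / project
```

Each project name maps to its own subdirectory holding the SQLite database, FAISS index, and knowledge files.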

Roadmap

  • Hybrid search (BM25 + semantic fusion)
  • Diff-aware merge validation for consolidation
  • Query expansion for short/ambiguous recalls
  • Recall result deduplication
  • Entity extraction and relationship graph
  • Entity-aware recall boosting
  • First-party plugins (git history, project context, Obsidian export)

Development

git clone https://github.com/charliee1w/consolidation-memory
cd consolidation-memory
pip install -e ".[all,dev]"
pytest tests/ -v
ruff check src/ tests/

License

MIT
