
Local-first persistent memory for AI agents — store, recall, and consolidate knowledge across sessions using FAISS, SQLite, and any LLM

consolidation-memory

Local-first persistent memory for AI agents. SQLite + FAISS, runs on a laptop, no cloud.

Agents store episodes (conversations, facts, solutions). A background thread periodically clusters related episodes and uses a local LLM to synthesize them into structured knowledge records. Old episodes get pruned. Knowledge compounds over time instead of degrading.

Install

pip install consolidation-memory[fastembed]
consolidation-memory init
consolidation-memory setup-claude  # Add memory instructions to CLAUDE.md

FastEmbed runs locally. No API keys needed. The setup-claude command adds instructions to your ~/.claude/CLAUDE.md so Claude Code proactively uses memory tools.

MCP Server

{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}

Tools: memory_store, memory_store_batch, memory_recall, memory_search, memory_status, memory_forget, memory_export, memory_correct, memory_compact, memory_consolidate, memory_browse, memory_read_topic, memory_timeline, memory_decay_report, memory_protect

Python API

from consolidation_memory import MemoryClient

with MemoryClient() as mem:
    mem.store("User prefers dark mode", content_type="preference", tags=["ui"])

    result = mem.recall("user interface preferences")
    for ep in result.episodes:
        print(ep["content"], ep["similarity"])

OpenAI Function Calling

Works with any OpenAI-compatible API (LM Studio, Ollama, OpenAI, Azure):

from consolidation_memory import MemoryClient
from consolidation_memory.schemas import openai_tools, dispatch_tool_call

mem = MemoryClient()
# Pass openai_tools to your chat completion, dispatch results with dispatch_tool_call()
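The round trip looks roughly like the following sketch. `run_tool_calls` and the `handlers` dict are illustrative stand-ins for the library's `dispatch_tool_call()`, showing the shape of the loop without a live chat completion; the fake `calls` fragment mimics an OpenAI-style tool-call response.

```python
import json

def run_tool_calls(tool_calls, handlers):
    """Route OpenAI-style tool calls to handlers, return tool messages."""
    messages = []
    for call in tool_calls:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])
        result = handlers[name](**args)  # e.g. would call mem.store(...)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages

# Fake fragment shaped like a chat-completion tool call:
calls = [{
    "id": "call_1",
    "function": {"name": "memory_store",
                 "arguments": json.dumps({"content": "prefers dark mode"})},
}]
out = run_tool_calls(calls, {"memory_store": lambda content: {"stored": content}})
```

The resulting `role: "tool"` messages are appended to the conversation and sent back to the model on the next turn.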

REST API

pip install consolidation-memory[rest]
consolidation-memory serve --rest --port 8080

POST /memory/store
POST /memory/store/batch
POST /memory/recall
POST /memory/search
GET /memory/status
DELETE /memory/episodes/{id}
POST /memory/consolidate
POST /memory/correct
POST /memory/export
GET /health
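A store request can be built with the standard library as below. The JSON body fields (`content`, `content_type`, `tags`) are an assumption mirroring the Python API's `store()` arguments, not confirmed against the REST schema; the request is only constructed here, since sending it needs the server running.

```python
import json
from urllib import request

BASE = "http://localhost:8080"

def build_store_request(content, content_type="fact", tags=None):
    """Construct (but don't send) a POST /memory/store request."""
    body = json.dumps({"content": content,
                       "content_type": content_type,
                       "tags": tags or []}).encode()
    return request.Request(f"{BASE}/memory/store", data=body,
                           headers={"Content-Type": "application/json"},
                           method="POST")

req = build_store_request("User prefers dark mode", "preference", ["ui"])
# request.urlopen(req) would send it once `serve --rest` is up
```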

How Consolidation Works

store episodes → SQLite + FAISS
                      ↓
        background thread (every 6h)
                      ↓
     hierarchical clustering by similarity
                      ↓
        LLM synthesizes knowledge records
        (facts, solutions, preferences, procedures)
                      ↓
     records feed back into recall, old episodes pruned

Episodes are grouped by semantic similarity using agglomerative clustering. Each cluster is matched against existing knowledge topics. The LLM either creates a new topic or merges into an existing one. Output is validated, versioned, and written as structured records with their own embeddings for independent search.

Three consecutive LLM failures trip a circuit breaker. Pruned episodes still count toward consolidation history.
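As a rough illustration of the grouping step, similarity-threshold clustering can be sketched like this. It is a toy greedy single-linkage version over cosine similarity; the library's actual agglomerative implementation and linkage choice may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster(embeddings, threshold=0.78):
    """Merge each episode into the first cluster containing a
    sufficiently similar member; otherwise start a new cluster."""
    clusters = []
    for i, emb in enumerate(embeddings):
        for members in clusters:
            if any(cosine(emb, embeddings[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Two near-duplicate vectors and one orthogonal one:
vecs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
groups = cluster(vecs)  # → [[0, 1], [2]]
```

Each resulting group is what gets handed to the LLM for synthesis into a knowledge record.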

Backends

Embedding

Backend Install Model Local
FastEmbed (default) pip install consolidation-memory[fastembed] bge-small-en-v1.5 Y
LM Studio Built-in nomic-embed-text-v1.5 Y
Ollama Built-in nomic-embed-text Y
OpenAI pip install consolidation-memory[openai] text-embedding-3-small N

LLM (for consolidation)

Backend Requirements
LM Studio (default) LM Studio running with any chat model
Ollama Ollama running with any chat model
OpenAI API key
Disabled None — no consolidation, pure vector search

Configuration

consolidation-memory init

Manual config:

Platform Path
Linux/macOS ~/.config/consolidation_memory/config.toml
Windows %APPDATA%\consolidation_memory\config.toml
Override CONSOLIDATION_MEMORY_CONFIG env var

[embedding]
backend = "fastembed"

[llm]
backend = "lmstudio"
api_base = "http://localhost:1234/v1"
model = "qwen2.5-7b-instruct"

[consolidation]
auto_run = true
interval_hours = 6
cluster_threshold = 0.72  # default: 0.78
prune_enabled = true
prune_after_days = 60  # default: 30

Environment variable overrides

Every setting can be overridden with a CONSOLIDATION_MEMORY_<SECTION>_<FIELD> environment variable:

CONSOLIDATION_MEMORY_EMBEDDING_BACKEND=lmstudio
CONSOLIDATION_MEMORY_EMBEDDING_DIMENSION=768
CONSOLIDATION_MEMORY_LLM_BACKEND=openai
CONSOLIDATION_MEMORY_LLM_API_KEY=sk-...
CONSOLIDATION_MEMORY_CONSOLIDATION_INTERVAL_HOURS=12
CONSOLIDATION_MEMORY_CONSOLIDATION_AUTO_RUN=false

Priority: defaults < TOML < env vars < reset_config() (tests).
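That precedence chain behaves like a layered dict merge, with later layers winning. The sketch below is illustrative, not the library's actual loader; the section/field names are taken from the examples above.

```python
def effective_config(defaults, toml_cfg, env, prefix="CONSOLIDATION_MEMORY_"):
    """Merge layers so that defaults < TOML < environment variables.
    An env key like CONSOLIDATION_MEMORY_LLM_BACKEND maps to ("llm", "backend")."""
    cfg = {sec: dict(fields) for sec, fields in defaults.items()}
    for sec, fields in toml_cfg.items():
        cfg.setdefault(sec, {}).update(fields)
    for key, val in env.items():
        if not key.startswith(prefix):
            continue
        sec, _, field = key[len(prefix):].lower().partition("_")
        if sec in cfg:
            cfg[sec][field] = val
    return cfg

defaults = {"llm": {"backend": "lmstudio"}, "consolidation": {"interval_hours": 6}}
toml_cfg = {"llm": {"backend": "ollama"}}
env = {"CONSOLIDATION_MEMORY_LLM_BACKEND": "openai"}
cfg = effective_config(defaults, toml_cfg, env)
# cfg["llm"]["backend"] == "openai": the env var beats both TOML and default
```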

CLI

Command Description
consolidation-memory serve Start MCP server (default)
consolidation-memory serve --rest Start REST API
consolidation-memory --project work serve MCP server for a specific project
consolidation-memory init Interactive setup
consolidation-memory status Show stats
consolidation-memory consolidate Manual consolidation
consolidation-memory export Export to JSON
consolidation-memory import PATH Import from JSON
consolidation-memory reindex Re-embed everything (after switching backends)
consolidation-memory browse Browse knowledge topics
consolidation-memory setup-claude Add memory instructions to CLAUDE.md
consolidation-memory test Post-install verification
consolidation-memory dashboard TUI dashboard

Multi-Project

Isolate memories per project:

consolidation-memory --project work status
CONSOLIDATION_MEMORY_PROJECT=work consolidation-memory serve

MCP config for multiple projects:

{
  "mcpServers": {
    "memory-work": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "work" }
    },
    "memory-personal": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "personal" }
    }
  }
}

Each project gets its own database, vector index, and knowledge files.

Cross-Client Memory

One consolidation-memory instance serves every MCP client on your machine. Claude Code, Cursor, Windsurf, VS Code + Continue — all share the same SQLite database and FAISS index. A fact stored from Cursor is recalled in Claude Code. No cloud sync needed.

This is the local-first alternative to cloud-based memory passports. Your data never leaves your machine.

Example configs for each client

Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}

Cursor (.cursor/mcp.json):

{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}

VS Code + Continue (.continue/config.json):

{
  "mcpServers": [
    {
      "name": "consolidation_memory",
      "command": "consolidation-memory"
    }
  ]
}

Generic MCP client (any client supporting stdio transport):

{
  "command": "consolidation-memory",
  "transport": "stdio"
}

All configs above point at the default data directory. To share memories across clients with a specific project:

{
  "command": "consolidation-memory",
  "env": { "CONSOLIDATION_MEMORY_PROJECT": "my-project" }
}

Every client using the same project name reads and writes to the same database.

Data Storage

All data stays local.

Platform Path
Linux ~/.local/share/consolidation_memory/projects/<name>/
macOS ~/Library/Application Support/consolidation_memory/projects/<name>/
Windows %LOCALAPPDATA%\consolidation_memory\projects\<name>\

Switching embedding backends? consolidation-memory reindex

Development

git clone https://github.com/charliee1w/consolidation-memory
cd consolidation-memory
pip install -e ".[all,dev]"
pytest tests/ -v
ruff check src/ tests/

License

MIT
