# consolidation-memory

Local-first persistent memory for AI agents: store, recall, and consolidate knowledge across sessions using FAISS, SQLite, and any LLM. Runs on a laptop, no cloud.
Agents store episodes (conversations, facts, solutions). A background thread periodically clusters related episodes and uses a local LLM to synthesize them into structured knowledge records. Old episodes get pruned. Knowledge compounds over time instead of degrading.
## Install

```shell
pip install consolidation-memory[fastembed]
consolidation-memory init
consolidation-memory setup-claude   # add memory instructions to CLAUDE.md
```
FastEmbed runs locally; no API keys needed. The `setup-claude` command adds instructions to `~/.claude/CLAUDE.md` so Claude Code proactively uses the memory tools.
## MCP Server

```json
{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}
```
Tools: `memory_store`, `memory_store_batch`, `memory_recall`, `memory_search`, `memory_status`, `memory_forget`, `memory_export`, `memory_correct`, `memory_compact`, `memory_consolidate`, `memory_browse`, `memory_read_topic`, `memory_timeline`, `memory_decay_report`, `memory_protect`
## Python API

```python
from consolidation_memory import MemoryClient

with MemoryClient() as mem:
    mem.store("User prefers dark mode", content_type="preference", tags=["ui"])
    result = mem.recall("user interface preferences")
    for ep in result.episodes:
        print(ep["content"], ep["similarity"])
```
## OpenAI Function Calling

Works with any OpenAI-compatible API (LM Studio, Ollama, OpenAI, Azure):

```python
from consolidation_memory import MemoryClient
from consolidation_memory.schemas import openai_tools, dispatch_tool_call

mem = MemoryClient()
# Pass openai_tools to your chat completion request, then execute the
# returned tool calls with dispatch_tool_call()
```
## REST API

```shell
pip install consolidation-memory[rest]
consolidation-memory serve --rest --port 8080
```

Endpoints:

- POST /memory/store
- POST /memory/store/batch
- POST /memory/recall
- POST /memory/search
- GET /memory/status
- DELETE /memory/episodes/{id}
- POST /memory/consolidate
- POST /memory/correct
- POST /memory/export
- POST /memory/compact
- GET /memory/browse
- GET /memory/topics/(unknown)
- POST /memory/timeline
- POST /memory/contradictions
- POST /memory/protect
- GET /memory/decay-report
- GET /health
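As a hedged sketch of what a client call might look like, the snippet below builds a JSON body for `POST /memory/store`. The field names mirror the Python API's `store()` arguments and are an assumption, not the documented REST schema.

```python
import json

def build_store_body(content, content_type="fact", tags=None):
    # Field names mirror MemoryClient.store(); they are an assumption
    # about the REST schema, for illustration only.
    return json.dumps({
        "content": content,
        "content_type": content_type,
        "tags": tags or [],
    })

body = build_store_body("User prefers dark mode", "preference", ["ui"])
# Send with any HTTP client, e.g.:
#   curl -X POST http://localhost:8080/memory/store \
#        -H "Content-Type: application/json" -d "$body"
```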
## How Consolidation Works

```
store episodes → SQLite + FAISS
        ↓
background thread (every 6h)
        ↓
hierarchical clustering by similarity
        ↓
LLM synthesizes knowledge records
(facts, solutions, preferences, procedures)
        ↓
records feed back into recall, old episodes pruned
```
Episodes are grouped by semantic similarity using agglomerative clustering. Each cluster is matched against existing knowledge topics. The LLM either creates a new topic or merges into an existing one. Output is validated, versioned, and written as structured records with their own embeddings for independent search.
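The grouping step can be sketched as a toy greedy single-linkage pass over episode embeddings. The library itself uses agglomerative clustering; this simplified version only illustrates the similarity-threshold idea (0.72 matches the example `cluster_threshold` shown in the configuration section below).

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def cluster(embeddings, threshold=0.72):
    # Greedy single-linkage: an episode joins the first cluster that
    # already contains a member within `threshold` similarity,
    # otherwise it starts a new cluster. A sketch, not the real pass.
    clusters = []
    for i, emb in enumerate(embeddings):
        for members in clusters:
            if any(cosine(emb, embeddings[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(cluster(vecs))  # → [[0, 1], [2]]
```

Each resulting cluster is what gets handed to the LLM for synthesis into a knowledge record.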
Three consecutive LLM failures trip a circuit breaker. Pruned episodes still count toward consolidation history.
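The failure handling can be sketched as a minimal counter-based breaker. The class name and the reset-on-success rule are assumptions for illustration; only the "three consecutive failures" threshold comes from the text above.

```python
class ConsolidationBreaker:
    # Minimal sketch: three consecutive LLM failures open the breaker.
    # A success resets the count (assumed behavior, not documented).
    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.consecutive_failures = 0

    @property
    def open(self):
        return self.consecutive_failures >= self.max_failures

    def record(self, success):
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

breaker = ConsolidationBreaker()
for ok in (False, False, False):
    breaker.record(ok)
print(breaker.open)  # → True
```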
## Backends

### Embedding

| Backend | Install | Model | Local |
|---|---|---|---|
| FastEmbed (default) | `pip install consolidation-memory[fastembed]` | bge-small-en-v1.5 | Y |
| LM Studio | Built-in | nomic-embed-text-v1.5 | Y |
| Ollama | Built-in | nomic-embed-text | Y |
| OpenAI | `pip install consolidation-memory[openai]` | text-embedding-3-small | N |
### LLM (for consolidation)
| Backend | Requirements |
|---|---|
| LM Studio (default) | LM Studio running with any chat model |
| Ollama | Ollama running with any chat model |
| OpenAI | API key |
| Disabled | None — no consolidation, pure vector search |
## Configuration

```shell
consolidation-memory init
```
### Manual config
| Platform | Path |
|---|---|
| Linux/macOS | ~/.config/consolidation_memory/config.toml |
| Windows | %APPDATA%\consolidation_memory\config.toml |
| Override | CONSOLIDATION_MEMORY_CONFIG env var |
```toml
[embedding]
backend = "fastembed"

[llm]
backend = "lmstudio"
api_base = "http://localhost:1234/v1"
model = "qwen2.5-7b-instruct"

[consolidation]
auto_run = true
interval_hours = 6
cluster_threshold = 0.72   # default: 0.78
prune_enabled = true
prune_after_days = 60      # default: 30
```
### Environment variable overrides

Every setting can be overridden with `CONSOLIDATION_MEMORY_<FIELD_NAME>`:

```shell
CONSOLIDATION_MEMORY_EMBEDDING_BACKEND=lmstudio
CONSOLIDATION_MEMORY_EMBEDDING_DIMENSION=768
CONSOLIDATION_MEMORY_LLM_BACKEND=openai
CONSOLIDATION_MEMORY_LLM_API_KEY=sk-...
CONSOLIDATION_MEMORY_CONSOLIDATION_INTERVAL_HOURS=12
CONSOLIDATION_MEMORY_CONSOLIDATION_AUTO_RUN=false
```

Priority (lowest to highest): defaults < TOML < env vars < `reset_config()` (tests).
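That layering can be sketched as a dict merge. The env-name-to-key mapping below (first underscore separates section from field) is an assumption for illustration, and a real loader would also coerce env strings to each field's type.

```python
DEFAULTS = {
    "embedding.backend": "fastembed",
    "consolidation.interval_hours": 6,
}

def load_config(toml_values, environ):
    # Layer settings lowest to highest: defaults < TOML < env vars.
    cfg = dict(DEFAULTS)
    cfg.update(toml_values)
    prefix = "CONSOLIDATION_MEMORY_"
    for name, value in environ.items():
        if name.startswith(prefix):
            # e.g. CONSOLIDATION_MEMORY_EMBEDDING_BACKEND -> embedding.backend
            # (mapping is assumed for this sketch)
            section, _, field = name[len(prefix):].lower().partition("_")
            cfg[f"{section}.{field}"] = value
    return cfg

cfg = load_config(
    {"consolidation.interval_hours": 12},
    {"CONSOLIDATION_MEMORY_EMBEDDING_BACKEND": "lmstudio"},
)
print(cfg["embedding.backend"], cfg["consolidation.interval_hours"])  # → lmstudio 12
```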
## CLI

| Command | Description |
|---|---|
| `consolidation-memory serve` | Start MCP server (default) |
| `consolidation-memory serve --rest` | Start REST API |
| `consolidation-memory --project work serve` | MCP server for a specific project |
| `consolidation-memory init` | Interactive setup |
| `consolidation-memory status` | Show stats |
| `consolidation-memory consolidate` | Manual consolidation |
| `consolidation-memory export` | Export to JSON |
| `consolidation-memory import PATH` | Import from JSON |
| `consolidation-memory reindex` | Re-embed everything (after switching backends) |
| `consolidation-memory browse` | Browse knowledge topics |
| `consolidation-memory setup-claude` | Add memory instructions to CLAUDE.md |
| `consolidation-memory test` | Post-install verification |
| `consolidation-memory dashboard` | TUI dashboard |
## Multi-Project

Isolate memories per project:

```shell
consolidation-memory --project work status
CONSOLIDATION_MEMORY_PROJECT=work consolidation-memory serve
```
MCP config for multiple projects:

```json
{
  "mcpServers": {
    "memory-work": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "work" }
    },
    "memory-personal": {
      "command": "consolidation-memory",
      "env": { "CONSOLIDATION_MEMORY_PROJECT": "personal" }
    }
  }
}
```
Each project gets its own database, vector index, and knowledge files.
## Cross-Client Memory
One consolidation-memory instance serves every MCP client on your machine. Claude Code, Cursor, Windsurf, VS Code + Continue — all share the same SQLite database and FAISS index. A fact stored from Cursor is recalled in Claude Code. No cloud sync needed.
This is the local-first alternative to cloud-based memory passports. Your data never leaves your machine.
### Example configs for each client

Claude Code (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}
```
Cursor (`.cursor/mcp.json`):

```json
{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}
```
VS Code + Continue (`.continue/config.json`):

```json
{
  "mcpServers": [
    {
      "name": "consolidation_memory",
      "command": "consolidation-memory"
    }
  ]
}
```
Generic MCP client (any client supporting stdio transport):

```json
{
  "command": "consolidation-memory",
  "transport": "stdio"
}
```
All configs above point at the default data directory. To share memories across clients under a specific project:

```json
{
  "command": "consolidation-memory",
  "env": { "CONSOLIDATION_MEMORY_PROJECT": "my-project" }
}
```
Every client using the same project name reads and writes to the same database.
## Data Storage
All data stays local.
| Platform | Path |
|---|---|
| Linux | ~/.local/share/consolidation_memory/projects/<name>/ |
| macOS | ~/Library/Application Support/consolidation_memory/projects/<name>/ |
| Windows | %LOCALAPPDATA%\consolidation_memory\projects\<name>\ |
Switching embedding backends? Run `consolidation-memory reindex` to re-embed everything against the new model.
## Roadmap
- Hybrid search (BM25 + semantic fusion)
- Diff-aware merge validation for consolidation
- Query expansion for short/ambiguous recalls
- Recall result deduplication
- Entity extraction and relationship graph
- Entity-aware recall boosting
- First-party plugins (git history, project context, Obsidian export)
## Development

```shell
git clone https://github.com/charliee1w/consolidation-memory
cd consolidation-memory
pip install -e ".[all,dev]"
pytest tests/ -v
ruff check src/ tests/
```
## License
MIT