Local-first persistent memory for AI agents — store, recall, and consolidate knowledge across sessions using FAISS, SQLite, and any LLM


consolidation-memory

Python 3.10+

Your AI forgets everything between sessions. This fixes that.

A local-first memory system that stores, retrieves, and consolidates knowledge across conversations. Episodes go in, structured knowledge comes out — automatically, via a background LLM that clusters and synthesizes what it's learned.

No cloud dependency. No subscriptions. Your data stays on your machine.

You: "My build is failing with a linker error"
AI:  (recalls your project uses CMake + MSVC on Windows)
     (recalls you hit the same error last month — it was a missing vcpkg dependency)
     "Last time this happened it was a missing vcpkg package. Want me to
      check if your vcpkg.json changed since we fixed it?"

How It Works

 ┌──────────┐     ┌───────────┐     ┌─────────────┐
 │  Store   │────▶│   Embed   │────▶│ FAISS Index │
 │ episodes │     │ (any LLM) │     │ + SQLite DB │
 └──────────┘     └───────────┘     └──────┬──────┘
                                           │
                  ┌───────────┐     ┌──────▼──────┐
                  │ Knowledge │◀────│   Recall    │
                  │   Docs    │     │ (semantic)  │
                  └─────┬─────┘     └─────────────┘
                        │
                 ┌──────▼──────┐
                 │ Consolidate │  ← background thread
                 │ (cluster +  │    clusters episodes
                 │  LLM synth) │    into knowledge docs
                 └─────────────┘
  1. Store — Save episodes (facts, solutions, preferences) with embeddings into SQLite + FAISS
  2. Recall — Semantic search with priority scoring (surprise, recency, access frequency)
  3. Consolidate — Background LLM clusters related episodes and synthesizes structured markdown knowledge documents
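The priority scoring in step 2 blends several signals; the exact formula is internal to the library, but a sketch of such a blend might look like the following. The weights, half-life, and function name here are invented for illustration, not the library's actual values:

```python
import time

def priority_score(similarity, created_at, access_count,
                   now=None, half_life_days=30.0,
                   w_sim=0.6, w_recency=0.25, w_access=0.15):
    """Hypothetical blend of semantic similarity, recency decay,
    and access frequency. Weights are illustrative only."""
    now = time.time() if now is None else now
    age_days = max(0.0, (now - created_at) / 86400.0)
    recency = 0.5 ** (age_days / half_life_days)   # exponential decay
    access = 1.0 - 1.0 / (1.0 + access_count)      # saturating bonus
    return w_sim * similarity + w_recency * recency + w_access * access
```

With weights summing to 1, a fresh, frequently accessed, highly similar episode outranks an old, rarely touched one even when raw similarity alone would be close.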

How Consolidation Works

The consolidation engine runs on a background daemon thread (default: every 6 hours). It fetches all unconsolidated episodes, embeds them, and groups them using agglomerative hierarchical clustering with a configurable distance threshold. Each cluster represents a coherent topic.
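The clustering step can be sketched in miniature. This is not the engine's actual implementation; it assumes single-linkage agglomerative clustering over cosine distances, which is one common realization of the behavior described above:

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) *
                        math.sqrt(sum(x * x for x in b)))

def agglomerative_cluster(vectors, threshold=0.72):
    """Greedy single-linkage agglomerative clustering: repeatedly merge
    the two closest clusters until the closest pair is farther apart
    than `threshold`. Returns lists of vector indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):  # single linkage: closest pair of members
        return min(cosine_distance(vectors[i], vectors[j])
                   for i in c1 for j in c2)

    while len(clusters) > 1:
        (i, j), dist = min(
            (((a, b), linkage(clusters[a], clusters[b]))
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda t: t[1])
        if dist > threshold:
            break
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters
```

Raising the threshold yields fewer, broader clusters; lowering it yields more, tighter ones — the same trade-off the `cluster_threshold` config option controls.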

For each cluster, the engine checks existing knowledge topics for semantic overlap. If a matching topic exists (above the topic-match threshold), the cluster's episodes are merged into the existing document. Otherwise, a new knowledge document is synthesized from scratch.
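A minimal sketch of that topic-matching decision, assuming cosine similarity against stored topic embeddings and an illustrative 0.80 threshold (the library's actual topic-match threshold and data layout may differ):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def match_topic(cluster_centroid, topic_embeddings, threshold=0.80):
    """Return the name of the existing topic most similar to the
    cluster centroid, or None if nothing clears the threshold."""
    best_name, best_sim = None, threshold
    for name, emb in topic_embeddings.items():
        sim = cosine_similarity(cluster_centroid, emb)
        if sim >= best_sim:
            best_name, best_sim = name, sim
    return best_name
```

A `None` result corresponds to the "synthesized from scratch" branch; a hit corresponds to the merge branch.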

The LLM receives the cluster's episodes (with prompt injection patterns sanitized) and produces a structured markdown document with YAML frontmatter (title, summary, tags, confidence score). The engine validates the output, versions the previous document, writes the new one, and updates the SQLite metadata. Episodes that have been consolidated and aged past the prune threshold are soft-deleted to keep the FAISS index lean.
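The frontmatter fields named above (title, summary, tags, confidence) suggest output in roughly this shape. The exact field names, layout, and body content below are an illustrative sketch, not actual library output:

```markdown
---
title: CMake + MSVC linker errors
summary: Linker failures in this project have traced back to vcpkg.
tags: [build, cmake, vcpkg, windows]
confidence: 0.85
---

# CMake + MSVC linker errors

Linker errors on the Windows build have historically been caused by
missing vcpkg dependencies. Check vcpkg.json first when linker errors
appear after a dependency change.
```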

All backends retry transient failures with exponential backoff. If 3 consecutive clusters fail (indicating the LLM backend is down), consolidation aborts early rather than burning through timeouts.
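The retry-and-abort behavior might be sketched like this; the parameter names, per-cluster retry count, and exception handling are illustrative, not the library's actual values:

```python
import time

def consolidate_clusters(clusters, synthesize, max_retries=3,
                         base_delay=1.0, max_consecutive_failures=3):
    """Retry each cluster with exponential backoff; abort the whole
    run once `max_consecutive_failures` clusters fail in a row, on
    the assumption that the LLM backend is down."""
    results, consecutive = [], 0
    for cluster in clusters:
        for attempt in range(max_retries):
            try:
                results.append(synthesize(cluster))
                consecutive = 0
                break
            except Exception:
                if attempt + 1 < max_retries:
                    time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s...
        else:
            consecutive += 1
            if consecutive >= max_consecutive_failures:
                return results  # abort early instead of burning timeouts
    return results
```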

Quick Start

pip install consolidation-memory[fastembed]
consolidation-memory init

That's it. FastEmbed runs locally, no external services needed.

MCP Server (Claude Desktop / Claude Code / Cursor)

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "consolidation_memory": {
      "command": "consolidation-memory"
    }
  }
}

The following tools become available:

Tool                 What it does
memory_store         Save an episode (fact, solution, preference, exchange)
memory_store_batch   Store multiple episodes in one call (single embed + FAISS batch)
memory_recall        Semantic search over episodes + knowledge, with optional filters
memory_search        Keyword/metadata search; works without embedding backend
memory_status        System stats + health diagnostics + consolidation metrics
memory_forget        Soft-delete an episode
memory_export        Export everything to JSON
memory_correct       Fix outdated knowledge documents

memory_recall supports optional filters: content_types, tags, after, before — all applied post-vector-search so you can narrow results to specific episode types or date ranges.
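A sketch of how such post-search filtering could work, using the documented filter names. The episode dict field names (`content_type`, `tags`, `created_at`) are assumptions for illustration:

```python
from datetime import datetime

def apply_filters(episodes, content_types=None, tags=None,
                  after=None, before=None):
    """Narrow recall hits after the vector search has run."""
    kept = []
    for ep in episodes:
        if content_types and ep["content_type"] not in content_types:
            continue
        if tags and not set(tags) & set(ep.get("tags", [])):
            continue
        ts = datetime.fromisoformat(ep["created_at"])
        if after and ts < datetime.fromisoformat(after):
            continue
        if before and ts > datetime.fromisoformat(before):
            continue
        kept.append(ep)
    return kept
```

Filtering after the vector search means the similarity ranking is computed over the whole index first, then narrowed — so filters never change an episode's score, only whether it appears.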

memory_search does plain text LIKE matching in SQLite. No embedding backend needed. Supports the same filters (content_types, tags, after, before) plus a limit parameter.
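A minimal stand-in showing the LIKE-based approach; the real episodes table carries far more columns (tags, timestamps, content types) than this toy schema:

```python
import sqlite3

# Tiny in-memory stand-in for the episodes table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE episodes (id INTEGER PRIMARY KEY, content TEXT)")
con.executemany("INSERT INTO episodes (content) VALUES (?)",
                [("User prefers dark mode",),
                 ("Fixed linker error via vcpkg",)])

def keyword_search(query, limit=10):
    """Plain LIKE matching -- no embeddings involved."""
    cur = con.execute(
        "SELECT id, content FROM episodes WHERE content LIKE ? LIMIT ?",
        (f"%{query}%", limit))
    return cur.fetchall()
```

Because it never touches the embedding backend, this path keeps working even when no embedding model is configured or reachable.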

Python API

from consolidation_memory import MemoryClient

with MemoryClient() as mem:
    mem.store("User prefers dark mode", content_type="preference", tags=["ui"])

    result = mem.recall("user interface preferences")
    for ep in result.episodes:
        print(ep["content"], ep["similarity"])

    stats = mem.status()
    print(stats.health)  # {'status': 'healthy', 'issues': [], 'backend_reachable': True}

OpenAI Function Calling

Works with any OpenAI-compatible API (LM Studio, Ollama, OpenAI, Azure):

from consolidation_memory import MemoryClient
from consolidation_memory.schemas import openai_tools, dispatch_tool_call

mem = MemoryClient()
# Pass openai_tools to your chat completion, dispatch results with dispatch_tool_call()

REST API

pip install consolidation-memory[rest]
consolidation-memory serve --rest --port 8080

Method   Path                    Description
GET      /health                 Version + status
POST     /memory/store           Store episode
POST     /memory/store/batch     Store multiple episodes
POST     /memory/recall          Semantic search (with optional filters)
POST     /memory/search          Keyword/metadata search (no embedding needed)
GET      /memory/status          System statistics + consolidation metrics
DELETE   /memory/episodes/{id}   Forget episode
POST     /memory/consolidate     Trigger consolidation
POST     /memory/correct         Correct knowledge doc
POST     /memory/export          Export to JSON

Embedding Backends

Backend               Install                                       Model                    Dimensions   Runs locally?
FastEmbed (default)   pip install consolidation-memory[fastembed]   bge-small-en-v1.5        384          Yes
LM Studio             Built-in                                      nomic-embed-text-v1.5    768          Yes
Ollama                Built-in                                      nomic-embed-text         768          Yes
OpenAI                pip install consolidation-memory[openai]      text-embedding-3-small   1536         No

LLM Backends (for consolidation)

The consolidation step needs a chat-capable LLM to synthesize clusters into knowledge documents. Set backend = "disabled" to skip consolidation and use store/recall only.

Backend               Requirements
LM Studio (default)   LM Studio running with any chat model
Ollama                Ollama running with any chat model
OpenAI                API key
Disabled              None; no consolidation, pure vector search

Configuration

consolidation-memory init  # Interactive setup

Or edit the config directly:

Platform      Path
Linux/macOS   ~/.config/consolidation_memory/config.toml
Windows       %APPDATA%\consolidation_memory\config.toml
Override      CONSOLIDATION_MEMORY_CONFIG env var

[embedding]
backend = "fastembed"

[llm]
backend = "lmstudio"
api_base = "http://localhost:1234/v1"
model = "qwen2.5-7b-instruct"

[consolidation]
auto_run = true
interval_hours = 6
cluster_threshold = 0.72
prune_enabled = true
prune_after_days = 60

CLI

consolidation-memory serve              # MCP server (default)
consolidation-memory serve --rest       # REST API
consolidation-memory init               # Interactive setup
consolidation-memory status             # Show stats
consolidation-memory consolidate        # Manual consolidation
consolidation-memory export             # Export to JSON
consolidation-memory import PATH        # Import from JSON
consolidation-memory reindex            # Re-embed everything (after switching backends)

Data Storage

All data stays local:

Platform   Path
Linux      ~/.local/share/consolidation_memory/
macOS      ~/Library/Application Support/consolidation_memory/
Windows    %LOCALAPPDATA%\consolidation_memory\

Override with data_dir under [paths] in config.

Migrating

Already have a data directory? Point your config at it:

[paths]
data_dir = "/path/to/your/existing/data"

Switching embedding backends (different dimensions)?

consolidation-memory reindex

Development

git clone https://github.com/charliee1w/consolidation-memory
cd consolidation-memory
pip install -e ".[fastembed,dev]"
python -m pytest tests/ -v      # 88 tests, no external services needed
python -m ruff check src/ tests/

License

MIT

Project details

Source distribution: consolidation_memory-0.3.0.tar.gz (73.5 kB)
  Uploaded via twine/6.1.0 (CPython/3.13.7) using Trusted Publishing.
  SHA256        a933c3253071b5aa3edaf1d8080d8ffee20a72661c50c7ff4c6aa0d38071a13b
  MD5           c79a57e5a761836b870b9d3c80ea41ee
  BLAKE2b-256   48b9781db0a551130fcc9872b501df1febef9552ebc324195db1ba22bbc300e2

Built distribution: consolidation_memory-0.3.0-py3-none-any.whl (66.9 kB, Python 3)
  SHA256        c73ff880a9a729011b2408644bb3894c8be3164bb6a9a1bef4b6f3bbd930398c
  MD5           9f2a59d18553f8001508d6ea0c4b5ea7
  BLAKE2b-256   368960be94803f32147f16a2f0182cbb7ad40d7daed5690b190ccfba23d56e1e

Provenance: attestation bundles for both files were published by publish.yml on charliee1w/consolidation-memory. Attestation values reflect the state when the release was signed and may no longer be current.