Long-term memory MCP server for LLMs — hybrid search in a single SQLite file
rekal
Long-term memory for LLMs. One SQLite file, no cloud, no API keys.
rekal is an MCP server that gives Claude Code persistent memory across sessions. Memories are stored locally in SQLite and retrieved with hybrid search (BM25 keywords + vector semantics + recency decay). Nothing leaves your machine.
Session 1: "I prefer Ruff over Black" → memory_store(...)
Session 47: "Set up linting" → memory_search("formatting preferences")
← "User prefers Ruff over Black" (0.92)
Sets up Ruff without asking.
Install
pip install rekal
or with uv:
uv tool install rekal
Requires Python 3.11+. On first run, rekal creates ~/.rekal/memory.db — a single file that holds everything.
Setup
Two steps: add the MCP server, then install the skills plugin.
1. Add the MCP server — gives Claude Code the memory tools:
claude mcp add rekal rekal
2. Install the skills plugin — teaches Claude Code when and how to use those tools:
claude plugin marketplace add janbjorge/rekal
claude plugin install rekal-skills@rekal
The MCP server provides the tools. The skills drive the behavior — session capture, deduplication, hygiene. Both are required.
Skills
| Skill | Trigger | What it does |
|---|---|---|
| rekal-init | /rekal-init | Scans codebase and bootstraps rekal with project knowledge |
| rekal-save | Auto on session end | Deduplicates and stores durable knowledge from the conversation |
| rekal-usage | /rekal-usage | Teaches agents how to use rekal effectively |
| rekal-hygiene | /rekal-hygiene | Finds conflicts, duplicates, and stale data; proposes fixes |
Tools
rekal exposes 16 MCP tools grouped into four categories.
Core — read and write memories:
| Tool | Purpose |
|---|---|
| memory_store | Store a memory with type, project, and tags |
| memory_search | Hybrid search across all memories |
| memory_update | Edit content, tags, or type of an existing memory |
| memory_delete | Remove a memory by ID |
Smart write — manage knowledge over time:
| Tool | Purpose |
|---|---|
| memory_supersede | Replace a memory while linking the old one as history |
| memory_link | Connect memories: supersedes, contradicts, or related_to |
| memory_build_context | One call that returns relevant memories + conflicts + timeline |
Introspection — explore what's stored:
| Tool | Purpose |
|---|---|
| memory_similar | Find memories similar to a given one |
| memory_topics | Topic summary grouped by type |
| memory_timeline | Chronological view with optional date range |
| memory_related | All links to and from a memory |
| memory_health | Database stats: counts by type, project, date range |
| memory_conflicts | Find memories that contradict each other |
Conversations — track session threads:
| Tool | Purpose |
|---|---|
| conversation_start | Start a conversation, optionally linked to a previous one |
| conversation_tree | Get the full conversation DAG |
| conversation_threads | List recent conversations with memory counts |
| conversation_stale | Find inactive conversations |
How it works
Storage
Everything lives in a single SQLite file (~/.rekal/memory.db). Three subsystems share it:
- memories table — content, type, project, tags, timestamps, access counts
- FTS5 virtual table — full-text index over content+tags+project, auto-synced via triggers on insert/update/delete
- sqlite-vec virtual table — 384-dimensional vector index for semantic search
When you store a memory, rekal writes the row, updates the FTS5 index (automatically), and inserts a vector embedding. When you update content, it re-embeds automatically.
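The trigger-based sync described above can be sketched with the external-content pattern from the SQLite FTS5 documentation. The table, column, and trigger names below are assumptions made for illustration, not rekal's actual schema, and the vector index is left out. The sketch also assumes a Python build whose bundled SQLite includes FTS5.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not rekal's real tables.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    tags TEXT NOT NULL DEFAULT '',
    project TEXT NOT NULL DEFAULT ''
);
CREATE VIRTUAL TABLE memories_fts USING fts5(
    content, tags, project, content='memories', content_rowid='id'
);
-- Keep the index in step with the base table on every write.
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, content, tags, project)
    VALUES (new.id, new.content, new.tags, new.project);
END;
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content, tags, project)
    VALUES ('delete', old.id, old.content, old.tags, old.project);
END;
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content, tags, project)
    VALUES ('delete', old.id, old.content, old.tags, old.project);
    INSERT INTO memories_fts(rowid, content, tags, project)
    VALUES (new.id, new.content, new.tags, new.project);
END;
"""


def open_db(path=":memory:"):
    """Open a database and create the illustrative schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def keyword_search(conn, query, limit=5):
    """Return ids of matching memories, best BM25 rank first."""
    rows = conn.execute(
        "SELECT rowid FROM memories_fts WHERE memories_fts MATCH ? "
        "ORDER BY bm25(memories_fts) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the triggers fire inside the same transaction as the write, the application code never touches the index directly; a plain INSERT, UPDATE, or DELETE on the base table is enough.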
Memory links (supersedes, contradicts, related_to) are stored in a separate table. memory_supersede writes the new memory and creates a supersedes link to the old one in a single operation — old knowledge stays queryable but the link makes the lineage explicit.
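A minimal sketch of the supersede behavior described above, assuming a hypothetical two-table layout (memories and memory_links; the real table and column names may differ). The point it illustrates is that the new row and the supersedes link are written in one transaction, while the old row is left in place.

```python
import sqlite3


def open_db(path=":memory:"):
    """Create an illustrative schema; names are assumptions, not rekal's."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE memories (
            id INTEGER PRIMARY KEY,
            content TEXT NOT NULL
        );
        CREATE TABLE memory_links (
            from_id INTEGER NOT NULL REFERENCES memories(id),
            to_id INTEGER NOT NULL REFERENCES memories(id),
            relation TEXT NOT NULL
                CHECK (relation IN ('supersedes', 'contradicts', 'related_to'))
        );
        """
    )
    return conn


def supersede(conn, old_id, new_content):
    """Insert the replacement and link it to the old memory atomically."""
    with conn:  # commits both statements, or rolls both back on error
        cur = conn.execute(
            "INSERT INTO memories(content) VALUES (?)", (new_content,)
        )
        new_id = cur.lastrowid
        conn.execute(
            "INSERT INTO memory_links(from_id, to_id, relation) "
            "VALUES (?, ?, 'supersedes')",
            (new_id, old_id),
        )
    return new_id
```

The old memory is never deleted here, which matches the description: it stays queryable, and the link row records which memory replaced it.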
Embeddings
rekal uses fastembed with the BAAI/bge-small-en-v1.5 model (384 dimensions). It runs locally via ONNX — no API calls, no network, no tokens billed. The model is downloaded once on first use (~50MB) and cached.
Vectors are stored as packed floats in sqlite-vec and queried with approximate nearest-neighbor search.
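To make "packed floats" concrete, here is a small standard-library sketch. It assumes the vectors are serialized as consecutive little-endian 32-bit floats, which is an assumption about the byte layout rather than something this README states. The function names are hypothetical, and the embedding model itself is not called.

```python
import math
import struct

DIM = 384  # dimensionality of BAAI/bge-small-en-v1.5 embeddings


def pack_vector(vec):
    """Serialize floats as little-endian float32, 4 bytes per value."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob):
    """Inverse of pack_vector."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Under that layout a 384-dimensional embedding occupies 384 × 4 = 1536 bytes per memory.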
Search
Every memory_search runs two parallel lookups, merges candidates, then scores them:
1. Vector search → top 3×limit candidates by cosine distance
2. FTS5 search → top 3×limit candidates by BM25 rank
3. Union the candidate sets
4. For each candidate, compute:
score = w_fts × sigmoid(-BM25) ← keyword relevance (default 0.4)
+ w_vec × (1 - cosine_distance) ← semantic similarity (default 0.4)
+ w_recency × exp(-0.693 × days/half_life) ← recency (default 0.2, 30-day half-life)
5. Sort by score, return top limit
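The scoring formula in step 4 can be written out as a short Python sketch. The function names and the candidate tuple layout are assumptions for illustration; the weights, the sigmoid over the negated BM25 value, and the 0.693 decay constant come from the formula above. How rekal scores a candidate that only one of the two lookups returned is not covered here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def recency_decay(age_days, half_life=30.0):
    """Exponential decay that halves (approximately) every half_life days."""
    return math.exp(-0.693 * age_days / half_life)


def hybrid_score(bm25, cosine_distance, age_days,
                 w_fts=0.4, w_vec=0.4, w_recency=0.2, half_life=30.0):
    """Blend keyword, semantic, and recency signals into one score."""
    return (
        w_fts * sigmoid(-bm25)  # FTS5 BM25 is lower-is-better, hence the sign flip
        + w_vec * (1.0 - cosine_distance)
        + w_recency * recency_decay(age_days, half_life)
    )


def rank(candidates, limit):
    """candidates: iterable of (memory_id, bm25, cosine_distance, age_days)."""
    scored = [(hybrid_score(b, d, a), mid) for mid, b, d, a in candidates]
    scored.sort(reverse=True)
    return [mid for _, mid in scored[:limit]]
```

With the default weights, a 90-day-old memory with a strong keyword and semantic match still outranks a one-day-old memory with a weak match, because recency contributes at most 0.2 to the total.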
Why three signals? Keywords alone miss synonyms ("deploy" vs "ship to prod"). Vectors alone miss exact identifiers (a query for "BAAI/bge-small-en-v1.5" needs an exact match, not a near neighbor). Recency alone buries important old knowledge. The blend covers all three failure modes.
Why 0.4/0.4/0.2 defaults? Keywords and semantics contribute equally — neither dominates. Recency is a tiebreaker at 0.2: a one-day-old memory scores ~0.195, a 90-day-old memory still scores ~0.025. Old memories surface when keyword or semantic match is strong enough.
Configurable weights. All weights and the half-life are configurable at two levels:
- Per project: memory_set_config(key="w_recency", value="0.5", project="my-app") persists in the database across sessions. Searches scoped to that project automatically use its config.
- Per search: pass w_fts, w_vec, w_recency, or half_life directly to memory_search or memory_build_context to override for a single query.
Lookup order: per-search params > project config > hardcoded defaults (0.4/0.4/0.2, 30-day half-life).
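The lookup order can be illustrated with a small resolver. This is a sketch under assumptions: the function name is hypothetical, project config values are treated as strings (as in the memory_set_config example, where value="0.5"), and a per-search parameter left as None is taken to mean "not provided".

```python
from collections import ChainMap

# Hardcoded defaults as listed in the lookup order.
DEFAULTS = {"w_fts": 0.4, "w_vec": 0.4, "w_recency": 0.2, "half_life": 30.0}


def resolve_config(per_search=None, project_config=None):
    """Resolve each setting: per-search > project config > defaults."""
    # Drop per-search parameters the caller did not set.
    per_search = {k: v for k, v in (per_search or {}).items() if v is not None}
    merged = ChainMap(per_search, project_config or {}, DEFAULTS)
    # Stored project values may be strings, so normalize to float.
    return {key: float(merged[key]) for key in DEFAULTS}
```

ChainMap returns the first mapping that contains a key, so the precedence is simply the order of its arguments.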
Why over-fetch 3x? Filtering by project/type/conversation happens after scoring, so no filter clauses are spliced into the SQL dynamically. Over-fetching makes it likely that enough candidates survive filtering to fill the requested limit.
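A sketch of filtering after scoring, assuming each scored candidate is a dict with hypothetical "score", "project", and "type" keys. It shows why over-fetching matters: the filter can only discard candidates, so the pool has to be larger than the requested limit.

```python
OVERFETCH = 3  # each lookup fetches 3 x limit candidates


def candidate_budget(limit):
    """How many candidates each lookup requests before filtering."""
    return OVERFETCH * limit


def apply_filters(scored, limit, project=None, memory_type=None):
    """Filter scored candidates in application code, then trim to limit."""
    kept = [
        m for m in scored
        if (project is None or m["project"] == project)
        and (memory_type is None or m["type"] == memory_type)
    ]
    kept.sort(key=lambda m: m["score"], reverse=True)
    return kept[:limit]
```

If a filter is very selective, fewer than limit results can still come back; the 3x factor reduces how often that happens rather than ruling it out.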
Why SQLite?
- Single file — copy it, back it up, version-control it, delete it to start fresh
- Zero config — no daemon, no port, no connection string
- FTS5 built-in — BM25 ranking with no external search engine
- sqlite-vec extension — vector search in the same process, no separate vector DB
- Sub-millisecond — everything is local disk I/O, no network round-trips
- Portable — works on macOS, Linux, Windows without different backends
CLI
rekal serve # Run as MCP server (default)
rekal health # Database health report
rekal export # Export all memories as JSON
License
MIT