Long-term memory MCP server for LLMs — hybrid search in a single SQLite file

rekal

Long-term memory for LLMs. One SQLite file, no cloud, no API keys.

rekal is an MCP server that gives Claude Code persistent memory across sessions. Memories are stored locally in SQLite and retrieved with hybrid search (BM25 keywords + vector semantics + recency decay). Nothing leaves your machine.

Session 1:   "I prefer Ruff over Black"  → memory_store(...)
Session 47:  "Set up linting"            → memory_search("formatting preferences")
                                          ← "User prefers Ruff over Black" (0.92)
                                          Sets up Ruff without asking.

Install

pip install rekal

or with uv:

uv tool install rekal

Requires Python 3.11+. On first run, rekal creates ~/.rekal/memory.db — a single file that holds everything.

Setup

Two steps: add the MCP server, then install the skills plugin.

1. Add the MCP server — gives Claude Code the memory tools:

claude mcp add rekal rekal

2. Install the skills plugin — teaches Claude Code when and how to use those tools:

claude plugin marketplace add janbjorge/rekal
claude plugin install rekal-skills@rekal

The MCP server provides the tools. The skills drive the behavior — session capture, deduplication, hygiene. Both are required.

Skills

Skill          Trigger              What it does
rekal-init     /rekal-init          Scans the codebase and bootstraps rekal with project knowledge
rekal-save     Auto on session end  Deduplicates and stores durable knowledge from the conversation
rekal-usage    /rekal-usage         Teaches agents how to use rekal effectively
rekal-hygiene  /rekal-hygiene       Finds conflicts, duplicates, and stale data; proposes fixes

Tools

rekal exposes its MCP tools in four categories.

Core — read and write memories:

Tool           Purpose
memory_store   Store a memory with type, project, and tags
memory_search  Hybrid search across all memories
memory_update  Edit content, tags, or type of an existing memory
memory_delete  Remove a memory by ID

Smart write — manage knowledge over time:

Tool                  Purpose
memory_supersede      Replace a memory while linking the old one as history
memory_link           Connect memories: supersedes, contradicts, or related_to
memory_build_context  One call that returns relevant memories + conflicts + timeline

Introspection — explore what's stored:

Tool              Purpose
memory_similar    Find memories similar to a given one
memory_topics     Topic summary grouped by type
memory_timeline   Chronological view with optional date range
memory_related    All links to and from a memory
memory_health     Database stats: counts by type, project, date range
memory_conflicts  Find memories that contradict each other

Conversations — track session threads:

Tool                  Purpose
conversation_start    Start a conversation, optionally linked to a previous one
conversation_tree     Get the full conversation DAG
conversation_threads  List recent conversations with memory counts
conversation_stale    Find inactive conversations

How it works

Storage

Everything lives in a single SQLite file (~/.rekal/memory.db). Three subsystems share it:

  • memories table — content, type, project, tags, timestamps, access counts
  • FTS5 virtual table — full-text index over content+tags+project, auto-synced via triggers on insert/update/delete
  • sqlite-vec virtual table — 384-dimensional vector index for semantic search

When you store a memory, rekal writes the row and inserts a vector embedding; the triggers update the FTS5 index automatically. When you update a memory's content, rekal re-embeds it.
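
A minimal sketch of how triggers can keep an FTS5 index in sync with a base table, using only Python's built-in sqlite3 module (it assumes the bundled SQLite was compiled with FTS5, which is common but not guaranteed). The table, column, and trigger names are illustrative assumptions, not rekal's actual schema, and the vector table is left out because it needs the sqlite-vec extension.

```python
import sqlite3

# Hypothetical schema for illustration; rekal's real tables may differ.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    tags TEXT NOT NULL DEFAULT '',
    project TEXT NOT NULL DEFAULT ''
);
CREATE VIRTUAL TABLE memories_fts USING fts5(
    content, tags, project,
    content='memories', content_rowid='id'
);
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, content, tags, project)
    VALUES (new.id, new.content, new.tags, new.project);
END;
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content, tags, project)
    VALUES ('delete', old.id, old.content, old.tags, old.project);
END;
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content, tags, project)
    VALUES ('delete', old.id, old.content, old.tags, old.project);
    INSERT INTO memories_fts(rowid, content, tags, project)
    VALUES (new.id, new.content, new.tags, new.project);
END;
"""


def open_db(path=":memory:"):
    """Open a database and create the illustrative schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def keyword_search(conn, query, limit=10):
    """Return (rowid, bm25) pairs, best match first."""
    # bm25() yields lower (more negative) values for better matches.
    rows = conn.execute(
        "SELECT rowid, bm25(memories_fts) FROM memories_fts "
        "WHERE memories_fts MATCH ? ORDER BY bm25(memories_fts) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

With this layout the application only ever writes to the base table; the index follows along on insert, update, and delete.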

Memory links (supersedes, contradicts, related_to) are stored in a separate table. memory_supersede writes the new memory and creates a supersedes link to the old one in a single operation — old knowledge stays queryable but the link makes the lineage explicit.
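
The "single operation" idea can be sketched as one SQLite transaction that writes both the new memory and its link. This is an illustration under assumed table and column names (memories, memory_links, from_id, to_id, relation), not rekal's actual implementation.

```python
import sqlite3

# Hypothetical tables for illustration; rekal's real schema may differ.
SCHEMA = """
CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT NOT NULL);
CREATE TABLE memory_links (
    from_id INTEGER NOT NULL REFERENCES memories(id),
    to_id INTEGER NOT NULL REFERENCES memories(id),
    relation TEXT NOT NULL
        CHECK (relation IN ('supersedes', 'contradicts', 'related_to'))
);
"""


def supersede(conn, old_id, new_content):
    """Insert a replacement memory and link it to the one it replaces."""
    with conn:  # one transaction: both writes commit, or neither does
        cur = conn.execute(
            "INSERT INTO memories(content) VALUES (?)", (new_content,)
        )
        new_id = cur.lastrowid
        conn.execute(
            "INSERT INTO memory_links(from_id, to_id, relation) "
            "VALUES (?, ?, 'supersedes')",
            (new_id, old_id),
        )
    return new_id
```

The old row is never modified or deleted here, which is what keeps the earlier knowledge queryable.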

Embeddings

rekal uses fastembed with the BAAI/bge-small-en-v1.5 model (384 dimensions). It runs locally via ONNX, so there are no API calls and no tokens billed. The model (~50MB) is downloaded once on first use and cached; after that, embedding needs no network access.

Vectors are stored as packed floats in sqlite-vec and queried with nearest-neighbor (KNN) search.
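
To make "packed floats" concrete, here is a small sketch that serializes a vector as raw float32 bytes and computes cosine distance in plain Python. The little-endian float32 layout and the helper names are assumptions for illustration; rekal's own serialization code is not shown in this document.

```python
import math
import struct

DIM = 384  # dimensionality of BAAI/bge-small-en-v1.5 embeddings


def pack_vector(values):
    """Serialize floats as little-endian float32, 4 bytes per dimension."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Inverse of pack_vector."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Under this layout a 384-dimensional embedding occupies 384 × 4 = 1536 bytes per memory.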

Search

Every memory_search runs two parallel lookups, merges candidates, then scores them:

1. Vector search   → top 3×limit candidates by cosine distance
2. FTS5 search     → top 3×limit candidates by BM25 rank
3. Union the candidate sets
4. For each candidate, compute:

   score = w_fts × sigmoid(-BM25)        ← keyword relevance     (default 0.4)
         + w_vec × (1 - cosine_distance)  ← semantic similarity  (default 0.4)
         + w_recency × exp(-0.693 × days/half_life)  ← recency  (default 0.2, 30-day half-life)

5. Sort by score, return top limit

Why three signals? Keywords alone miss synonyms ("deploy" vs "ship to prod"). Vectors alone can miss exact identifiers (a query for BAAI/bge-small-en-v1.5 needs an exact match). Recency alone buries important old knowledge. The blend covers all three failure modes.

Why 0.4/0.4/0.2 defaults? Keywords and semantics contribute equally, so neither dominates. Recency is a tiebreaker at 0.2: with the 30-day half-life, the recency term adds ~0.195 for a one-day-old memory and still ~0.025 for a 90-day-old one. Old memories surface when the keyword or semantic match is strong enough.
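
The recency figures quoted here follow directly from the formula and can be checked in a few lines. The function name recency_term is illustrative only.

```python
import math


def recency_term(age_days, w_recency=0.2, half_life=30.0):
    # 0.693 approximates ln(2), so the term halves every half_life days.
    return w_recency * math.exp(-0.693 * age_days / half_life)


for age in (1, 30, 90):
    print(f"{age:>3} days -> {recency_term(age):.3f}")
# prints:
#   1 days -> 0.195
#  30 days -> 0.100
#  90 days -> 0.025
```

At exactly one half-life (30 days) the term is half of w_recency, and after three half-lives (90 days) it is one eighth.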

Configurable weights. All weights and the half-life are configurable at three levels:

  • Per search: pass w_fts, w_vec, w_recency, or half_life directly to memory_search or memory_build_context to override for a single query.
  • Per project (database): memory_set_config(key="w_recency", value="0.5", project="my-app") persists in the database across sessions. Searches scoped to that project automatically use its config.
  • Per project (file): drop a .rekal/config.yml in your project root with version-controlled defaults:
scoring:
  w_fts: 0.6
  w_vec: 0.3
  w_recency: 0.1
  half_life: 14.0

rekal looks for this file in the working directory at startup. All keys are optional.

Precedence. Each weight is resolved independently through four layers. The first layer that provides a value wins:

Priority     Source                   Set by                                  Persists?
1 (highest)  Per-search params        memory_search(..., w_fts=0.8)           No (single query only)
2            Database project config  memory_set_config(key, value, project)  Yes (in SQLite, across sessions)
3            .rekal/config.yml        Checked into version control            Yes (shared with the team)
4 (lowest)   Hardcoded defaults       Built into rekal                        Always: 0.4 / 0.4 / 0.2, 30-day half-life

Layers are per-key, not all-or-nothing. If your .rekal/config.yml sets w_fts and half_life, and a memory_set_config call overrides w_fts in the database, the final weights for a search with no explicit params would be: w_fts from DB (layer 2), half_life from file (layer 3), w_vec and w_recency from hardcoded defaults (layer 4).
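
Per-key resolution can be sketched as a loop over the four layers. The function name resolve_weights and the dictionary shapes are assumptions for illustration; values are cast to float because memory_set_config takes its value as a string in the example above.

```python
DEFAULTS = {"w_fts": 0.4, "w_vec": 0.4, "w_recency": 0.2, "half_life": 30.0}


def resolve_weights(per_search=None, db_config=None, file_config=None):
    """Resolve each key independently; the first layer that defines it wins."""
    layers = [per_search or {}, db_config or {}, file_config or {}, DEFAULTS]
    resolved = {}
    for key in DEFAULTS:
        for layer in layers:
            if layer.get(key) is not None:
                resolved[key] = float(layer[key])
                break
    return resolved


# The scenario from the paragraph above: file sets w_fts and half_life,
# the database overrides w_fts, and the search passes no explicit params.
weights = resolve_weights(
    db_config={"w_fts": "0.8"},
    file_config={"w_fts": 0.6, "half_life": 14.0},
)
print(weights)
# {'w_fts': 0.8, 'w_vec': 0.4, 'w_recency': 0.2, 'half_life': 14.0}
```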

Why over-fetch 3x? Filtering by project/type/conversation happens after scoring, so filter values are never spliced into dynamically built SQL. Over-fetching makes it likely that enough candidates survive filtering to fill the requested limit.
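
A small sketch of why post-filtering needs a larger candidate pool. The dictionary shape and the function names (filter_scored, fetch_size) are hypothetical, chosen only to illustrate the idea.

```python
def fetch_size(limit, factor=3):
    """Number of candidates to request from each lookup."""
    return factor * limit


def filter_scored(scored, limit, project=None, memory_type=None):
    """Filter already-scored candidates in application code, then truncate."""
    ranked = sorted(scored, key=lambda m: m["score"], reverse=True)
    kept = [
        m
        for m in ranked
        if (project is None or m["project"] == project)
        and (memory_type is None or m["type"] == memory_type)
    ]
    return kept[:limit]


# Six candidates alternating between two projects, best score first.
candidates = [
    {"id": i, "score": 1.0 - i * 0.1, "project": "a" if i % 2 == 0 else "b", "type": "fact"}
    for i in range(6)
]

# Fetching only `limit` candidates leaves one result after filtering ...
print(len(filter_scored(candidates[:2], limit=2, project="a")))
# ... while fetching 3x the limit fills the requested two.
print(len(filter_scored(candidates[:fetch_size(2)], limit=2, project="a")))
```

A fixed multiplier is a heuristic: if most candidates belong to other projects, even 3× may return fewer than the requested limit.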

Why SQLite?

  • Single file — copy it, back it up, version-control it, delete it to start fresh
  • Zero config — no daemon, no port, no connection string
  • FTS5 built-in — BM25 ranking with no external search engine
  • sqlite-vec extension — vector search in the same process, no separate vector DB
  • Sub-millisecond — everything is local disk I/O, no network round-trips
  • Portable — works on macOS, Linux, Windows without different backends

CLI

rekal serve    # Run as MCP server (default)
rekal health   # Database health report
rekal export   # Export all memories as JSON

License

MIT
