
grep, but semantic, for your Claude Code conversation history. Local-first, hybrid BM25+vector retrieval.


claude-recall

Semantic search across your Claude Code conversation history. Local-first, private, fast.

Every Claude Code session vanishes the moment you close it. recall indexes your ~/.claude/projects/ JSONL files into a single SQLite database and gives you a fast CLI to find any past turn, by keyword or by meaning.

$ recall search "sqlite vec hybrid"
1. 2026-04-29  claude-recall  score=-18.4
   how should we store the chunk vectors — sqlite-vec or lancedb?
   [assistant] for <10M vectors sqlite-vec wins on ops simplicity…

2. 2026-04-22  gstack         score=-14.1
   …

Status

Week 3 / 6 — Daily-driver UX is live: show, inject, export, the watch daemon, a --rerank flag, and an inline filter DSL (project: since: role: tool:). Hybrid + reranker eval numbers below. See PLAN.md.
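The inline filter syntax (project: since: role: tool:) can be separated from the free-text query with a few lines of token matching. A minimal sketch assuming whitespace-separated key:value tokens; the function name and return shape are illustrative, not the shipped parser:

```python
import re

# Illustrative sketch: pull "key:value" filter tokens out of a query string.
# Recognized keys mirror the README's filter DSL; every other token stays
# part of the free-text query passed to search.
FILTER_KEYS = {"project", "since", "role", "tool"}

def parse_query(raw: str) -> tuple[str, dict[str, str]]:
    filters: dict[str, str] = {}
    terms: list[str] = []
    for token in raw.split():
        m = re.fullmatch(r"(\w+):(\S+)", token)
        if m and m.group(1) in FILTER_KEYS:
            filters[m.group(1)] = m.group(2)
        else:
            terms.append(token)
    return " ".join(terms), filters
```

For example, parse_query("sqlite vec project:foo since:7d") yields the query "sqlite vec" plus the filters {"project": "foo", "since": "7d"}.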

Install

# via uv (recommended)
uv tool install claude-recall

# or pipx
pipx install claude-recall

# first run
recall index
recall search "your query"

Dev install:

git clone https://github.com/lbbstarry/claude-recall
cd claude-recall
uv sync
uv run recall index

Commands

recall index
    Scan ~/.claude/projects/ and index + embed new/changed sessions (incremental via mtime + sha256). --no-embed for BM25-only.
recall search <q>
    Hybrid BM25 + vector search by default. --mode bm25|vector|hybrid, --rerank, --limit N, --project NAME. Inline filters: project:foo since:7d role:user tool:Bash.
recall show <session>
    Render a session as Markdown. Accepts an 8+ char prefix; use --turn N to view one turn.
recall inject <chunk>
    Copy a chunk's text to your clipboard so you can paste it into a new Claude session.
recall export <session> -o file.md
    Export a session to disk as Markdown.
recall watch
    Re-index in the background as JSONL files change (debounced; uses polling on WSL2).
recall stats
    Index size, vector count, DB path.
python -m claude_recall.eval.run
    Run the labeled eval set, print Recall@10 / MRR / nDCG@10, and write benchmarks/eval_results.md.
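The incremental check that recall index describes (mtime + sha256) can be sketched in a few lines. The function name and the shape of the seen map are hypothetical, not the project's actual schema; the idea is that an unchanged mtime skips the file cheaply, and the sha256 catches content changes even when the mtime lies:

```python
import hashlib
from pathlib import Path

# Hypothetical sketch of incremental change detection: `seen` maps a file
# path to the (mtime, sha256) recorded at the last index run.
def needs_reindex(path: Path, seen: dict[str, tuple[float, str]]) -> bool:
    prev = seen.get(str(path))
    mtime = path.stat().st_mtime
    if prev is not None and prev[0] == mtime:
        return False  # cheap check: mtime unchanged, skip hashing entirely
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return prev is None or prev[1] != digest
```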

Coming in Week 4+:

  • v0.2 on PyPI + Show HN
  • recall serve — local FastAPI + HTMX UI
  • bge-m3 + query expansion + ablation blog post

Architecture

~/.claude/projects/*.jsonl
    │
    ▼
parsers/claude_code.py    typed Message records
    │
    ▼
ingest/chunker.py         per-turn chunks (one user msg + following assistant)
    │
    ▼
embed/local.py            sentence-transformers (bge-small-zh) + disk cache
    │
    ▼
store/                    SQLite + FTS5 + sqlite-vec (vec0 virtual table)
    │
    ▼
search/                   bm25 · vector (KNN) · hybrid (RRF) · rerank (cross-encoder)
    │
    ▼
cli.py                    Typer app

Why per-turn chunks? Per-message chunking loses Q-A pairing; sliding windows inflate the index 3-5×. A turn averages ~1.2k tokens, which fits comfortably within bge-m3's 8k context, so no truncation is needed.
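The per-turn pairing can be sketched as a single pass over the message stream: each user message opens a new chunk, and assistant messages attach to the chunk in progress. This is an illustrative stand-in, not the shipped ingest/chunker.py:

```python
# Illustrative sketch of per-turn chunking: one user message plus the
# assistant message(s) that follow it, preserving Q-A pairing.
def chunk_turns(messages: list[dict]) -> list[list[dict]]:
    chunks: list[list[dict]] = []
    for msg in messages:
        if msg["role"] == "user" or not chunks:
            chunks.append([msg])   # a user message starts a new turn
        else:
            chunks[-1].append(msg)  # assistant output joins the open turn
    return chunks
```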

Why SQLite + FTS5 + sqlite-vec? Single file, zero ops, ships with the wheel, and hybrid search is a JOIN away. Beats Chroma at this scale.
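Reciprocal Rank Fusion (RRF), the hybrid step named in the architecture diagram, can be sketched generically over the two ranked id lists that BM25 and the KNN search return. k=60 is the conventional RRF constant from the literature; the project's actual fusion parameters may differ:

```python
# Generic RRF sketch: merge two ranked lists of chunk ids. A document's
# fused score is the sum of 1/(k + rank) over every list it appears in,
# so documents ranked well by both retrievers float to the top.
def rrf(bm25_ids: list[int], vec_ids: list[int], k: int = 60) -> list[int]:
    scores: dict[int, float] = {}
    for ranked in (bm25_ids, vec_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A chunk found by both retrievers outranks one found by only a single retriever at a similar position, which is the property that makes the fusion robust to either retriever's misses.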

Eval

51 hand-labeled (query, relevant_chunk_id) pairs from real developer sessions in tests/fixtures/queries.jsonl. Index size: 6,593 chunks across 127 sessions / 6 projects. Reproduce with python -m claude_recall.eval.run.

Method                                 Recall@10   MRR     nDCG@10   p95 (ms)
BM25 (FTS5)                            0.216       0.125   0.148       2
Vector (bge-small-zh-v1.5, 512d)       0.353       0.175   0.217      13
Hybrid (RRF)                           0.392       0.175   0.228      16
Hybrid + rerank (bge-reranker-base)    0.471       0.230   0.289     214

Hybrid + rerank gives +118% Recall@10 and +95% nDCG@10 over BM25. Reranker latency is dominated by CPU cross-encoder inference; GPU or bge-reranker-v2-m3-onnx will reduce it. Numbers are CPU-only on a WSL2 Ryzen laptop.

Why the absolute numbers look modest: the eval queries are deliberately short (median 5 chars) and developer-domain-specific, e.g. prefab, figma示例 ("figma example"). That is the realistic distribution for "I vaguely remember talking about this last month" — and the gap between methods, not the absolute floor, is what matters. Query expansion and bge-m3 (8k context, multilingual) are next.

Roadmap

  • Week 1 — Typer CLI, SQLite + FTS5, incremental ingest, BM25 search
  • Week 2 — bge-small-zh embeddings, sqlite-vec, RRF hybrid, reranker, 51-query eval
  • Week 3 — show/inject/export, watch daemon, filter DSL, --rerank flag
  • Week 4 — v0.2 on PyPI, Show HN, eval blog post
  • Week 5 — recall serve (FastAPI + HTMX local UI)
  • Week 6 — v1.0, Pro tier (cloud sync, Voyage embedder), Product Hunt

Privacy

Everything stays on your machine. The index is a single SQLite file under your OS's user data dir (~/.local/share/claude-recall/recall.db on Linux). No network calls are made by the OSS build.

License

Apache-2.0



Download files


Source distribution

claudegrep-0.2.0.tar.gz (125.6 kB)

Built distribution

claudegrep-0.2.0-py3-none-any.whl (26.2 kB, Python 3)

File details: claudegrep-0.2.0.tar.gz

  • Size: 125.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

Hashes for claudegrep-0.2.0.tar.gz

Algorithm     Hash digest
SHA256        b7587e8ce8668db830f882ce9128521c7aca3fb3bfe9b6ae5425bb03b2f9dd86
MD5           a6ea1ce78424a32a54a98f6a5203f75f
BLAKE2b-256   30e1cf29d774259d8ce4c408dd7d50989ae5f2e89529ab7b520df42a01605cd9

File details: claudegrep-0.2.0-py3-none-any.whl

  • Size: 26.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

Hashes for claudegrep-0.2.0-py3-none-any.whl

Algorithm     Hash digest
SHA256        242d994eb6d91d5a729fc728497b99419d3d59402d20e25d4281b660fdb459e6
MD5           fc12424fcf1bed505ccd05c32fa405d3
BLAKE2b-256   ba89baf8ccba21c0f02bda404db753699e57632d72e30691e3312a77d8ac5978
