# claude-code-search

Local semantic search over your Claude Code conversations. Find past sessions by meaning, get the session ID, resume them.
```
$ ccsearch "scheduler bug"

1. 0.39  2026-04-10  airflow  940abfde-...  (754 msgs)
   Investigate concurrency bugs in Airflow scheduler
   • Debugged scheduler re-queueing bug
   • Traced it to DAG parse retry loop
   ↳ Scheduler hasn't picked up the task yet...

   cd ~/workspace/airflow && claude -r 940abfde-...
```
## Install

```
pip install claude-code-search
```

Or isolated (recommended):

```
pipx install claude-code-search   # or
uv tool install claude-code-search # fastest
```
## Usage

```
ccsearch index                 # build the index (~5 min first run)
ccsearch "your query"          # semantic search
ccsearch -g "#64827"           # exact/literal search (instant, no model)
ccsearch show <id> --query "x" # see messages around a match
ccsearch resume <id>           # cd + claude -r, ready to go
```

That's it. The first `ccsearch index` downloads a ~130 MB embedding model and indexes `~/.claude/projects/`. After that, searches take ~3 seconds (semantic) or ~0.2 seconds (grep).
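For intuition on why the grep path is near-instant: a literal search can be served by a plain SQL `LIKE` scan over the already-indexed text, with no embedding model to load. A minimal sketch, assuming chunks live in a SQLite table (table and column names here are illustrative, not ccsearch's actual schema):

```python
import sqlite3

# Toy chunks table standing in for the real index.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (session_id TEXT, text TEXT)")
db.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [("940abfde", "Scheduler hasn't picked up the task yet"),
     ("1c2d3e4f", "Fixed issue #64827 in the retry loop")],
)

def grep(query: str) -> list[tuple[str, str]]:
    """Literal substring match, like `ccsearch -g` conceptually does."""
    # ESCAPE lets a literal % or _ in the query match itself.
    pattern = "%" + query.replace("%", r"\%").replace("_", r"\_") + "%"
    return db.execute(
        "SELECT session_id, text FROM chunks WHERE text LIKE ? ESCAPE '\\'",
        (pattern,),
    ).fetchall()

print(grep("#64827"))  # matches the second chunk only
```

Because this is a single table scan with no model in the loop, sub-second latency is unsurprising even for tens of thousands of chunks.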
## Optional: conversation outlines

If you have Ollama installed, you can generate timestamped outlines for each conversation:

```
ollama pull gemma2:9b
ccsearch summarize --daemon   # runs in background, ~30 s per conversation
ccsearch summarize --stop     # pause anytime, resume with --daemon
ccsearch summarize --status   # check progress
```

Outlines appear in search results once generated. Everything works fine without them.
## Keep it up to date

```
ccsearch index                 # re-run manually (incremental, seconds)
ccsearch schedule --install    # or auto-run daily (macOS LaunchAgent / Linux cron)
ccsearch schedule --uninstall  # remove the schedule
```
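On Linux, scheduling is plain cron. With the default `[schedule] hour = 2`, the installed entry would be roughly equivalent to the following crontab line (illustrative; the exact entry the tool writes, including any absolute paths, may differ):

```
0 2 * * * ccsearch index
```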
## Configuration

```
ccsearch config         # see current settings
ccsearch config --edit  # open config.toml in $EDITOR
```

Everything is in one TOML file, auto-generated on first run. Key settings:
| Setting | Default | What it does |
|---|---|---|
| `[embedding] model` | `BAAI/bge-small-en-v1.5` | Embedding model (384-dim, ~130 MB) |
| `[embedding] device` | `auto` | `auto` picks mps/cuda/cpu |
| `[summarization] model` | `gemma2:9b` | Ollama model for outlines |
| `[summarization] re_summarize_threshold` | `10` | Re-summarize after N new messages |
| `[search] top_n` | `10` | Results per query |
| `[schedule] hour` | `2` | Daily auto-index hour (0–23) |
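Assembled from the defaults above, a freshly generated `config.toml` would look roughly like this (section and key names are taken from the table; the generated file may contain additional keys):

```toml
[embedding]
model = "BAAI/bge-small-en-v1.5"
device = "auto"

[summarization]
model = "gemma2:9b"
re_summarize_threshold = 10

[search]
top_n = 10

[schedule]
hour = 2
```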
To swap the embedding model, edit the config and run `ccsearch index --rebuild`. To swap the summary model, edit the config and run `ccsearch summarize --daemon` (the new model is picked up on the next run).
## All commands

| Command | What |
|---|---|
| `ccsearch index [--rebuild]` | Index conversations (incremental by default) |
| `ccsearch "query" [-n N]` | Semantic search |
| `ccsearch -g "text" [-n N]` | Literal grep search (no model, instant) |
| `ccsearch show ID [--query Q] [--context N] [--full]` | View conversation around a match |
| `ccsearch resume ID` | Resume a conversation in Claude Code |
| `ccsearch summarize --daemon/--stop/--status/--all` | Background outline generation via ollama |
| `ccsearch schedule --install/--uninstall` | Daily auto-index (macOS/Linux) |
| `ccsearch config [--edit] [--path]` | View/edit configuration |
| `ccsearch stats` | Index health: files, sessions, chunks, summaries |
Add `--compact` or `--json` to any search for one-line or machine-readable output.
## How it works

- Parses `.jsonl` files from `~/.claude/projects/` — keeps user/assistant text + tool output, drops tool calls, images, thinking blocks
- Chunks long messages into ~1500-char windows with overlap
- Embeds with sentence-transformers (bge-small-en-v1.5, runs on MPS/CUDA/CPU)
- Stores chunks + 384-dim vectors in a single SQLite file using sqlite-vec
- Searches via KNN, groups hits by session, enriches with titles and outlines
- Grep mode (`-g`) bypasses all of the above — just `LIKE '%query%'` on the chunks table
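The chunking step above can be sketched as a sliding window. The ~1500-char window size comes from the list; the overlap value below is an illustrative guess, and the real implementation may split differently (e.g. on message or sentence boundaries):

```python
def chunk(text: str, size: int = 1500, overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows, adjacent windows sharing `overlap` chars.

    `size` mirrors the ~1500-char window from the indexing pipeline;
    `overlap` is illustrative, not ccsearch's actual value.
    """
    if len(text) <= size:
        return [text]
    step = size - overlap  # how far each window advances
    return [text[i:i + size] for i in range(0, len(text) - overlap, step)]
```

The overlap keeps a sentence that straddles a window boundary fully inside at least one chunk, so a query matching that sentence still embeds against a chunk that contains all of it.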
## Tradeoffs

**Good at:** finding conversations by topic, resuming past work, exact identifier lookup (`-g`), staying local and private.

**Less good at:** non-English text (bge-small is English-biased), very large corpora (sqlite-vec caps KNN at k=4096 — fine for ~50K chunks, may need partitioning past ~250K), first-invocation speed (~3 s to load torch).

**Requires:** Python 3.11+, ~300 MB disk (model + index). Ollama is only needed for outlines (optional). Tested on macOS (Apple Silicon + Intel) and should work on Linux. Windows untested (the daemon uses `os.fork`).
## License

MIT
## File details

Details for the file `claude_code_search-0.1.0.tar.gz`.

### File metadata

- Download URL: claude_code_search-0.1.0.tar.gz
- Upload date:
- Size: 34.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `9f26a29076b97ef71cbfb71dc4d8a6023e89035ff2b73f7fdd0915ad2576ec22` |
| MD5 | `7bbf4a9951d8b67e77d4c12213f7466f` |
| BLAKE2b-256 | `f27061d694907d8faec22fe5a665e0b7207f79627b92307f08ebae31200e6b83` |
## File details

Details for the file `claude_code_search-0.1.0-py3-none-any.whl`.

### File metadata

- Download URL: claude_code_search-0.1.0-py3-none-any.whl
- Upload date:
- Size: 31.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `d1dc9cd33b52df2fbe3ea79230db798dd95e71fcc2cc3ab0623f21efdd79c782` |
| MD5 | `8b3ba1b8dc769127a3342997faac5662` |
| BLAKE2b-256 | `a8b6c09c35a4f425f313496ca7e1a957329f5051727045c52d2a354653e44a25` |