
Searchat

Semantic search and RAG-powered Q&A for AI coding agent conversations. Find past solutions by meaning, not just keywords, and ask questions about your conversation history.

Fork of: Process-Point-Technologies-Corporation/searchat

Supported Agents

| Agent | Location | Format |
| --- | --- | --- |
| Claude Code | `~/.claude/projects/**/*.jsonl` | JSONL |
| Mistral Vibe | `~/.vibe/logs/session/*.json` | JSON |
| OpenCode | `~/.local/share/opencode/storage/session/*/*.json` | JSON |
| OpenAI Codex | `~/.codex/sessions/**/rollout-*.jsonl` | JSONL |
| Gemini CLI | `~/.gemini/tmp/<project_hash>/chats/*.json` | JSON |
| Continue | `~/.continue/sessions/*.json` | JSON |
| Cursor | `.../Cursor/User/.../*.vscdb` | SQLite |
| Aider | `.aider.chat.history.md` (set `SEARCHAT_AIDER_PROJECT_DIRS`) | Markdown |

Features

Core Search

  • Hybrid Search — DuckDB FTS keyword + FAISS semantic vectors with RRF fusion
  • Query Synonyms — Automatic expansion of common search terms
  • Cross-Encoder Re-ranking — Optional re-ranking with cross-encoder models
  • Multi-Agent — Search sessions from every supported agent (see table above)
  • Tool Filters — Restrict results to a specific source agent
  • Autocomplete — Smart search suggestions as you type
  • Search History — Persistent search history with LocalStorage

AI-Powered Features

  • RAG Chat — Ask questions about your conversation history with AI-powered answers
  • Session Chat — Multi-turn RAG conversations with session persistence (30-min TTL)
  • Pattern Mining — Extract recurring coding patterns from conversation archives
  • Embedded LLM — Run RAG chat locally with a GGUF model (llama-cpp-python)
  • Semantic Highlights — Optional LLM-generated highlight terms for search results
  • Conversation Similarity — Discover related conversations using semantic similarity
  • Code Extraction — Extract and view code snippets with syntax highlighting

Organization & Management

  • Bookmarks — Save and annotate favorite conversations
  • Saved Queries — Save reusable searches (query + filters + mode)
  • Dashboards — Build dashboards from saved queries (widgets + auto-refresh)
  • Search Analytics — Track search patterns and usage statistics
  • Export — Export conversations in JSON/Markdown/Text (optional PDF + Jupyter notebook)
  • Bulk Export — Export multiple conversations at once
  • Agent Config Export — Generate CLAUDE.md, copilot-instructions, or cursorrules from patterns
  • Pagination — Navigate large result sets efficiently

Data Safety & Performance

  • Live Indexing — Auto-indexes new/modified files (5min debounce)
  • Append-Only — Never deletes existing data, safe for long-term use
  • Backups — Create and restore backups from UI or API
  • Snapshots — Browse backups as read-only datasets ("snapshot" mode)
  • Safe Shutdown — Detects ongoing indexing, prevents data corruption
  • DuckDB Storage — Efficient Parquet-based storage with fast queries
  • FAISS Vectors — High-performance semantic search

User Experience

  • Keyboard Shortcuts — Power user navigation and commands
  • Cross-Platform — Windows, WSL, Linux, macOS
  • Local-First — All data stays on your machine
  • Self-Search — Agents can search their own history via API
  • MCP Server — Let MCP clients (Claude Desktop, Cursor, etc.) query your local index
  • Terminal Resume — Resume conversations directly in terminal

📊 Interactive Documentation

Visual architecture diagrams and flow charts are included in the repository documentation; view them in a browser for the interactive experience.

Quick Start

Install And Run (Standalone)

Install Searchat and build the initial index:

pip install searchat

# Optional: create ~/.searchat/config/settings.toml interactively
python -m searchat.setup

# Build the initial search index
searchat-setup-index

# Start the web server
searchat-web

Open http://localhost:8000

Install From Source

git clone https://github.com/Mathews-Tom/searchat.git
cd searchat
pip install -e .

# First-time setup: build search index
python scripts/setup-index

# Start web server
searchat-web

Open http://localhost:8000

The setup script indexes all conversations from supported agents. On subsequent runs, the web server automatically indexes new conversations via live file watching.

MCP Server (Claude Desktop, Cursor, ...)

pip install "searchat[mcp]"
searchat-mcp

See docs/mcp-setup.md for client configuration.

Available MCP tools:

  • search_conversations
  • get_conversation
  • find_similar_conversations
  • ask_about_history
  • list_projects
  • get_statistics
  • extract_patterns
  • generate_agent_config

Embedded LLM (Local GGUF)

Install the optional embedded dependency and download the default GGUF model:

pip install "searchat[embedded]"
searchat download-model --activate

When llm.default_provider = "embedded", the server will auto-download the default model if embedded_model_path is not set.

Enable Claude Self-Search

Add to ~/.claude/CLAUDE.md:

## Conversation History Search

Search past Claude Code conversations via local API (requires server running).

**Search:**

```bash
curl -s "http://localhost:8000/api/search?q=QUERY&limit=5" | jq '.results[] | {id: .conversation_id, title, snippet}'
```

**Get full conversation:**

```bash
curl -s "http://localhost:8000/api/conversation/CONVERSATION_ID" | jq '.messages[] | {role, content: .content[:500]}'
```

**Ask questions (RAG):**

```bash
curl -X POST "http://localhost:8000/api/chat" \
  -H "Content-Type: application/json" \
  -d '{"query": "How did we implement authentication?", "model_provider": "openai", "model_name": "gpt-4.1-mini"}'
```

**Ask questions (RAG, embedded/local):**

```bash
curl -X POST "http://localhost:8000/api/chat" \
  -H "Content-Type: application/json" \
  -d '{"query": "How did we implement authentication?", "model_provider": "embedded"}'
```

When to use:

  • User asks "did we discuss X before" or "find that conversation about Y"
  • Looking for previous solutions to similar problems
  • Checking how something was implemented in past sessions
  • Asking questions about past work (use RAG chat)

Start server: searchat-web from the searchat directory


See `CLAUDE.example.md` for the full template.

Usage

Web UI

searchat-web

Features:

  • Search Modes: hybrid/semantic/keyword with autocomplete
  • Filters: project, date range, tool (agent), similarity search
  • View Conversations: Full message history with code extraction
  • Bookmarks: Save and annotate favorite conversations
  • RAG Chat: Ask questions about your conversation history
  • Analytics: View search patterns and statistics
  • Saved Queries: Save and re-run complex searches
  • Dashboards: Create dashboards and widgets from saved queries
  • Export: Download conversations in multiple formats
  • Backups: Create and restore backups (left sidebar)
  • Snapshots: Switch between active index and read-only backup snapshots
  • Keyboard Shortcuts: Press ? to see all shortcuts
  • Terminal Resume: Resume conversations in terminal
  • Helpful Tips: Search tips + API integration sidebars

CLI

searchat  # interactive mode

# Download a default embedded GGUF model and update ~/.searchat/config/settings.toml
searchat download-model --activate

# Build the initial index (first-time setup)
searchat-setup-index

API

Search & Discovery

# Search
curl "http://localhost:8000/api/search?q=authentication&mode=hybrid&limit=10"

# Search with tool filter (claude, vibe, opencode)
curl "http://localhost:8000/api/search?q=authentication&tool=claude&limit=10"

# Autocomplete suggestions
curl "http://localhost:8000/api/search/suggestions?q=auth&limit=5"

# Find similar conversations
curl "http://localhost:8000/api/conversation/{conversation_id}/similar?limit=5"

# Optional: request highlight terms for the UI (LLM)
curl "http://localhost:8000/api/search?q=auth&mode=hybrid&highlight=true&highlight_provider=ollama"

# Search with pagination
curl "http://localhost:8000/api/search?q=api&limit=20&offset=0"
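The same endpoints are easy to drive from Python. A minimal sketch of a paginating client for `/api/search`, assuming the response body carries a `results` list (as the `jq` examples above suggest); the HTTP call is injected via `fetch` so you can use any client:

```python
from urllib.parse import urlencode

BASE = "http://localhost:8000"  # default searchat server address

def search_pages(query, page_size=20, max_pages=5, fetch=None):
    """Yield batches of results from /api/search using limit/offset pagination.

    `fetch` takes a URL and returns the decoded JSON dict -- inject your HTTP
    client of choice (requests, httpx, urllib). Stops on an empty or short page.
    """
    for page in range(max_pages):
        params = urlencode({"q": query, "limit": page_size, "offset": page * page_size})
        data = fetch(f"{BASE}/api/search?{params}")
        results = data.get("results", [])
        if not results:
            return
        yield results
        if len(results) < page_size:  # short page means we reached the end
            return
```

With `requests` installed, `fetch` could simply be `lambda url: requests.get(url, timeout=10).json()`.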

Conversations

# Get conversation
curl "http://localhost:8000/api/conversation/{conversation_id}"

# List all conversations
curl "http://localhost:8000/api/conversations/all?limit=50"

# Extract code snippets
curl "http://localhost:8000/api/conversation/{conversation_id}/code"

# Code highlighting (Pygments)
curl -X POST "http://localhost:8000/api/code/highlight" \
  -H "Content-Type: application/json" \
  -d '{"blocks":[{"code":"print(123)","language":"python","language_source":"fence"}]}'

# Conversation diff
curl "http://localhost:8000/api/conversation/{conversation_id}/diff?target_id={other_conversation_id}"

# Export conversation
curl "http://localhost:8000/api/conversation/{conversation_id}/export?format=markdown"

# Bulk export
curl -X POST "http://localhost:8000/api/conversations/bulk-export" \
  -H "Content-Type: application/json" \
  -d '{"conversation_ids": ["id1", "id2"], "format": "json"}'

# Resume in terminal
curl -X POST "http://localhost:8000/api/resume?conversation_id={id}"

Bookmarks

# List bookmarks
curl "http://localhost:8000/api/bookmarks"

# Add bookmark
curl -X POST "http://localhost:8000/api/bookmarks" \
  -H "Content-Type: application/json" \
  -d '{"conversation_id": "abc123", "notes": "Important auth solution"}'

# Remove bookmark
curl -X DELETE "http://localhost:8000/api/bookmarks/{conversation_id}"

RAG Chat

# Ask question about conversation history
curl -X POST "http://localhost:8000/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How did we implement authentication?",
    "model_provider": "openai",
    "model_name": "gpt-4.1-mini",
    "session_id": "optional-session-id"
  }'
# Response includes X-Session-Id header for session tracking

# Non-streaming RAG response with citations
curl -X POST "http://localhost:8000/api/chat-rag" \
  -H "Content-Type: application/json" \
  -d '{"query":"Summarize how backups work","model_provider":"ollama","model_name":"ollama/gemma3"}'

# Streaming response (default)
curl -X POST "http://localhost:8000/api/chat" \
  -H "Content-Type: application/json" \
  -d '{"query": "Explain the API design", "model_provider": "ollama", "model_name": "llama3"}' \
  --no-buffer
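For multi-turn session chat from Python, thread the `X-Session-Id` response header back into the next request. A sketch with the HTTP call injected via `post` (the payload fields mirror the curl examples above; error handling and streaming are omitted):

```python
def chat(query, post, session_id=None, provider="openai", model="gpt-4.1-mini"):
    """One RAG chat turn against /api/chat.

    `post` takes (url, json_payload) and returns (response_headers, body_text);
    inject whatever HTTP client you use. Returns the server's X-Session-Id so
    follow-up turns can continue the same session.
    """
    payload = {"query": query, "model_provider": provider, "model_name": model}
    if session_id is not None:
        payload["session_id"] = session_id
    headers, text = post("http://localhost:8000/api/chat", payload)
    return headers.get("X-Session-Id"), text

# Usage: thread the returned id into the next turn
# sid, answer = chat("How did we implement auth?", post=my_post)
# sid, more   = chat("Which file was that in?", post=my_post, session_id=sid)
```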

Analytics & Statistics

# Index statistics
curl "http://localhost:8000/api/statistics"

# Search analytics
curl "http://localhost:8000/api/stats/analytics/summary?days=7"

# Top queries / trends
curl "http://localhost:8000/api/stats/analytics/top-queries?limit=10&days=30"
curl "http://localhost:8000/api/stats/analytics/trends?days=30"

# List projects
curl "http://localhost:8000/api/projects"

# Project summary
curl "http://localhost:8000/api/projects/summary"

Saved Queries & Dashboards

# Saved queries
curl "http://localhost:8000/api/queries"

# Dashboards
curl "http://localhost:8000/api/dashboards"

Tech Docs (Optional)

Requires export.enable_tech_docs=true.

curl -X POST "http://localhost:8000/api/docs/summary" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Searchat Notes",
    "format": "markdown",
    "sections": [
      {"name": "Backups", "query": "backup restore", "mode": "hybrid", "filters": {"project": "myapp"}}
    ]
  }'

Snapshots (Read-only)

# Search within a backup snapshot
curl "http://localhost:8000/api/search?q=auth&snapshot=backup_YYYYMMDD_HHMMSS"

Indexing & Management

# Watcher status
curl "http://localhost:8000/api/watcher/status"

# Index missing conversations (append-only)
curl -X POST "http://localhost:8000/api/index_missing"

# Safe shutdown (checks for ongoing indexing)
curl -X POST "http://localhost:8000/api/shutdown"

# Force shutdown (override safety check)
curl -X POST "http://localhost:8000/api/shutdown?force=true"

Backups

# Create full backup (plaintext)
curl -X POST "http://localhost:8000/api/backup/create"

# Create full backup (encrypted; backups only)
# Requires: SEARCHAT_BACKUP_KEY_B64 (base64-encoded 32 bytes)
curl -X POST "http://localhost:8000/api/backup/create?encrypted=true"

# Create incremental backup (delta) based on a parent backup
curl -X POST "http://localhost:8000/api/backup/incremental/create?parent=backup_YYYYMMDD_HHMMSS"

# Create encrypted incremental backup (parent must also be encrypted)
curl -X POST "http://localhost:8000/api/backup/incremental/create?parent=backup_YYYYMMDD_HHMMSS&encrypted=true"

# List backups
curl "http://localhost:8000/api/backup/list"

# Validate a backup chain and (optionally) verify hashes
curl "http://localhost:8000/api/backup/validate/backup_YYYYMMDD_HHMMSS?verify_hashes=true"

# Inspect ancestry chain (base -> target)
curl "http://localhost:8000/api/backup/chain/backup_YYYYMMDD_HHMMSS"

# Restore backup
curl -X POST "http://localhost:8000/api/backup/restore?backup_name=backup_YYYYMMDD_HHMMSS"

# Delete backup
curl -X DELETE "http://localhost:8000/api/backup/delete/backup_YYYYMMDD_HHMMSS"

Notes:

  • Snapshot browsing (?snapshot=...) only supports snapshot-browsable backups (plaintext full backups). Incremental and encrypted backups are restore-only.
  • Incremental backups support chained deltas, with a max total chain length of 10.
  • Generate a new encryption key (example): python -c "import os,base64; print(base64.b64encode(os.urandom(32)).decode())"

Utilities

# Add missing conversations to index
python scripts/index-missing

# Initial setup (interactive, safe options)
python scripts/setup-index

# Convert Vibe plaintext history to searchable sessions
python utils/vibe_converter.py

As Library

from searchat.core.search_engine import SearchEngine
from searchat.config.settings import Config

config = Config.load()
engine = SearchEngine(config.paths.search_directory, config)

results = engine.search("python async", mode="hybrid")
for r in results.results[:5]:
    print(f"{r.title}: {r.score:.3f}")

Architecture

Code Organization:

  • src/searchat/api/ - FastAPI app with 14 modular routers (50+ endpoints)
  • src/searchat/core/ - Core indexing and search logic
  • src/searchat/services/ - Business services (chat, bookmarks, analytics, backup)
  • src/searchat/web/ - Modular frontend (HTML + CSS modules + ES6 JS)
  • tests/ - Comprehensive test suite (840+ tests)

Data Flow:

~/.claude/projects/**/*.jsonl     (source conversations)
~/.vibe/logs/session/*.json
~/.local/share/opencode/.../*.json
        │
        ▼ index_append_only()
        │
~/.searchat/data/
├── conversations/*.parquet       (DuckDB queryable)
└── indices/
    ├── embeddings.faiss          (semantic vectors)
    ├── embeddings.metadata.parquet
    └── index_metadata.json

Search Flow:

  1. Query → DuckDB FTS keyword search + FAISS semantic search
  2. Results merged via Reciprocal Rank Fusion
  3. Hybrid ranking returns best of both approaches
  4. Optional: Cross-encoder re-ranking of top results
  5. Query synonym expansion for keyword matching
  6. Optional: Find similar conversations via vector similarity
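The fusion step (2) can be sketched as plain Reciprocal Rank Fusion; `k=60` is the constant from the original RRF formulation, and searchat's actual weighting may differ:

```python
def rrf_fuse(keyword_ids, semantic_ids, k=60):
    """Score each document by sum(1 / (k + rank)) over every ranked list it
    appears in, then return doc ids sorted by fused score (highest first)."""
    scores = {}
    for ranking in (keyword_ids, semantic_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A doc ranked well by both FTS and FAISS beats one ranked well by only one
fused = rrf_fuse(["a", "b"], ["a", "c"])  # "a" leads both lists, so it wins
```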

RAG Flow:

  1. User question → Search for relevant conversations
  2. Top results used as context
  3. LLM generates answer with conversation references
  4. Streaming response to client
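Steps 1-3 amount to packing the top search hits into a prompt with numbered excerpts the model can cite. A minimal sketch (the field names and prompt wording are illustrative, not searchat's actual template):

```python
def build_rag_prompt(question, hits, max_ctx_chars=4000):
    """Assemble an LLM prompt from search hits, numbering each excerpt so the
    answer can reference [1], [2], ... Stops adding context at the budget."""
    context, used = [], 0
    for i, hit in enumerate(hits, start=1):
        entry = f"[{i}] {hit['title']}:\n{hit['snippet']}"
        if used + len(entry) > max_ctx_chars:
            break
        context.append(entry)
        used += len(entry)
    return (
        "Answer using only the conversation excerpts below; cite sources as [n].\n\n"
        + "\n\n".join(context)
        + f"\n\nQuestion: {question}\nAnswer:"
    )
```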

Live Watching:

  • watchdog monitors conversation directories
  • New files → indexed immediately
  • Modified files → re-indexed after 5min debounce (configurable)
  • index_append_only() adds to existing index
  • Never deletes existing data
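The modification debounce above can be pictured as a quiet-period tracker: a file is re-indexed only once no new write has arrived for the full window. A standalone sketch of that logic (not searchat's internal watcher code):

```python
import time

class DebounceTracker:
    """Track file-modified events; report files 'due' for re-indexing only
    after a full quiet period with no further changes (default 5 minutes)."""

    def __init__(self, debounce_seconds=300):
        self.debounce = debounce_seconds
        self.last_event = {}  # path -> timestamp of most recent change

    def on_modified(self, path, now=None):
        # Each new write resets the quiet period for that file
        self.last_event[path] = time.time() if now is None else now

    def due(self, now=None):
        # Return (and forget) every path whose quiet period has elapsed
        now = time.time() if now is None else now
        ready = [p for p, t in self.last_event.items() if now - t >= self.debounce]
        for p in ready:
            del self.last_event[p]
        return ready

tracker = DebounceTracker(debounce_seconds=300)
tracker.on_modified("session.jsonl", now=0)
tracker.on_modified("session.jsonl", now=100)  # resets the quiet period
tracker.due(now=350)  # [] -- only 250s of quiet so far
tracker.due(now=400)  # ["session.jsonl"] -- 300s of quiet reached
```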

Documentation:

  • docs/features.md - Complete feature list and descriptions
  • docs/architecture.md - System design and components
  • docs/api-reference.md - Complete API endpoint documentation
  • docs/terminal-launching.md - Platform-specific terminal launching

Configuration

Create ~/.searchat/config/settings.toml:

[paths]
search_directory = "~/.searchat"
claude_directory_windows = "~/.claude/projects"
claude_directory_wsl = "//wsl$/Ubuntu/home/{username}/.claude/projects"

[indexing]
batch_size = 1000
auto_index = true
reindex_on_modification = true  # Re-index modified conversations
modification_debounce_minutes = 5  # Wait time before re-indexing
enable_connectors = true
enable_adaptive_indexing = true

[search]
default_mode = "hybrid"
max_results = 100
snippet_length = 200

[embedding]
model = "all-MiniLM-L6-v2"
batch_size = 32
device = "auto"  # auto|cuda|mps|cpu

[llm]
default_provider = "ollama"
openai_model = "gpt-4.1-mini"
ollama_model = "ollama/gemma3"

# Embedded (local GGUF via llama-cpp-python)
embedded_model_path = ""
embedded_n_ctx = 4096
embedded_n_threads = 0
embedded_auto_download = true
embedded_default_preset = "qwen2.5-coder-1.5b-instruct-q4_k_m"

[chat]
enable_rag = true
enable_citations = true

[analytics]
enabled = false
retention_days = 30

[export]
enable_ipynb = false
enable_pdf = false
enable_tech_docs = false

[dashboards]
enabled = true

[snapshots]
enabled = true

[performance]
memory_limit_mb = 3000
query_cache_size = 100

[reranking]
enabled = false
model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
top_k = 50

[server]
cors_origins = ["http://localhost:8000", "http://127.0.0.1:8000"]

Or use environment variables:

export SEARCHAT_DATA_DIR=~/.searchat
export SEARCHAT_PORT=8000
export SEARCHAT_EMBEDDING_MODEL=all-MiniLM-L6-v2
export SEARCHAT_REINDEX_ON_MODIFICATION=true
export SEARCHAT_MODIFICATION_DEBOUNCE_MINUTES=5
export SEARCHAT_OPENCODE_DATA_DIR=~/.local/share/opencode
export OPENAI_API_KEY=sk-...  # For RAG chat
export OLLAMA_BASE_URL=http://localhost:11434

Requirements

  • Python 3.10+
  • ~2-3GB RAM (embeddings model + FAISS index)
  • ~10MB disk per 1K conversations
  • Optional: OpenAI API key or Ollama for RAG chat

Dependencies

| Package | Purpose |
| --- | --- |
| sentence-transformers | Embeddings (all-MiniLM-L6-v2) |
| faiss-cpu | Vector similarity search |
| pyarrow | Parquet storage |
| duckdb | SQL queries on Parquet |
| fastapi + uvicorn | Web API |
| watchdog | File system monitoring |
| litellm | Multi-provider LLM interface |
| rich | CLI formatting |

Safety

Append-only indexing: Never deletes existing data.

indexer.index_append_only(file_paths)  # Safe: only adds new data
indexer.index_all()                     # Blocked if index exists
indexer.index_all(force=True)           # Explicit override required

Safe shutdown: Detects ongoing indexing operations.

# Check status, wait if indexing in progress
curl -X POST "http://localhost:8000/api/shutdown"

# Override safety check (may corrupt data)
curl -X POST "http://localhost:8000/api/shutdown?force=true"

Backups: Create backups before risky operations.

# A pre-restore backup is created automatically before any restore operation
curl -X POST "http://localhost:8000/api/backup/restore?backup_name=backup_20260129_120000"

Protects against:

  • Data loss from deleted/moved source files
  • Corrupted Parquet/FAISS files during indexing
  • Inconsistent metadata from interrupted operations

Performance

| Metric | Value |
| --- | --- |
| Search latency | <100ms (hybrid), <50ms (semantic), <30ms (keyword) |
| Filtered queries | <20ms (DuckDB predicate pushdown) |
| Index build | ~60s per 100K conversations |
| Embedding | Batched (CPU: 0.1s/conv, GPU: 0.008s/conv) |
| Memory | ~2-3GB |
| Startup | <3s |
| RAG chat response | <2s (with OpenAI), <5s (with Ollama) |
| Code extraction | <50ms per conversation |
| Similarity search | <100ms (FAISS nearest neighbors) |

Testing

pytest                          # Run the full test suite
pytest tests/api/              # Run API tests only
pytest -v                      # Verbose output
pytest -k test_search          # Run specific tests
pytest --cov=searchat          # Coverage report
pytest --cov-report=html       # HTML coverage report

Test Coverage:

  • 840+ tests (API, UI contract tests, unit tests, perf gates)
  • ~5,900 lines of test code
  • Comprehensive coverage of all features
  • API endpoint tests, unit tests, integration tests

Troubleshooting

Port in use:

SEARCHAT_PORT=8001 searchat-web

No conversations found:

ls ~/.claude/projects/  # Verify conversations exist

WSL not tracked: Configure claude_directory_wsl in ~/.searchat/config/settings.toml:

claude_directory_wsl = "//wsl.localhost/Ubuntu/home/username/.claude/projects"

Missing conversations after setup:

python scripts/index-missing  # Index files not yet in search index

Slow on WSL: Run from Windows Python or move repo to WSL filesystem (~/projects/).

Import errors:

pip install -e . --force-reinstall

Empty environment variables override config: Remove empty values from ~/.searchat/config/.env or set proper values.

RAG chat not working:

# For OpenAI
export OPENAI_API_KEY=sk-...

# For Ollama (local)
ollama serve  # Start Ollama server
export OLLAMA_BASE_URL=http://localhost:11434

Fork Enhancements

This fork adds significant new features beyond the original:

  • RAG Chat — AI-powered Q&A over conversation history
  • Bookmarks System — Save and organize favorite conversations
  • Search Analytics — Track and analyze search patterns
  • Conversation Similarity — Discover related conversations
  • Code Extraction — Extract code snippets with syntax highlighting
  • Saved Queries — Reusable searches with stored filters
  • Dashboards — Builder UI + widgets rendered from saved queries
  • Snapshots — Browse backups as read-only datasets
  • Export Features — JSON/Markdown/Text exports (optional PDF + Jupyter)
  • Bulk Export — Export multiple conversations at once
  • Pagination — Efficient navigation of large result sets
  • Autocomplete — Smart search suggestions
  • Search History — Persistent search history
  • Keyboard Shortcuts — Power user shortcuts
  • OpenCode Support — Support for OpenCode sessions
  • Tool Filtering — Filter by specific agent
  • Modern Typing — Python 3.12 type hints throughout
  • Session Chat — Multi-turn RAG with session persistence
  • Pattern Mining — Extract recurring patterns via LLM
  • Agent Config Export — Generate agent config from patterns
  • DuckDB FTS — Replaced BM25 with DuckDB full-text search
  • Cross-Encoder Re-ranking — Optional result re-ranking
  • Query Synonyms — Automatic synonym expansion
  • CORS Hardening — Configurable CORS origins (default localhost)

See docs/features.md for complete feature documentation.

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT
