Universal knowledge base with Qdrant for Claude Code integration

Claude KB

Hybrid semantic + keyword search over Claude Code conversation history, exposed as a CLI and an MCP server.

A personal research surface for retrieval-quality work over Claude Code conversation history. The aim is to make hybrid retrieval over a developer's own chat archive measurable and improvable, not to be production infrastructure. See Retrieval Evaluation for the harness and the methodology used to assess it; see Architecture for how the pieces fit together.

Background
Install
Usage
Architecture
Retrieval Evaluation
Chunking
Configuration
Development
Security
API
Maintainer
Contributing
License

Background

Claude Code already records every session as JSONL under ~/.claude/projects/. That archive grows fast and becomes hard to search with grep alone, especially across projects. Claude KB pipes that archive into a Qdrant collection with hybrid (dense + sparse) retrieval and exposes it back to Claude Code as an MCP server, so the agent can search its own history without leaving the editor.
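
As a rough illustration of what "records every session as JSONL" means for a reader of the archive, the sketch below walks the directory and yields one parsed record per line. It is not the project's importer: the function names are hypothetical, and skipping blank or corrupt lines is an assumption about sensible behaviour rather than something documented here.

```python
import json
from pathlib import Path
from typing import Iterator


def default_archive_root() -> Path:
    """Location named in the paragraph above: ~/.claude/projects/."""
    return Path.home() / ".claude" / "projects"


def iter_session_records(root: Path) -> Iterator[tuple[Path, dict]]:
    """Yield (session_file, record) for every JSON object under root/*/*.jsonl.

    Hypothetical helper: blank, corrupt, or non-object lines are skipped.
    """
    for session_file in sorted(root.glob("*/*.jsonl")):
        with session_file.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue  # blank line
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # truncated or corrupt line
                if isinstance(record, dict):
                    yield session_file, record
```

Calling `iter_session_records(default_archive_root())` would stream every record across all projects, which is the cross-project view that grep makes awkward.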

Scope and non-goals:

Scope: a measurable retrieval surface over one developer's local Claude Code archive. Optimised for a single laptop and a local Qdrant instance.
Non-goals: multi-tenant deployment, hosted SaaS, ingesting non-Claude-Code corpora, replacing a general-purpose RAG framework.
Status: alpha. The author uses it daily; assume rough edges and expect to read source.

Install

Prerequisites: Python 3.13+, uv, Docker (for the local Qdrant instance), Claude Code (for the MCP integration).

# 1. Install the CLI
uv tool install claude-kb

# 2. Start a local Qdrant
docker compose up -d

# 3. Import your Claude Code conversation history
kb import-claude-code-chats

# 4. Register the MCP server with Claude Code
claude mcp add -s user kb -- kb mcp

After step 4, Claude Code has access to two tools, kb_search and kb_get, against your imported history. The first import embeds every message and may take several minutes; subsequent imports are incremental and only embed new messages.

Updating

uv tool upgrade claude-kb
kb --version

Usage

CLI

kb search "recency boost implementation"
kb search "error handling" --project claude-kb --from 2026-01-01 --limit 5
kb get <message-uuid>
kb get-thread <message-uuid> --depth 3
kb status
kb ai           # LLM-optimized command schema

Full flag list per command: kb <command> --help.

MCP

Once registered with claude mcp add, Claude Code can call:

kb_search(query, ...) - hybrid search; optional filters for project, conversation, role, date range, score threshold; optional grouping by conversation.
kb_get(message_id | conversation_id, ...) - retrieve a single message, a thread context, or restore a full conversation transcript.

Streamable HTTP transport is also supported:

kb mcp --transport http --port 3000

See docs/mcp-api.md for the full schema reference.

Architecture

~/.claude/projects/*/<session>.jsonl
    -> parse  (import_claude.py)
    -> classify content_type (prose/tool_use/tool_result/thinking/mixed)
    -> embed  dense BGE-base 768d
    -> Qdrant collection: conversations_hybrid
    -> retrieve  query_points(dense)
       + server-side filters: project, conversation_id, role, date range,
         primary_content_type (default-deny on tool_result + thinking)
       + score_threshold
    -> post-process  recency boost / compact / grouping
    -> CLI (kb ...)  |  MCP server (kb_search, kb_get)

One Qdrant point per Claude Code message; no sub-message chunking. Dense retrieval uses BAAI/bge-base-en-v1.5 (768d, L2-normalised). Recent messages are boosted post-retrieval with +0.2 * exp(-age / 1 week). Tool-result and thinking blocks are excluded from search results by default - both are dominant noise sources in code-conversation corpora; users opt in via include_tool_results=True / include_thinking=True when needed.
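
The recency boost formula above can be sketched in a few lines. This is a minimal illustration, assuming the boost is added to the raw similarity score and that "1 week" is the time constant of the exponential; the function names and the clamping of future timestamps are assumptions, not the project's code.

```python
import math
from datetime import datetime, timedelta, timezone

BOOST_WEIGHT = 0.2          # the 0.2 factor from the formula above
DECAY = timedelta(weeks=1)  # the "1 week" time constant


def recency_boost(timestamp: datetime, now: datetime) -> float:
    """Return 0.2 * exp(-age / 1 week); future timestamps are clamped to age 0."""
    age = max(now - timestamp, timedelta(0))
    return BOOST_WEIGHT * math.exp(-age / DECAY)


def rerank(hits: list[tuple[str, float, datetime]], now: datetime) -> list[tuple[str, float]]:
    """Add the boost to each (id, score, timestamp) hit and sort by boosted score."""
    boosted = [(hit_id, score + recency_boost(ts, now)) for hit_id, score, ts in hits]
    return sorted(boosted, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(recency_boost(now - timedelta(weeks=1), now))  # about 0.0736
```

With these constants a message from this minute gains the full 0.2, a week-old message gains about 0.074, and a month-old message gains less than 0.01, so the boost mostly reorders near-ties among recent results.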

The collection schema also reserves a sparse vector slot, but the production search path is dense-only. The eval (docs/retrieval-experiments-2026-05.md) showed every hybrid configuration tested (BM25 fusion, bge-m3, Qwen3-Embedding-8B) regresses Recall@10 by 0.075-0.22 on this corpus shape; sparse vectors are stored only to keep the door open for future experiments.

Full diagram and per-stage notes: docs/architecture.md.
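
The classify stage in the pipeline above can be pictured roughly as follows. This sketch assumes message content is either a plain string or a list of blocks carrying a "type" field (text, tool_use, tool_result, thinking), and that a message with more than one kind of block is labelled mixed. Both the mapping and the tie-breaking rules are assumptions for illustration; the actual classifier may differ.

```python
# Hypothetical mapping from block "type" to the content_type labels listed above.
BLOCK_TO_CONTENT_TYPE = {
    "text": "prose",
    "tool_use": "tool_use",
    "tool_result": "tool_result",
    "thinking": "thinking",
}


def classify_content_type(content) -> str:
    """Return prose / tool_use / tool_result / thinking / mixed for one message."""
    if isinstance(content, str):
        return "prose"  # plain-string content is treated as prose
    kinds = {
        BLOCK_TO_CONTENT_TYPE.get(block.get("type"))
        for block in content
        if isinstance(block, dict)
    }
    kinds.discard(None)  # ignore block types this sketch does not know
    if len(kinds) == 1:
        return kinds.pop()
    # Several kinds -> mixed; nothing recognisable -> fall back to prose (assumption).
    return "mixed" if kinds else "prose"
```

A label computed this way is what a payload tag such as primary_content_type would carry, which is what makes the default-deny filter on tool_result and thinking possible at query time.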

Retrieval Evaluation

Measured on the maintainer's corpus (~690k messages, 20 hand-graded queries across five categories, conversation-level grading with cross-phrasing to defeat the selection bias of self-grading). At --min-score 0.0, k=10:

Mode	Recall@10	MRR@10
dense-only, content-type filter on (default)	0.368	0.397
dense-only, recency boost on	0.368	0.440
dense-only, filter off	0.361	0.389
hybrid (RRF of dense + BM25)	regressed -0.075 vs dense-only on the 28-query expanded test	—
sparse-only (BM25)	regressed -0.18 vs dense-only on the 28-query expanded test	—

Five hybrid and encoder-replacement experiments were run in this work (BM25 hyperparameter tuning, RRF prefetch pool size, bge-m3 dense, bge-m3 sparse, Qwen3-Embedding-8B). All but one (RRF prefetch_factor=30, +0.024 MRR) regressed Recall@10 by 0.075-0.22 versus dense-only BGE-base. The corpus shape (short-form English code-conversation messages, median ~47 words per document) is the constraint, not the encoder. Full table, methodology, and per-experiment failure analysis: docs/retrieval-experiments-2026-05.md.
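
For readers unfamiliar with the RRF mode in the table, reciprocal rank fusion merges several ranked lists by summing 1 / (k + rank) for each document. The sketch below is the textbook form with the conventional constant k = 60. It is not taken from the project, and it does not model the prefetch_factor setting mentioned above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank), rank starting at 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    dense = ["a", "b", "c"]   # hypothetical dense ranking
    sparse = ["b", "d", "a"]  # hypothetical BM25 ranking
    print([doc_id for doc_id, _ in reciprocal_rank_fusion([dense, sparse])])
```

Because fusion rewards documents that appear in both lists, a noisy sparse ranking can push a strong dense-only hit down, which is one plausible reading of the regression reported above.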

The largest measured improvement in this work was a server-side filter that excludes tool_result and thinking blocks from search results by default (controlled by the primary_content_type payload tag). In an earlier iteration, on a related corpus split, it lifted dense-only Recall@10 from 0.352 to 0.502. The current 20-query eval is published in docs/evaluation.md with per-query and per-category breakdowns.
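
A default-deny filter of this kind can be expressed as a Qdrant filter in its JSON form. The sketch below builds that dictionary from the two opt-in flags named in Architecture; the helper name is hypothetical, and how the project actually constructs or passes its filter is not shown here.

```python
from typing import Optional


def build_content_type_filter(
    include_tool_results: bool = False,
    include_thinking: bool = False,
) -> Optional[dict]:
    """Return a Qdrant-style filter excluding the denied content types, or None."""
    denied = []
    if not include_tool_results:
        denied.append("tool_result")
    if not include_thinking:
        denied.append("thinking")
    if not denied:
        return None  # caller opted in to everything; no filter needed
    # must_not + match.any: drop points whose tag is any of the denied values.
    return {
        "must_not": [
            {"key": "primary_content_type", "match": {"any": denied}}
        ]
    }


if __name__ == "__main__":
    print(build_content_type_filter())
```

Applying the exclusion server-side means the top-k slots are filled only with eligible messages, rather than fetching k results and discarding noise afterwards.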

Harness: scripts/run_eval.py; query set: tests/eval/queries.jsonl. The harness will not fabricate metrics; if queries are ungraded it prints ungraded, N queries pending and exits 0.
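
For orientation, Recall@10 and MRR@10 in their usual form can be computed as below. This is a sketch, not scripts/run_eval.py: it assumes each query has a ranked list of conversation IDs and a hand-graded set of relevant IDs, and it returns None when nothing is graded, mirroring the "will not fabricate metrics" behaviour. All function names are hypothetical.

```python
from typing import Optional


def recall_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the relevant IDs that appear in the top k results."""
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """1 / rank of the first relevant ID within the top k, or 0.0 if none."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: list[tuple[list[str], set[str]]], k: int = 10) -> Optional[dict]:
    """Average both metrics over graded queries; ungraded queries are only counted."""
    graded = [(ranked, relevant) for ranked, relevant in runs if relevant]
    if not graded:
        return None  # nothing graded: report no numbers rather than invent them
    n = len(graded)
    return {
        "recall": sum(recall_at_k(r, rel, k) for r, rel in graded) / n,
        "mrr": sum(reciprocal_rank_at_k(r, rel, k) for r, rel in graded) / n,
        "graded": n,
        "ungraded": len(runs) - n,
    }
```

Recall@10 is insensitive to order within the top 10 while MRR@10 rewards ranking the first relevant hit higher, which is consistent with the table showing the recency boost changing MRR@10 but not Recall@10.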

Adjacent measurements: MCP response token reduction (29% mean / 86% peak from compact mode, see CHANGELOG.md), restore-mode unit tests (tests/test_search_service.py), content-type classifier tests (tests/test_content_type.py).

Chunking

One Claude Code message, one Qdrant point. No sub-message chunking. The choice is load-bearing for the rest of the design (point IDs are message UUIDs, kb_get round-trips with kb_search), and it accepts known tradeoffs (SPLADE input truncated to 8000 chars per message; long-form prose recall is weaker than a sliding-window approach would deliver).

Why this is the right unit, what we lose, alternatives considered, and when to revisit: docs/chunking.md.
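
The one-message-one-point rule can be pictured with a small sketch. The field names (uuid, text, sparse_text) are hypothetical and chosen for illustration only; the two properties it shows, the point ID being the message UUID and the 8000-character cap on sparse input, are the ones stated above.

```python
SPARSE_INPUT_MAX_CHARS = 8000  # truncation limit quoted in this section


def message_to_point(message: dict) -> dict:
    """Map one message to one point; field names here are illustrative assumptions."""
    text = message["text"]
    return {
        "id": message["uuid"],  # point ID == message UUID, so get/search round-trip
        "text": text,           # whole message, no sub-message chunks
        "sparse_text": text[:SPARSE_INPUT_MAX_CHARS],  # long messages lose their tail
    }


if __name__ == "__main__":
    point = message_to_point({"uuid": "123e4567-e89b-12d3-a456-426614174000", "text": "x" * 9000})
    print(point["id"], len(point["sparse_text"]))
```

The tradeoff is visible in the last field: anything past the cap is invisible to the sparse encoder, which a sliding-window scheme would avoid at the cost of IDs that no longer map one-to-one onto messages.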

Configuration

Environment variables (or a .env file in the working directory):

Variable	Default	Purpose
`QDRANT_URL`	`http://localhost:6333`	Qdrant endpoint. Override for remote clusters.
`QDRANT_API_KEY`	unset	API key for Qdrant Cloud.
`EMBEDDING_MODEL`	`BAAI/bge-base-en-v1.5`	HuggingFace model name for the dense encoder.

Apple Silicon (MPS), CUDA, and CPU are auto-detected by sentence-transformers. The dense encoder is the production retrieval signal; the collection schema reserves a sparse vector slot but the production search path does not query it (see Retrieval Evaluation for why).
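
The variables in the table resolve to settings roughly as sketched below. This is an illustration using only the standard library: the Settings class and load_settings function are hypothetical, and .env file loading is deliberately left out.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    qdrant_api_key: Optional[str]
    embedding_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three documented variables, falling back to the documented defaults."""
    return Settings(
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        qdrant_api_key=env.get("QDRANT_API_KEY") or None,  # unset or empty -> None
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-base-en-v1.5"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment as a parameter keeps the function easy to test; in normal use the default reads the process environment.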

Development

git clone https://github.com/tenequm/claude-kb.git
cd claude-kb
uv sync --extra dev
just check        # ty type-check + ruff lint + format
uv run pytest -q  # unit tests

Pre-commit is configured via .pre-commit-config.yaml (ruff, secrets scan, basic hygiene).

Security

Vulnerability reporting policy: SECURITY.md. The MCP server binds to 127.0.0.1 by default, queries a local Qdrant instance only, and exposes only read operations.

API

The MCP server exposes two tools. Both are read-only, idempotent, and run entirely against a local Qdrant instance.

Tool	Purpose	Key parameters
`kb_search`	Hybrid semantic + keyword search across all imported messages.	`query`, `limit`, `project`, `conversation_id`, `role`, `from_date`, `to_date`, `min_score`, `boost_recent`, `group_by_conversation`
`kb_get`	Retrieve a single message, a thread context, or restore a full conversation transcript.	`message_id`, `conversation_id`, `up_to`, `context_depth`, `max_messages`

Output models, filter application order, error modes, and non-obvious filter semantics are documented in docs/mcp-api.md. Pydantic models live in src/claude_kb/models.py.

Maintainer

Misha Kolesnik - @tenequm - misha@kolesnik.io

Contributing

Issues and PRs are welcome at https://github.com/tenequm/claude-kb. Commit messages follow Conventional Commits (feat:, fix:, docs:, chore:, refactor:, test:). Please run just check and uv run pytest -q before opening a PR.

License
