Ingest Claude AI conversation exports into a local ChromaDB vector database for semantic search
claude-chroma
A Python CLI tool that ingests Claude AI conversation exports into a local ChromaDB vector database, making your entire conversation history semantically searchable. Pair it with the official chroma-mcp server to give Claude Desktop or Claude Code access to everything you've ever discussed.
Prerequisites
- Python 3.11+
- uv
Quickstart
# Clone and install
git clone https://github.com/danwahl/claude-chroma.git
cd claude-chroma
uv sync
# Drop your Claude export(s) into claude_data/
cp ~/Downloads/conversations.json claude_data/
# Ingest into ChromaDB
uv run claude-chroma ingest
# Verify
uv run claude-chroma stats
# Test a search
uv run claude-chroma search "free will compatibilism"
Connecting to Claude Desktop / Claude Code
Add the official chroma-mcp server to your claude_desktop_config.json:
{
"mcpServers": {
"chroma": {
"command": "uvx",
"args": [
"chroma-mcp",
"--client-type", "persistent",
"--data-dir", "/absolute/path/to/claude-chroma/chroma_data"
]
}
}
}
Claude will then be able to query your full conversation history via the chroma MCP tool.
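Because --data-dir must be an absolute path, it can help to generate the config fragment rather than type the path by hand. The following is a minimal sketch, not part of claude-chroma; `chroma_mcp_config` is a hypothetical helper name, and it only reproduces the JSON shown above with the path resolved.

```python
import json
from pathlib import Path


def chroma_mcp_config(chroma_dir: str = "./chroma_data") -> dict:
    """Build the mcpServers entry shown above with an absolute --data-dir.

    Hypothetical helper for illustration only.
    """
    # resolve() turns a relative path into an absolute one based on the cwd.
    data_dir = Path(chroma_dir).expanduser().resolve()
    return {
        "mcpServers": {
            "chroma": {
                "command": "uvx",
                "args": [
                    "chroma-mcp",
                    "--client-type", "persistent",
                    "--data-dir", str(data_dir),
                ],
            }
        }
    }


if __name__ == "__main__":
    # Print the fragment so it can be merged into claude_desktop_config.json by hand.
    print(json.dumps(chroma_mcp_config(), indent=2))
```

Run it from the claude-chroma checkout so the default ./chroma_data resolves to the directory the ingest step wrote to.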
CLI Reference
claude-chroma ingest
Ingest all .json exports from the Claude data directory (including subdirectories) into ChromaDB.
Options:
--claude-dir PATH Directory containing JSON exports [default: ./claude_data]
--chroma-dir PATH ChromaDB storage directory [default: ./chroma_data]
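The recursive discovery of .json exports can be sketched with pathlib. This is an illustration under assumptions, not the tool's actual code: `find_exports` is a hypothetical name, and the real CLI may filter or order files differently.

```python
from pathlib import Path


def find_exports(claude_dir: str = "./claude_data") -> list[Path]:
    """Return every .json file under claude_dir, including subdirectories.

    Hypothetical helper; returns an empty list if the directory is missing.
    """
    root = Path(claude_dir)
    if not root.is_dir():
        return []
    # rglob walks the tree recursively; sorting keeps the order stable across runs.
    return sorted(p for p in root.rglob("*.json") if p.is_file())
```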
claude-chroma stats
Show database statistics: total conversations, chunks, date range, and top 10 most-chunked conversations.
Options:
--chroma-dir PATH ChromaDB storage directory [default: ./chroma_data]
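The statistics listed above can be derived from per-chunk metadata. The sketch below assumes each chunk carries hypothetical `conversation_uuid` and `created_at` (ISO 8601 string) metadata keys; the keys claude-chroma actually stores are not documented here, so treat the names as placeholders.

```python
from collections import Counter


def summarize(metadatas: list[dict], top_n: int = 10) -> dict:
    """Compute conversation count, chunk count, date range, and top-N by chunks.

    The metadata keys used here are assumptions for illustration.
    """
    per_conversation = Counter(m["conversation_uuid"] for m in metadatas)
    # ISO 8601 timestamps in one format sort chronologically as plain strings.
    dates = sorted(m["created_at"] for m in metadatas if m.get("created_at"))
    return {
        "conversations": len(per_conversation),
        "chunks": len(metadatas),
        "date_range": (dates[0], dates[-1]) if dates else None,
        "top": per_conversation.most_common(top_n),
    }
```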
claude-chroma search
Run a semantic similarity search for development and debugging.
Arguments:
QUERY Search query text
Options:
-n, --n-results Number of results [default: 5]
--chroma-dir PATH ChromaDB storage directory [default: ./chroma_data]
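For scripting, a similar query can be run against the database directly with the chromadb Python client. This is a rough sketch, not the CLI's implementation: the collection name `claude_conversations` is a guess, and it assumes the collection was built with ChromaDB's default embedding function. `format_hits` is a hypothetical helper that flattens the result dict returned by `collection.query`.

```python
def search(query: str, n_results: int = 5,
           chroma_dir: str = "./chroma_data",
           collection_name: str = "claude_conversations") -> dict:
    """Query a persistent ChromaDB collection (collection name is an assumption)."""
    import chromadb  # imported lazily so format_hits works without chromadb installed

    client = chromadb.PersistentClient(path=chroma_dir)
    collection = client.get_collection(name=collection_name)
    return collection.query(query_texts=[query], n_results=n_results)


def format_hits(results: dict, width: int = 80) -> list[str]:
    """Turn a single-query result into 'distance  id  snippet' lines."""
    # query() returns one inner list per query text; we sent exactly one.
    ids = results["ids"][0]
    docs = results["documents"][0]
    dists = results["distances"][0]
    lines = []
    for chunk_id, doc, dist in zip(ids, docs, dists):
        snippet = " ".join(doc.split())[:width]  # collapse whitespace, then truncate
        lines.append(f"{dist:.3f}  {chunk_id}  {snippet}")
    return lines
```

Lower distances indicate closer matches, so the first line is the best hit.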
Chunking Strategy
Conversations are chunked at the exchange level — each human message is paired with the subsequent assistant response. This preserves the question + answer arc as a single semantic unit. Long assistant responses (>2000 characters) are split with overlap, with the human message prepended as context to each sub-chunk.
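The exchange-level strategy can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: messages are assumed to be dicts with `sender` and `text` keys, the 200-character overlap is an invented value (the README gives only the 2000-character threshold), and all function names are hypothetical.

```python
def pair_exchanges(messages: list[dict]) -> list[tuple[int, str, str]]:
    """Pair each human message with the assistant reply that follows it.

    Returns (turn_index, human_text, assistant_text) tuples.
    """
    exchanges = []
    pending = None
    for msg in messages:
        if msg["sender"] == "human":
            pending = msg["text"]
        elif msg["sender"] == "assistant" and pending is not None:
            exchanges.append((len(exchanges), pending, msg["text"]))
            pending = None
    return exchanges


def split_with_overlap(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into windows of max_chars that share `overlap` characters."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    if len(text) <= max_chars:
        return [text]
    step = max_chars - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + max_chars] for i in range(0, len(text) - overlap, step)]


def chunk_exchange(human: str, assistant: str,
                   max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Prepend the human message to every sub-chunk of the assistant reply."""
    return [
        f"Human: {human}\n\nAssistant: {part}"
        for part in split_with_overlap(assistant, max_chars, overlap)
    ]
```

Repeating the human message in each sub-chunk keeps every piece of a long answer retrievable by a query that matches the question.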
Chunk IDs are deterministic ({conversation_uuid}:{turn_index}), so re-ingesting the same export is a no-op, and updated exports overwrite stale data.
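To see why deterministic IDs make re-ingestion idempotent, consider this small sketch. A plain dict stands in for the ChromaDB collection, and `upsert` here is a hypothetical stand-in for an upsert-style write; the README does not say which ChromaDB call the tool uses, nor how sub-chunks of a split response are numbered.

```python
def chunk_id(conversation_uuid: str, turn_index: int) -> str:
    """Build the ID in the documented {conversation_uuid}:{turn_index} form."""
    return f"{conversation_uuid}:{turn_index}"


def upsert(store: dict[str, str], conversation_uuid: str, exchanges: list[str]) -> int:
    """Write chunks keyed by deterministic ID; return how many entries changed."""
    changed = 0
    for turn_index, text in enumerate(exchanges):
        key = chunk_id(conversation_uuid, turn_index)
        # Same key and same text means nothing to do; new text replaces the old.
        if store.get(key) != text:
            store[key] = text
            changed += 1
    return changed


store: dict[str, str] = {}
upsert(store, "abc", ["one", "two"])   # first ingest writes both chunks
upsert(store, "abc", ["one", "two"])   # identical export changes nothing
upsert(store, "abc", ["one", "TWO"])   # updated export overwrites only abc:1
```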
License
MIT
File details
Details for the file claude_chroma-0.1.1.tar.gz.
File metadata
- Download URL: claude_chroma-0.1.1.tar.gz
- Upload date:
- Size: 7.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 76b8a5a040fff96bfdc824f769b009e03a04e01dfae64f39117f08c652ca9451 |
| MD5 | 1b468d28fe9831e388ab947d10bbc632 |
| BLAKE2b-256 | ce7ba4700cdc93a03bbdb87b172a68df3faeaa5d10a9f204ce0ec3cd58c8833d |
File details
Details for the file claude_chroma-0.1.1-py3-none-any.whl.
File metadata
- Download URL: claude_chroma-0.1.1-py3-none-any.whl
- Upload date:
- Size: 9.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0c6b5442fd7943ba6c753a56b1fee25f9774c36835930f3ea0032066e0b759bc |
| MD5 | b620c58629b6d2fbfa1353d13a080845 |
| BLAKE2b-256 | 5abe1c185c225709df1b600a48c74bc226424affe6cdd87af05c440ec5211743 |