Personal claude.ai conversation archive — ingest, search, enrichment, and MCP server
tinderbox-archive
A personal claude.ai conversation archive with hybrid search, Haiku-powered enrichment, and an MCP server for Claude Desktop / Claude Code.
Built to answer: "What did I talk to Claude about six months ago?"
What it does
- Ingests your claude.ai conversation export (ZIP) into a Supabase database — messages, artifacts, attachments
- Embeds every message with mxbai-embed-large via Ollama (1024d, stored in pgvector)
- Searches using hybrid retrieval: cosine similarity + full-text, merged with RRF scoring
- Enriches each conversation with Claude Haiku: summary, topics, project tags, key decisions, named AI personas
- Serves everything over MCP so Claude Desktop or Claude Code can search your archive mid-conversation
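The RRF merge mentioned above can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion in general, not the project's actual code: the function name `rrf_merge`, the constant `k=60` (a common default in the literature), and the message IDs are all assumptions.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) to every ID it contains,
    where rank starts at 1 for the best hit in that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result orders from the two retrievers.
semantic = ["m3", "m1", "m7"]   # cosine-similarity ranking
fulltext = ["m1", "m9", "m3"]   # full-text ranking

merged = rrf_merge([semantic, fulltext])
print(merged)
```

Messages that appear in both lists rise to the top because they collect a contribution from each ranking, which is why RRF works without having to normalize cosine and full-text scores against each other.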
Requirements
- Python 3.12+
- Supabase project with pgvector enabled
- Ollama running locally with mxbai-embed-large pulled
- Anthropic API key (for enrichment only; search works without it)
Installation
pip install tinderbox-archive
Or from source:
git clone https://github.com/luckyrmp/tinderbox-archive
cd tinderbox-archive/parser
pip install -e .
Setup
1. Supabase schema
Apply the migrations in migrations/ to your Supabase project. The schema is named tinderbox and must be exposed via PostgREST.
2. Environment
Create a .env file (default location: ~/.secrets/tinderbox.env):
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
ANTHROPIC_API_KEY=your-anthropic-key # enrichment only
Or set the variables directly in your shell. To use an env file in a different location, point TINDERBOX_ENV_FILE at it:
export TINDERBOX_ENV_FILE=/path/to/your.env
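For illustration, the lookup described above could work roughly like this. It is a sketch under assumptions: the helper names `resolve_env_file` and `parse_env_text` are hypothetical, and the real loader may parse the file differently or apply a different precedence between shell variables and file values.

```python
import os
from pathlib import Path

# Default location named in the setup instructions.
DEFAULT_ENV_FILE = Path.home() / ".secrets" / "tinderbox.env"


def resolve_env_file(environ=os.environ):
    """Return the env file path, honouring TINDERBOX_ENV_FILE if it is set."""
    override = environ.get("TINDERBOX_ENV_FILE")
    return Path(override) if override else DEFAULT_ENV_FILE


def parse_env_text(text):
    """Parse KEY=value lines, skipping blanks and comment lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "  # enrichment only".
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


sample = (
    "SUPABASE_URL=https://your-project.supabase.co\n"
    "ANTHROPIC_API_KEY=your-anthropic-key # enrichment only\n"
)
print(parse_env_text(sample))
```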
3. Pull your Ollama model
ollama pull mxbai-embed-large
Usage
Ingest a conversation export
Download your export from claude.ai (Settings → Export Data), then:
tinderbox ingest /path/to/conversations.zip
Embed messages
tinderbox embed
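To show what an embedding call to a local Ollama instance looks like, here is a minimal sketch using only the standard library. It assumes Ollama's default address (`http://localhost:11434`) and its `/api/embed` endpoint; the function names are hypothetical, and the project itself may use a client library or batch differently.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port.
OLLAMA_URL = "http://localhost:11434/api/embed"


def build_embed_request(texts, model="mxbai-embed-large"):
    """Build (but do not send) a POST request asking Ollama to embed texts."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts):
    """Send the request and return one vector per input text."""
    with urllib.request.urlopen(build_embed_request(texts)) as resp:
        return json.load(resp)["embeddings"]
```

With Ollama running and the model pulled, `embed(["hello"])` should return a list containing a single 1024-dimensional vector.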
Search
tinderbox search "what did we decide about the database schema"
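The semantic half of the search compares the query embedding with stored message embeddings by cosine similarity. In the archive this comparison would normally happen inside pgvector; the pure-Python version below is only a sketch of the measure itself, with a hypothetical function name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)
```

A value near 1 means the two texts point in almost the same direction in embedding space, while a value near 0 means they are unrelated.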
Enrich conversations
tinderbox enrich
This calls Claude Haiku once per conversation and writes structured annotations (summary, topics, project tags, key decisions, named AI personas) to Supabase.
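One way to picture the structured annotation is as a small record validated before it is written. The sketch below is illustrative only: the field names mirror the list above but are assumptions about the stored shape, and `Enrichment` and `parse_enrichment` are hypothetical names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Enrichment:
    summary: str
    topics: list[str] = field(default_factory=list)
    project_tags: list[str] = field(default_factory=list)
    key_decisions: list[str] = field(default_factory=list)
    named_personas: list[str] = field(default_factory=list)


def parse_enrichment(raw_json):
    """Validate a model response and turn it into an Enrichment record."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("enrichment must be a JSON object")
    summary = data.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        raise ValueError("enrichment is missing a summary")

    def as_list(key):
        # Missing or null fields become empty lists.
        return [str(item) for item in (data.get(key) or [])]

    return Enrichment(
        summary=summary.strip(),
        topics=as_list("topics"),
        project_tags=as_list("project_tags"),
        key_decisions=as_list("key_decisions"),
        named_personas=as_list("named_personas"),
    )
```

Rejecting responses without a usable summary gives a natural place to record a failure that a later retry pass can pick up.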
MCP server (Claude Desktop / Claude Code)
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "tinderbox": {
      "command": "/path/to/tinderbox-archive/parser/scripts/tinderbox_mcp.sh"
    }
  }
}
Or if installed via pip, point directly at the module:
{
  "mcpServers": {
    "tinderbox": {
      "command": "python3",
      "args": ["-m", "tinderbox.mcp.server"],
      "env": {
        "TINDERBOX_ENV_FILE": "/path/to/your.env"
      }
    }
  }
}
Two tools are exposed:
- tinderbox_search: hybrid search returning top results with enrichment summaries
- tinderbox_get_conversation: fetch a full conversation thread by export ID
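Claude Desktop and Claude Code build these calls for you, but it can help to see the message an MCP client sends when it invokes a tool. The sketch below builds a JSON-RPC `tools/call` request as defined by the MCP specification. The argument name `query` is an assumption, since the tools' input schemas are not documented here, and `tool_call` is a hypothetical helper.

```python
import json


def tool_call(request_id, tool_name, arguments):
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


# "query" is an assumed argument name, used for illustration only.
message = tool_call(1, "tinderbox_search", {"query": "database schema decision"})
print(message)
```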
CLI reference
tinderbox ingest <zip>             Ingest a claude.ai export ZIP
tinderbox embed                    Generate embeddings for new messages
tinderbox search <query>           Hybrid search (semantic + full-text)
tinderbox enrich                   Enrich conversations with Haiku annotations
tinderbox enrich --retry-failures  Re-attempt previously failed enrichments
tinderbox runs list                Show recent ingest runs
tinderbox named-clean              Remove false-positive named instances
tinderbox staleness                Check how stale the archive is
tinderbox qa run                   Run retrieval quality eval
Architecture
claude.ai export ZIP
↓
tinderbox ingest → Supabase: conversations, messages, artifacts
↓
tinderbox embed → Supabase: embeddings (pgvector, mxbai-embed-large 1024d)
↓
tinderbox enrich → Supabase: enrichment (Haiku annotations)
↓
tinderbox search → hybrid retrieval (cosine + FTS + RRF)
↓
MCP server → Claude Desktop / Claude Code tools
Supabase is accessed via the REST API (supabase-py). No direct Postgres connection required.
Design notes
- Memorial design: conversations are never deleted. Deleted-upstream conversations are tombstoned (deleted_upstream=true) and remain searchable.
- Mass-tombstone canary: ingest halts if more than 10% of active conversations would be tombstoned in a single run.
- Enrichment is opinion: Haiku annotations are surfaced as navigation aids, not ground truth. The original messages are always the source of truth.
- Cache layer: a SQLite read cache (740× speedup on repeated searches) wraps Supabase queries. Invalidated automatically on new ingest or enrichment.
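The mass-tombstone canary above amounts to a threshold check before any rows are marked. A minimal sketch, assuming the counts are already known at that point in the ingest; the exception and function names are hypothetical:

```python
class TombstoneCanaryError(RuntimeError):
    """Raised when a single ingest run would tombstone too many conversations."""


def check_tombstone_canary(active_count, to_tombstone, max_fraction=0.10):
    """Halt if more than max_fraction of active conversations would be tombstoned."""
    if active_count <= 0:
        return  # nothing is active, so nothing can be mass-tombstoned
    fraction = to_tombstone / active_count
    if fraction > max_fraction:
        raise TombstoneCanaryError(
            f"{to_tombstone} of {active_count} active conversations "
            f"({fraction:.0%}) would be tombstoned; refusing to continue"
        )


check_tombstone_canary(active_count=100, to_tombstone=10)  # exactly 10%: allowed
```

A check like this guards against a truncated or partial export being mistaken for a mass deletion upstream.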
License
Apache 2.0 — see LICENSE.
File details
Details for the file tinderbox_archive-0.1.0.tar.gz.
File metadata
- Download URL: tinderbox_archive-0.1.0.tar.gz
- Upload date:
- Size: 63.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9ca86b3c6416297267f2d6a4d8691e190e6c351b90b5b763a3059ea93dc4a333 |
| MD5 | 420ced327739e6880b2da560fcf88013 |
| BLAKE2b-256 | 1e18054e6c8912eb2e60b1df8b1227e94d0752a52254b0d68bcb50e45a99d4ec |
File details
Details for the file tinderbox_archive-0.1.0-py3-none-any.whl.
File metadata
- Download URL: tinderbox_archive-0.1.0-py3-none-any.whl
- Upload date:
- Size: 69.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 394a96a494f2fa962184ac7f61a2a6b4a6dbda06a4f2ed2d96d44dd692cbb8c7 |
| MD5 | 3ec38245347778057de2c7692394a44a |
| BLAKE2b-256 | ce45abfda7661ccf78da1d7f0e2e63aaf7a0dfaac7ba6c79f92a7b6edbc3b56d |