# ricoeur
A local-first archive, search, and intelligence engine for your LLM conversation history.
Named after Paul Ricoeur, whose work on narrative identity argued that we understand ourselves through the stories we construct from our lived experience. ricoeur reconstructs the narrative of your intellectual life from thousands of AI conversations.
## Quickstart
```bash
# Install with uv
uv sync

# Initialize the database
uv run ricoeur init

# Import your ChatGPT export
uv run ricoeur import chatgpt ~/Downloads/chatgpt-export/conversations.json

# Import your Claude export
uv run ricoeur import claude ~/Downloads/claude-export/conversations.json

# Build the intelligence layer (language detection, embeddings, analytics)
uv run ricoeur index

# See what you've got
uv run ricoeur stats

# Search your history
uv run ricoeur search "thermal simulation"
```
## Commands
| Command | Description |
|---|---|
| `ricoeur init` | Initialize the database and config at `~/.ricoeur/` |
| `ricoeur import chatgpt <path>` | Import from a ChatGPT export (`.json` or `.zip`) |
| `ricoeur import claude <path>` | Import from a Claude export (`.json` or `.zip`) |
| `ricoeur search <query>` | Search across all conversations (hybrid by default) |
| `ricoeur show <id>` | Display a conversation with formatting |
| `ricoeur stats` | Analytics dashboard |
| `ricoeur index` | Build the intelligence layer (languages, embeddings, analytics) |
| `ricoeur config show` | Print the current configuration |
| `ricoeur config set <key> <value>` | Update a config value |
## Search
ricoeur supports three search modes:
| Mode | Flag | How it works |
|---|---|---|
| Hybrid | (default) | Combines keyword + semantic results via Reciprocal Rank Fusion (RRF) |
| Keyword | `--keyword` | FTS5 full-text search with BM25 ranking |
| Semantic | `--semantic` | Cosine similarity against pre-computed embeddings |
When embeddings are available (after `ricoeur index`), search automatically uses hybrid mode. If no embeddings exist, it falls back to keyword search. Flags like `--code` or `--role` also force keyword mode, since they rely on FTS5.
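Reciprocal Rank Fusion is simple to picture: each document earns `1 / (k + rank)` from every ranked list it appears in, so hits that place well in both keyword and semantic results rise to the top. A minimal illustrative sketch (not ricoeur's actual implementation; `k = 60` is the conventional constant from the RRF literature):

```python
def rrf_merge(keyword_ids, semantic_ids, k=60):
    """Fuse two ranked lists of result IDs with Reciprocal Rank Fusion.

    Each document scores 1 / (k + rank) per list it appears in, so a
    result that is decent in both lists can outrank one that tops
    only a single list.
    """
    scores = {}
    for ranked in (keyword_ids, semantic_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" appears high in both lists, so it wins over "a" (top of one list only).
print(rrf_merge(["a", "b", "c"], ["b", "c", "d"]))  # → ['b', 'c', 'a', 'd']
```

Because RRF only uses ranks, it never has to reconcile BM25 scores with cosine similarities, which live on incompatible scales.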
```bash
# Hybrid search (default — combines keyword + semantic)
ricoeur search "deployment strategies"

# Force keyword-only (FTS5)
ricoeur search "streamlit dashboard" --keyword

# Force semantic-only (cosine similarity)
ricoeur search "how to containerize apps" --semantic

# Filter by platform, language, date
ricoeur search "error fix" --platform chatgpt --lang en
ricoeur search "strategie marketing" --lang fr --since 2025-01-01

# Search only in code blocks (auto-uses keyword mode)
ricoeur search "import pandas" --code

# Output formats: table (default), json, full, ids
ricoeur search "MCP" --format json
```
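Keyword mode rests on SQLite's built-in FTS5 engine, which ships with a `bm25()` ranking function. A self-contained sketch of how such a query works (the `msgs` table and its columns are illustrative, not ricoeur's real schema; requires an SQLite build with FTS5 enabled, which standard Python distributions include):

```python
import sqlite3

# In-memory database with an FTS5 virtual table (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE msgs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO msgs VALUES (?, ?)",
    [
        ("Docker deploy", "containerize the app with docker compose"),
        ("Pandas tips", "import pandas and groupby tricks"),
    ],
)

# bm25() returns a rank where lower is better, hence ascending ORDER BY.
rows = conn.execute(
    "SELECT title FROM msgs WHERE msgs MATCH ? ORDER BY bm25(msgs)",
    ("docker",),
).fetchall()
print(rows)  # → [('Docker deploy',)]
```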
### Why semantic search?
Keyword search only finds exact word matches. Semantic search finds conceptually related conversations — even when the exact words don't appear.
Query: "how to containerize applications"

```
$ ricoeur search "how to containerize applications" --keyword
Found 3 results for "how to containerize applications" (keyword)

$ ricoeur search "how to containerize applications" --semantic
Found 20 results for "how to containerize applications" (semantic)

  #  Score   Date        Title
  1  0.5515  2025-07-29  Free docker deployment options
  2  0.5152  2026-01-28  Secure Clawdbot Setup
  3  0.5081  2025-03-11  Docker noVNC Setup
  4  0.5014  2025-09-21  HTML upload and serve
  5  0.4782  2022-12-27  Wasm vs Container Comparison
  ...
```
Keyword search found 3 results matching the literal words. Semantic search found 20, including Docker, Wasm, and container-deployment conversations that never mention "containerize applications".
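The scores in that table are cosine similarities: the angle between the query embedding and each stored conversation vector. A dependency-free sketch of the scoring step (ricoeur actually compares sentence-transformer vectors; these tiny hand-made vectors are stand-ins):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": the query points in nearly the same direction as doc1.
query = [1.0, 2.0, 0.0]
docs = {"doc1": [2.0, 4.1, 0.1], "doc2": [0.0, 0.5, 3.0]}
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked)  # → ['doc1', 'doc2']
```

Because direction (not exact tokens) drives the score, a query about "containerizing applications" lands near conversations about Docker even when the words never co-occur.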
## Index
After importing, build the intelligence layer:
```bash
# Run all layers: language detection, embeddings, analytics
ricoeur index

# Second run skips what's already cached
ricoeur index

# Force a full rebuild
ricoeur index --force

# Run specific layers only
ricoeur index --embeddings
ricoeur index --analytics

# Use a different embedding model
ricoeur index --embed-model ollama:nomic-embed-text
ricoeur index --embed-model st:all-MiniLM-L6-v2 --device cpu
```
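The "second run skips what's already cached" behavior can be pictured as content hashing: only conversations whose text changed since the last run get re-processed, and `--force` ignores the cache. A hypothetical sketch of that pattern (the cache format and function names are illustrative, not ricoeur's source):

```python
import hashlib

def index_incrementally(conversations, cache, force=False):
    """Re-process only conversations whose text changed since the last run.

    `cache` maps conversation id -> SHA-256 of its text (hypothetical format).
    Returns the ids that actually needed work.
    """
    reindexed = []
    for conv_id, text in conversations.items():
        digest = hashlib.sha256(text.encode()).hexdigest()
        if force or cache.get(conv_id) != digest:
            # ... compute embeddings / analytics for this conversation ...
            cache[conv_id] = digest
            reindexed.append(conv_id)
    return reindexed

cache = {}
convs = {"c1": "hello", "c2": "world"}
print(index_incrementally(convs, cache))  # first run: ['c1', 'c2']
print(index_incrementally(convs, cache))  # second run skips both: []
```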
Each layer requires its optional extra:
| Layer | Extra | What it does |
|---|---|---|
| Languages | `langdetect` | Detects the language of each conversation (stored in the DB) |
| Embeddings | `embeddings` | Generates sentence-transformer vectors (`~/.ricoeur/embeddings/`) |
| Analytics | `analytics` | Exports conversations & messages to Parquet (`~/.ricoeur/analytics/`) |
```bash
# Install all index dependencies at once
uv sync --extra langdetect --extra embeddings --extra analytics
```
## Import options
```bash
# Re-import safely (updates existing, adds new, never deletes)
ricoeur import chatgpt conversations.json --update

# Dry run — parse and validate without writing
ricoeur import chatgpt conversations.json --dry-run

# Only import recent conversations
ricoeur import claude conversations.json --since 2025-01-01
```
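The "updates existing, adds new, never deletes" contract of `--update` corresponds to an SQL upsert. A minimal SQLite sketch of the idea (the table and columns are illustrative, not ricoeur's real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversations (id TEXT PRIMARY KEY, title TEXT)")

def upsert(conv_id, title):
    # ON CONFLICT updates the existing row instead of failing or duplicating;
    # rows absent from the new export are simply left untouched.
    conn.execute(
        "INSERT INTO conversations (id, title) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET title = excluded.title",
        (conv_id, title),
    )

upsert("abc", "First title")
upsert("abc", "Renamed title")   # same id: updated, not duplicated
upsert("xyz", "Another chat")    # new id: inserted
print(conn.execute("SELECT id, title FROM conversations ORDER BY id").fetchall())
# → [('abc', 'Renamed title'), ('xyz', 'Another chat')]
```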
## Optional extras
Install additional capabilities as needed:
```bash
# Language detection
uv sync --extra langdetect

# Semantic search with sentence-transformers
uv sync --extra embeddings

# Topic modeling with BERTopic
uv sync --extra topics

# Analytics with DuckDB + Parquet
uv sync --extra analytics

# Terminal UI
uv sync --extra tui

# MCP server for Claude Desktop
uv sync --extra mcp

# Web API server
uv sync --extra serve

# Everything
uv sync --extra all
```
## Configuration
Config lives at `~/.ricoeur/config.toml`:
```toml
[general]
home = "~/.ricoeur"
default_language = "en"

[embeddings]
model = "st:paraphrase-multilingual-mpnet-base-v2"
batch_size = 64
device = "auto"

[topics]
min_cluster_size = 15
n_topics = "auto"

[summarize]
enabled = false
model = "ollama:llama3.2"
```
Override the data directory with the `RICOEUR_HOME` environment variable.
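Environment-variable overrides of this kind typically resolve with a simple lookup order: the variable wins, then the default applies. A hypothetical sketch of that resolution (function name and logic are illustrative, not ricoeur's source):

```python
import os
from pathlib import Path

def data_home(env=os.environ):
    """Resolve the data directory: RICOEUR_HOME wins over the default."""
    return Path(env.get("RICOEUR_HOME", "~/.ricoeur")).expanduser()

print(data_home({"RICOEUR_HOME": "/tmp/ricoeur-test"}))  # → /tmp/ricoeur-test
print(data_home({}))  # default: ~/.ricoeur expanded under the current user
```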
## Architecture
```
~/.ricoeur/
├── config.toml     # Configuration
├── ricoeur.db      # SQLite database (FTS5 search)
├── analytics/      # Parquet files for DuckDB
├── embeddings/     # Sentence-transformer vectors
├── models/         # Saved BERTopic models
└── attachments/    # Extracted files
```
## License
MIT