
ricoeur

A local-first archive, search, and intelligence engine for your LLM conversation history.

Named after Paul Ricoeur, whose work on narrative identity argued that we understand ourselves through the stories we construct from our lived experience. ricoeur reconstructs the narrative of your intellectual life from thousands of AI conversations.

Quickstart

# Install with uv
uv sync

# Initialize the database
uv run ricoeur init

# Import your ChatGPT export
uv run ricoeur import chatgpt ~/Downloads/chatgpt-export/conversations.json

# Import your Claude export
uv run ricoeur import claude ~/Downloads/claude-export/conversations.json

# Build the intelligence layer (language detection, embeddings, analytics)
uv run ricoeur index

# See what you've got
uv run ricoeur stats

# Search your history
uv run ricoeur search "thermal simulation"

Commands

| Command | Description |
| --- | --- |
| ricoeur init | Initialize database and config at ~/.ricoeur/ |
| ricoeur import chatgpt &lt;path&gt; | Import from ChatGPT export (.json or .zip) |
| ricoeur import claude &lt;path&gt; | Import from Claude export (.json or .zip) |
| ricoeur search &lt;query&gt; | Search across all conversations (hybrid by default) |
| ricoeur show &lt;id&gt; | Display a conversation with formatting |
| ricoeur stats | Analytics dashboard |
| ricoeur index | Build intelligence layer (languages, embeddings, analytics) |
| ricoeur config show | Print current configuration |
| ricoeur config set &lt;key&gt; &lt;value&gt; | Update a config value |

Search

ricoeur supports three search modes:

| Mode | Flag | How it works |
| --- | --- | --- |
| Hybrid | (default) | Combines keyword + semantic via Reciprocal Rank Fusion (RRF) |
| Keyword | --keyword | FTS5 full-text search with BM25 ranking |
| Semantic | --semantic | Cosine similarity against pre-computed embeddings |
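The keyword mode's FTS5 + BM25 ranking can be sketched with Python's standard-library sqlite3 module. The table and column names below are illustrative only, not ricoeur's actual schema:

```python
import sqlite3

# In-memory DB with an FTS5 virtual table (illustrative schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE msgs USING fts5(title, body)")
db.executemany(
    "INSERT INTO msgs VALUES (?, ?)",
    [
        ("Docker deployment", "How to containerize a Flask app with Docker"),
        ("Pandas tips", "import pandas and groupby tricks"),
        ("K8s notes", "Kubernetes rollout strategies"),
    ],
)

# bm25() returns a rank where *lower* is better, so ORDER BY ascending.
rows = db.execute(
    "SELECT title, bm25(msgs) FROM msgs WHERE msgs MATCH ? ORDER BY bm25(msgs)",
    ("docker",),
).fetchall()
print(rows[0][0])  # only the Docker conversation matches the literal word
```

Note that only the row containing the literal token "docker" comes back, which is exactly the limitation the semantic mode addresses.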

When embeddings are available (after ricoeur index), search automatically uses hybrid mode. If no embeddings exist, it falls back to keyword search. Flags like --code or --role also force keyword mode since they rely on FTS5.
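How the hybrid mode fuses the two ranked lists can be sketched with the standard RRF formula, score(d) = sum over lists of 1/(k + rank), using the conventional k = 60. ricoeur's actual constant and any per-list weighting may differ:

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked result lists with Reciprocal Rank Fusion.

    rankings: list of lists of doc ids, best first.
    Returns doc ids sorted by fused score, best first.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword = ["c12", "c7", "c3"]         # FTS5/BM25 order
semantic = ["c7", "c99", "c12"]       # cosine-similarity order
print(rrf_fuse([keyword, semantic]))  # c7 and c12 appear in both lists, so they rise
```

Documents that appear in both lists accumulate two reciprocal-rank terms, which is why RRF favors results that both search modes agree on.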

# Hybrid search (default — combines keyword + semantic)
ricoeur search "deployment strategies"

# Force keyword-only (FTS5)
ricoeur search "streamlit dashboard" --keyword

# Force semantic-only (cosine similarity)
ricoeur search "how to containerize apps" --semantic

# Filter by platform, language, date
ricoeur search "error fix" --platform chatgpt --lang en
ricoeur search "strategie marketing" --lang fr --since 2025-01-01

# Search only in code blocks (auto-uses keyword mode)
ricoeur search "import pandas" --code

# Output formats: table (default), json, full, ids
ricoeur search "MCP" --format json

Why semantic search?

Keyword search only finds exact word matches. Semantic search finds conceptually related conversations — even when the exact words don't appear.

Query: "how to containerize applications"

$ ricoeur search "how to containerize applications" --keyword
Found 3 results for "how to containerize applications" (keyword)

$ ricoeur search "how to containerize applications" --semantic
Found 20 results for "how to containerize applications" (semantic)
 #   Score    Date        Title
 1   0.5515   2025-07-29  Free docker deployment options
 2   0.5152   2026-01-28  Secure Clawdbot Setup
 3   0.5081   2025-03-11  Docker noVNC Setup
 4   0.5014   2025-09-21  HTML upload and serve
 5   0.4782   2022-12-27  Wasm vs Container Comparison
 ...

Keyword found 3 results matching the literal words. Semantic found 20 — including Docker, Wasm, and container deployment conversations that never mention "containerize applications".
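Conceptually, the semantic scores above are just cosine similarities between the query embedding and each conversation's embedding. A dependency-free sketch, with toy 3-d vectors standing in for real sentence-transformer embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.9, 0.1, 0.0]  # toy embedding of "how to containerize applications"
docs = {
    "Free docker deployment options": [0.8, 0.3, 0.1],
    "Strategie marketing": [0.0, 0.2, 0.9],
}
ranked = sorted(docs, key=lambda t: cosine(query, docs[t]), reverse=True)
print(ranked[0])  # the Docker conversation scores highest
```

The Docker conversation ranks first despite sharing no literal words with the query, because its vector points in nearly the same direction.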

Index

After importing, build the intelligence layer:

# Run all layers: language detection, embeddings, analytics
ricoeur index

# Second run skips what's already cached
ricoeur index

# Force a full rebuild
ricoeur index --force

# Run specific layers only
ricoeur index --embeddings
ricoeur index --analytics

# Use a different embedding model
ricoeur index --embed-model ollama:nomic-embed-text
ricoeur index --embed-model st:all-MiniLM-L6-v2 --device cpu
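The --embed-model values above follow a backend:model pattern (st: for sentence-transformers, ollama: for Ollama). A hypothetical parser for that spec; split_model_spec is my name for illustration, not ricoeur's API:

```python
def split_model_spec(spec, default_backend="st"):
    """Split 'backend:model' specs like 'ollama:nomic-embed-text'.

    Hypothetical helper -- ricoeur's internal equivalent may differ.
    A bare model name is assumed to mean sentence-transformers.
    """
    backend, sep, model = spec.partition(":")
    if not sep:
        return default_backend, spec
    return backend, model

print(split_model_spec("ollama:nomic-embed-text"))  # ('ollama', 'nomic-embed-text')
print(split_model_spec("all-MiniLM-L6-v2"))         # ('st', 'all-MiniLM-L6-v2')
```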

Each layer requires its optional extra:

| Layer | Extra | What it does |
| --- | --- | --- |
| Languages | langdetect | Detects language per conversation (stored in DB) |
| Embeddings | embeddings | Generates sentence-transformer vectors (~/.ricoeur/embeddings/) |
| Analytics | analytics | Exports conversations and messages to Parquet (~/.ricoeur/analytics/) |

# Install all index dependencies at once
uv sync --extra langdetect --extra embeddings --extra analytics

Import options

# Re-import safely (updates existing, adds new, never deletes)
ricoeur import chatgpt conversations.json --update

# Dry run — parse and validate without writing
ricoeur import chatgpt conversations.json --dry-run

# Only import recent conversations
ricoeur import claude conversations.json --since 2025-01-01
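The --since filter presumably compares each conversation's timestamp against the given ISO date; a minimal sketch under that assumption, with made-up field names:

```python
from datetime import date

# Toy conversation records; real exports carry richer timestamps.
conversations = [
    {"title": "Old chat", "updated": "2024-11-30"},
    {"title": "New chat", "updated": "2025-03-15"},
]

since = date.fromisoformat("2025-01-01")
recent = [c for c in conversations
          if date.fromisoformat(c["updated"]) >= since]
print([c["title"] for c in recent])  # ['New chat']
```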

Optional extras

Install additional capabilities as needed:

# Language detection
uv sync --extra langdetect

# Semantic search with sentence-transformers
uv sync --extra embeddings

# Topic modeling with BERTopic
uv sync --extra topics

# Analytics with DuckDB + Parquet
uv sync --extra analytics

# Terminal UI
uv sync --extra tui

# MCP server for Claude Desktop
uv sync --extra mcp

# Web API server
uv sync --extra serve

# Everything
uv sync --extra all

Configuration

Config lives at ~/.ricoeur/config.toml:

[general]
home = "~/.ricoeur"
default_language = "en"

[embeddings]
model = "st:paraphrase-multilingual-mpnet-base-v2"
batch_size = 64
device = "auto"

[topics]
min_cluster_size = 15
n_topics = "auto"

[summarize]
enabled = false
model = "ollama:llama3.2"

Override the data directory with the RICOEUR_HOME environment variable.

Architecture

~/.ricoeur/
├── config.toml          # Configuration
├── ricoeur.db           # SQLite database (FTS5 search)
├── analytics/           # Parquet files for DuckDB
├── embeddings/          # Sentence-transformer vectors
├── models/              # Saved BERTopic models
└── attachments/         # Extracted files

License

MIT
