Echomine

Library-first tool for parsing AI conversation exports with search, filtering, and markdown export

Overview

Echomine is a Python library and CLI tool for parsing and searching AI conversation exports and converting them to markdown or JSON. It was initially designed for ChatGPT exports and uses a multi-provider adapter pattern so that other AI platforms (Claude, Gemini, etc.) can be supported in the future.
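To illustrate what a multi-provider adapter pattern can look like, here is a minimal sketch. It is not Echomine's actual code: `Conversation`, `ConversationProvider`, `ToyChatGPTAdapter`, `list_titles`, and the `id`/`title` field names are all hypothetical stand-ins chosen for the example.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, Protocol


@dataclass(frozen=True)
class Conversation:
    """Provider-neutral conversation record (hypothetical, simplified)."""

    id: str
    title: str


class ConversationProvider(Protocol):
    """Hypothetical interface that every provider adapter would satisfy."""

    def stream_conversations(self, path: Path) -> Iterator[Conversation]: ...


class ToyChatGPTAdapter:
    """Illustrative adapter: maps one provider's raw fields to the shared model."""

    def stream_conversations(self, path: Path) -> Iterator[Conversation]:
        # For brevity this toy loads the whole file; a real adapter would stream.
        for raw in json.loads(path.read_text(encoding="utf-8")):
            # Assumed field names, not a documented export schema.
            yield Conversation(id=raw["id"], title=raw["title"])


def list_titles(provider: ConversationProvider, path: Path) -> list[str]:
    # Calling code depends only on the protocol, never on a concrete provider.
    return [c.title for c in provider.stream_conversations(path)]
```

Under this design, supporting another platform would mean adding one more class with the same method; code such as `list_titles` would not change.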

Key Features

  • Memory Efficient: Stream-based parsing handles 1GB+ files with constant memory usage
  • Advanced Search: BM25 relevance ranking with exact phrase matching, boolean logic, role filtering, and keyword exclusion
  • Message Snippets: Automatic preview generation for search results with match context
  • Type Safe: Strict typing with Pydantic v2 and mypy --strict compliance
  • Library First: All CLI capabilities are available as an importable Python library
  • Multi-Provider Ready: Adapter pattern supports multiple AI export formats
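For readers unfamiliar with BM25, the sketch below shows the standard Okapi BM25 formula applied to pre-tokenized documents. It is a generic illustration, not Echomine's implementation: the function name `bm25_scores`, the tokenization, and the parameter defaults (`k1=1.5`, `b=0.75`) are assumptions.

```python
import math
from collections import Counter


def bm25_scores(
    docs: list[list[str]],
    query: list[str],
    k1: float = 1.5,
    b: float = 0.75,
) -> list[float]:
    """Return one Okapi BM25 score per document for the given query terms."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates (k1) and is normalized by document length (b).
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

Documents that do not contain any query term score 0.0; among matching documents, more occurrences in a shorter document yield a higher score.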

Design Principles

  1. Library-First Architecture: CLI built on top of library, not vice versa
  2. Strict Type Safety: mypy --strict, no Any types in public API
  3. Memory Efficiency: Stream-based parsing, never load entire file into memory
  4. Test-Driven Development: Every feature is written and validated test-first
  5. YAGNI: Simple solutions, no speculative features

See Constitution for complete design principles.

Installation

From Source

# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks (optional)
pre-commit install

From PyPI

pip install echomine

Quick Start

Library API (Primary Interface)

from echomine import OpenAIAdapter, SearchQuery
from pathlib import Path

# Initialize adapter (stateless, reusable)
adapter = OpenAIAdapter()
export_file = Path("conversations.json")

# 1. List all conversations (discovery)
for conversation in adapter.stream_conversations(export_file):
    print(f"[{conversation.created_at.date()}] {conversation.title}")
    print(f"  Messages: {len(conversation.messages)}")

# 2. Search with keywords (BM25 ranking)
query = SearchQuery(keywords=["algorithm", "design"], limit=10)
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title} (score: {result.score:.2f})")
    print(f"  Preview: {result.snippet}")  # v1.1.0: automatic snippets

# 3. Advanced search with filters (v1.1.0+)
from datetime import date
query = SearchQuery(
    keywords=["refactor"],
    phrases=["algo-insights"],  # Exact phrase matching
    match_mode="all",  # Require ALL keywords (AND logic)
    exclude_keywords=["test"],  # Filter out unwanted results
    role_filter="user",  # Search only user messages
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=5
)
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f"  Snippet: {result.snippet}")

# 4. Get specific conversation by ID
conversation = adapter.get_conversation_by_id(export_file, "conv-abc123")
if conversation:
    print(f"Found: {conversation.title}")

CLI Usage (Built on Library)

# List all conversations
echomine list export.json

# Search by keywords
echomine search export.json --keywords "algorithm,design" --limit 10

# Search by exact phrase (v1.1.0+)
echomine search export.json --phrase "algo-insights"

# Boolean match mode: require ALL keywords (v1.1.0+)
echomine search export.json -k "python" -k "async" --match-mode all

# Exclude unwanted results (v1.1.0+)
echomine search export.json -k "python" --exclude "django" --exclude "flask"

# Role filtering: search only user/assistant messages (v1.1.0+)
echomine search export.json -k "refactor" --role user

# Combine all filters (v1.1.0+)
echomine search export.json --phrase "api" -k "python" --exclude "test" --role user --match-mode all

# Search by title (fast, metadata-only)
echomine search export.json --title "Project"

# Filter by date range
echomine search export.json --from-date "2024-01-01" --to-date "2024-03-31"

# Get conversation by ID
echomine get conversation export.json conv-abc123

# Get message by ID (with conversation hint for performance)
echomine get message export.json msg-def456 -c conv-abc123

# Export conversation to markdown (default)
echomine export export.json conv-abc123 --output algo.md

# Export as JSON
echomine export export.json conv-abc123 --format json --output algo.json

# JSON to stdout for piping
echomine export export.json conv-abc123 -f json | jq '.messages | length'

# JSON output for search results
echomine search export.json --keywords "python" --json | jq '.results[].title'

# Version info
echomine --version

Search Filter Logic: Content matching (phrases OR keywords) runs first; the post-filters (--exclude, --role, --title, and the date range) are then applied to the matches. See CLI Usage for details.
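The two-stage order described above can be sketched as follows. This is an illustration only: the `Message` type, the helper names, and the word-level matching are hypothetical simplifications, and the title filter and `--match-mode` are left out.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class Message:
    """Hypothetical, simplified message record for this example."""

    role: str
    text: str
    sent_on: date


def _words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


def matches_content(msg: Message, keywords: Sequence[str], phrases: Sequence[str]) -> bool:
    """Stage 1: a message matches if any phrase OR any keyword is present."""
    if not keywords and not phrases:
        return True  # No content criteria: everything proceeds to post-filtering.
    text = msg.text.lower()
    phrase_hit = any(p.lower() in text for p in phrases)
    keyword_hit = any(k.lower() in _words(text) for k in keywords)
    return phrase_hit or keyword_hit


def passes_post_filters(
    msg: Message,
    exclude: Sequence[str],
    role: Optional[str],
    from_date: Optional[date],
    to_date: Optional[date],
) -> bool:
    """Stage 2: every active post-filter must accept the message."""
    if any(term.lower() in _words(msg.text) for term in exclude):
        return False
    if role is not None and msg.role != role:
        return False
    if from_date is not None and msg.sent_on < from_date:
        return False
    if to_date is not None and msg.sent_on > to_date:
        return False
    return True


def search(
    messages: Iterable[Message],
    *,
    keywords: Sequence[str] = (),
    phrases: Sequence[str] = (),
    exclude: Sequence[str] = (),
    role: Optional[str] = None,
    from_date: Optional[date] = None,
    to_date: Optional[date] = None,
) -> list[Message]:
    hits = [m for m in messages if matches_content(m, keywords, phrases)]
    return [m for m in hits if passes_post_filters(m, exclude, role, from_date, to_date)]
```

Because the post-filters only ever remove matches, adding `--exclude`, `--role`, or a date range can narrow a result set but never widen it.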

See Quickstart Guide for detailed examples.

Development

Prerequisites

  • Python 3.12 or higher
  • Git

Setup Development Environment

# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=echomine --cov-report=html

# Run specific test categories
pytest -m unit           # Unit tests only
pytest -m integration    # Integration tests only
pytest -m contract       # Contract tests only
pytest -m performance    # Performance benchmarks

Code Quality

# Type checking (strict mode)
mypy src/

# Linting and formatting
ruff check .
ruff format .

# Run pre-commit hooks manually
pre-commit run --all-files

Project Structure

echomine/
├── src/echomine/           # Library source code
│   ├── models/             # Pydantic data models
│   ├── adapters/           # Provider adapters (OpenAI, etc.)
│   ├── parsers/            # Streaming JSON parsers
│   ├── search/             # Search and ranking logic
│   ├── exporters/          # Export formatters (markdown, JSON)
│   └── cli/                # CLI commands
├── tests/                  # Test suite
│   ├── unit/               # Unit tests
│   ├── integration/        # Integration tests
│   ├── contract/           # Protocol contract tests
│   └── performance/        # Performance benchmarks
└── specs/                  # Design documents
    └── 001-ai-chat-parser/ # Feature specification

Documentation

Full Documentation - Comprehensive guides, API reference, and examples

Performance

Echomine is designed for memory efficiency and speed:

  • Memory: O(1) memory usage regardless of file size (streaming-based)
  • Search: <30 seconds for 1.6GB files (10K conversations, 50K messages)
  • Listing: <5 seconds for 10K conversations

See Performance Requirements for benchmarks.
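To show why streaming keeps memory flat, the sketch below parses a top-level JSON array one element at a time using only the standard library. It is not Echomine's parser (which may rely on a dedicated streaming library); the function name `stream_json_array` and the chunk size are assumptions. Memory is bounded by the largest single element plus one chunk rather than by the file size.

```python
import io
import json
from typing import Any, Iterator, TextIO


def stream_json_array(fp: TextIO, chunk_size: int = 65536) -> Iterator[Any]:
    """Yield the elements of a top-level JSON array without loading the whole file."""
    decoder = json.JSONDecoder()
    buf = ""
    started = False
    while True:
        buf = buf.lstrip()
        if started and buf.startswith(","):
            # Skip the separator between elements (lenient: not strictly validated).
            buf = buf[1:].lstrip()
        if not started and buf.startswith("["):
            buf = buf[1:]
            started = True
            continue
        if started and buf.startswith("]"):
            return
        if started and buf:
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                pass  # Element is incomplete so far: fall through and read more.
            else:
                # Only accept a value once a following character is buffered,
                # so a number split across chunks is never yielded too early.
                if end < len(buf):
                    yield value
                    buf = buf[end:]
                    continue
        elif not started and buf:
            raise ValueError("expected a top-level JSON array")
        chunk = fp.read(chunk_size)
        if not chunk:
            raise ValueError("unexpected end of input")
        buf += chunk


# Usage: elements arrive one by one, even with a tiny chunk size.
sample = io.StringIO('[{"id": 1}, {"id": 2}]')
ids = [item["id"] for item in stream_json_array(sample, chunk_size=4)]
```

One trade-off of this simple approach: an element that spans many chunks is re-decoded from its start after each read, so the chunk size should comfortably exceed a typical element.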

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for:

  • Development setup and prerequisites
  • TDD workflow (RED-GREEN-REFACTOR cycle mandatory)
  • Testing guidelines (pytest, mypy --strict, ruff)
  • Code quality standards and conventions
  • Commit message format (conventional commits)
  • Pull request process

License

AGPL-3.0 License - See LICENSE file for details
