Echomine

Library-first tool for parsing AI conversation exports with search, filtering, and markdown export

Overview

Echomine is a Python library and CLI tool for parsing, searching, and exporting AI conversation data. It was initially designed for ChatGPT exports and uses a multi-provider adapter pattern so that other AI platforms (Claude, Gemini, etc.) can be supported in the future.
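To make the adapter idea concrete, here is a minimal sketch of how a provider-neutral interface could look. Every name in it (Conversation, ConversationProvider, ToyJsonLinesAdapter, list_titles) and the one-object-per-line file format are invented for illustration; they are not Echomine's actual classes or any real provider's export format.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, Protocol


@dataclass(frozen=True)
class Conversation:
    """Hypothetical provider-neutral record (not Echomine's real model)."""

    id: str
    title: str


class ConversationProvider(Protocol):
    """Hypothetical protocol: anything that can stream conversations from a file."""

    def stream_conversations(self, path: Path) -> Iterator[Conversation]: ...


class ToyJsonLinesAdapter:
    """Illustrative adapter for an invented one-JSON-object-per-line format."""

    def stream_conversations(self, path: Path) -> Iterator[Conversation]:
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():  # skip blank lines
                    record = json.loads(line)
                    yield Conversation(id=record["id"], title=record["title"])


def list_titles(provider: ConversationProvider, path: Path) -> list[str]:
    """Caller code depends only on the protocol, not on a concrete provider."""
    return [conv.title for conv in provider.stream_conversations(path)]
```

Under this assumption, supporting another platform means writing one more class with the same method; code such as list_titles stays unchanged.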

Key Features

  • Memory Efficient: Stream-based parsing handles 1GB+ files with constant memory usage
  • Advanced Search: BM25 relevance ranking with exact phrase matching, boolean logic, role filtering, and keyword exclusion
  • Message Snippets: Automatic preview generation for search results with match context
  • Statistics & Analytics: Calculate export statistics, conversation metrics, and temporal patterns
  • Rich CLI Output: Color-coded terminal formatting, tables, progress bars, and syntax highlighting
  • Multiple Export Formats: Export to Markdown (with YAML frontmatter), JSON, or CSV
  • Type Safe: Strict typing with Pydantic v2 and mypy --strict compliance
  • Library First: All CLI capabilities are available as an importable Python library
  • Multi-Provider Ready: Adapter pattern supports multiple AI export formats
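For readers unfamiliar with BM25, the following is a small, self-contained sketch of the standard Okapi BM25 formula over pre-tokenized documents. It is not Echomine's implementation; the function name bm25_scores and the parameter defaults k1=1.5 and b=0.75 are common textbook choices assumed here.

```python
import math
from collections import Counter


def bm25_scores(
    query_terms: list[str],
    documents: list[list[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> list[float]:
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(documents)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))

    scores = []
    for doc in documents:
        term_counts = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_counts[term]
            if tf == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates (k1) and is normalized by document length (b).
            norm = tf + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append(score)
    return scores
```

Documents that do not contain any query term score 0.0, and repeated occurrences of a term raise the score with diminishing returns.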

Design Principles

  1. Library-First Architecture: The CLI is built on top of the library, not vice versa
  2. Strict Type Safety: mypy --strict, no Any types in the public API
  3. Memory Efficiency: Stream-based parsing that never loads the entire file into memory
  4. Test-Driven Development: Every feature is validated test-first
  5. YAGNI: Simple solutions, no speculative features

See Constitution for complete design principles.
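The memory-efficiency principle can be illustrated with a standard-library sketch that yields the elements of a top-level JSON array one at a time instead of calling json.load on the whole file. This is an assumption-laden illustration, not Echomine's parser: the function name iter_json_array is invented, validation is minimal, and peak memory is bounded by the largest single element plus one chunk.

```python
import json
from typing import Any, Iterator, TextIO


def iter_json_array(handle: TextIO, chunk_size: int = 65536) -> Iterator[Any]:
    """Yield the elements of a top-level JSON array without loading it whole."""
    decoder = json.JSONDecoder()
    buffer = ""
    started = False
    while True:
        chunk = handle.read(chunk_size)
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break  # nothing buffered; read more input
            if not started:
                if buffer[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buffer, started = buffer[1:], True
                continue
            if buffer[0] == "]":
                return  # end of the array
            if buffer[0] == ",":
                buffer = buffer[1:]
                continue
            try:
                item, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # element is incomplete; read more input
            rest = buffer[end:].lstrip()
            if not rest or rest[0] not in ",]":
                break  # element may continue in the next chunk (e.g. a split number)
            yield item
            buffer = rest
        if not chunk:
            raise ValueError("input ended before the array was closed")
```

A caller would iterate with `for item in iter_json_array(open(path, encoding="utf-8"))` and process each element before the next one is decoded.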

Installation

From Source

# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks (optional)
pre-commit install

From PyPI

pip install echomine

Quick Start

Library API (Primary Interface)

from echomine import OpenAIAdapter, SearchQuery
from pathlib import Path

# Initialize adapter (stateless, reusable)
adapter = OpenAIAdapter()
export_file = Path("conversations.json")

# 1. List all conversations (discovery)
for conversation in adapter.stream_conversations(export_file):
    print(f"[{conversation.created_at.date()}] {conversation.title}")
    print(f"  Messages: {len(conversation.messages)}")

# 2. Search with keywords (BM25 ranking)
query = SearchQuery(keywords=["algorithm", "design"], limit=10)
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title} (score: {result.score:.2f})")
    print(f"  Preview: {result.snippet}")  # v1.1.0+: automatic snippets

# 3. Advanced search with filters (v1.1.0+)
from datetime import date
query = SearchQuery(
    keywords=["refactor"],
    phrases=["algo-insights"],  # Exact phrase matching
    match_mode="all",  # Require ALL keywords (AND logic)
    exclude_keywords=["test"],  # Filter out unwanted results
    role_filter="user",  # Search only user messages
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=5
)
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f"  Snippet: {result.snippet}")

# 4. Calculate statistics (v1.2.0+)
from echomine import calculate_statistics

stats = calculate_statistics(export_file)
print(f"Total conversations: {stats.total_conversations}")
print(f"Total messages: {stats.total_messages}")
print(f"Average messages: {stats.average_messages:.1f}")

# 5. Get specific conversation by ID
conversation = adapter.get_conversation_by_id(export_file, "conv-abc123")
if conversation:
    print(f"Found: {conversation.title}")
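The snippet field printed in the search examples is a short preview around a match. How Echomine builds it is not described here; the sketch below shows one simple way such a preview could be produced. The function name make_snippet and the radius parameter are hypothetical, and the code assumes lowercasing does not change string length (true for ASCII text).

```python
def make_snippet(text: str, keyword: str, radius: int = 30) -> str:
    """Return a short preview of `text` around the first match of `keyword`."""
    position = text.lower().find(keyword.lower())
    if position == -1:
        # No match: fall back to the beginning of the text.
        start, end = 0, min(len(text), 2 * radius)
    else:
        start = max(0, position - radius)
        end = min(len(text), position + len(keyword) + radius)
    # Mark truncation on either side with an ellipsis.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return f"{prefix}{text[start:end].strip()}{suffix}"
```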

CLI Usage (Built on Library)

# List all conversations
echomine list export.json

# Search by keywords
echomine search export.json --keywords "algorithm,design" --limit 10

# Search by exact phrase (v1.1.0+)
echomine search export.json --phrase "algo-insights"

# Boolean match mode: require ALL keywords (v1.1.0+)
echomine search export.json -k "python" -k "async" --match-mode all

# Exclude unwanted results (v1.1.0+)
echomine search export.json -k "python" --exclude "django" --exclude "flask"

# Role filtering: search only user/assistant messages (v1.1.0+)
echomine search export.json -k "refactor" --role user

# Combine all filters (v1.1.0+)
echomine search export.json --phrase "api" -k "python" --exclude "test" --role user --match-mode all

# Search by title (fast, metadata-only)
echomine search export.json --title "Project"

# Filter by date range
echomine search export.json --from-date "2024-01-01" --to-date "2024-03-31"

# View export statistics (v1.2.0+)
echomine stats export.json

# Get conversation by ID (v1.2.0+)
echomine get export.json conv-abc123

# Export conversation to markdown with YAML frontmatter (v1.2.0+)
echomine export export.json conv-abc123 --output algo.md

# Export as JSON
echomine export export.json conv-abc123 --format json --output algo.json

# Export as CSV (v1.2.0+)
echomine export export.json conv-abc123 --format csv --output algo.csv

# JSON output for search results
echomine search export.json --keywords "python" --json | jq '.results[].title'

# Version info
echomine --version

Search Filter Logic: Content matching (phrases OR keywords) runs first; the post-filters (--exclude, --role, --title, and the date range) are then applied to the matches. See CLI Usage for details.
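The two-stage order described above can be sketched on a simplified message model. Everything here is an illustrative assumption: the Message record, the function names, substring matching, and the way match_mode applies only to keywords are not taken from Echomine's source, and the conversation-level title filter is omitted.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Message:
    """Hypothetical, simplified message record for illustration only."""

    role: str
    text: str
    sent_on: date


def matches_content(message: Message, keywords: Sequence[str],
                    phrases: Sequence[str], match_mode: str = "any") -> bool:
    """Stage 1: a message matches if any phrase OR the keywords match."""
    text = message.text.lower()
    if any(phrase.lower() in text for phrase in phrases):
        return True
    if not keywords:
        return not phrases  # no content criteria at all: everything matches
    hits = [keyword.lower() in text for keyword in keywords]
    return all(hits) if match_mode == "all" else any(hits)


def passes_filters(message: Message, exclude: Sequence[str] = (),
                   role: Optional[str] = None,
                   from_date: Optional[date] = None,
                   to_date: Optional[date] = None) -> bool:
    """Stage 2: post-filters can only remove stage-1 matches."""
    text = message.text.lower()
    if any(term.lower() in text for term in exclude):
        return False
    if role is not None and message.role != role:
        return False
    if from_date is not None and message.sent_on < from_date:
        return False
    if to_date is not None and message.sent_on > to_date:
        return False
    return True


def search(messages: Sequence[Message], keywords: Sequence[str] = (),
           phrases: Sequence[str] = (), match_mode: str = "any",
           **filters) -> list[Message]:
    """Apply content matching first, then the post-filters."""
    candidates = [m for m in messages if matches_content(m, keywords, phrases, match_mode)]
    return [m for m in candidates if passes_filters(m, **filters)]
```

Because the post-filters run second, adding --exclude or --role to a query can shrink the result set but never add results to it.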

See Quickstart Guide for detailed examples.
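For readers unfamiliar with YAML frontmatter, the sketch below shows what a Markdown export with a metadata header could look like. The field names (id, title, created_at, message_count), the heading layout, and the function name to_markdown are assumptions made for illustration; the actual output of echomine export may differ.

```python
import json
from datetime import datetime


def to_markdown(conversation_id: str, title: str, created_at: datetime,
                messages: list[tuple[str, str]]) -> str:
    """Render (role, text) messages as Markdown with a YAML frontmatter header."""
    lines = [
        "---",
        # A JSON string is also a valid YAML double-quoted scalar, so
        # json.dumps safely quotes titles containing ':' or '"'.
        f"id: {json.dumps(conversation_id)}",
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"created_at: {created_at.isoformat()}",
        f"message_count: {len(messages)}",
        "---",
        "",
        f"# {title}",
    ]
    for role, text in messages:
        lines += ["", f"## {role.capitalize()}", "", text]
    return "\n".join(lines) + "\n"
```

Tools that understand frontmatter, such as static site generators and note-taking apps, can read the metadata block while the body stays plain Markdown.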

Development

Prerequisites

  • Python 3.12 or higher
  • Git

Setup Development Environment

# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=echomine --cov-report=html

# Run specific test categories
pytest -m unit           # Unit tests only
pytest -m integration    # Integration tests only
pytest -m contract       # Contract tests only
pytest -m performance    # Performance benchmarks
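The -m options above select tests by marker. As a sketch of how such markers can be registered, the conftest.py hook below uses pytest's documented config.addinivalue_line call. Whether Echomine registers its markers this way or in its project configuration is not stated here, and the marker descriptions are invented.

```python
# conftest.py (illustrative)
MARKERS = {
    "unit": "fast, isolated unit tests",
    "integration": "tests that exercise several components together",
    "contract": "tests that check protocol conformance",
    "performance": "benchmarks; may be slow",
}


def pytest_configure(config):
    """pytest hook: register custom markers so `-m <name>` selects them cleanly."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

A test is then tagged with a decorator such as @pytest.mark.unit, and pytest -m unit runs only the tests carrying that marker.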

Code Quality

# Type checking (strict mode)
mypy src/

# Linting and formatting
ruff check .
ruff format .

# Run pre-commit hooks manually
pre-commit run --all-files

Project Structure

echomine/
├── src/echomine/           # Library source code
│   ├── models/             # Pydantic data models
│   ├── adapters/           # Provider adapters (OpenAI, etc.)
│   ├── parsers/            # Streaming JSON parsers
│   ├── search/             # Search and ranking logic
│   ├── exporters/          # Export formatters (Markdown, JSON, CSV)
│   └── cli/                # CLI commands
├── tests/                  # Test suite
│   ├── unit/               # Unit tests
│   ├── integration/        # Integration tests
│   ├── contract/           # Protocol contract tests
│   └── performance/        # Performance benchmarks
└── specs/                  # Design documents
    └── 001-ai-chat-parser/ # Feature specification

Documentation

Full Documentation - Comprehensive guides, API reference, and examples

Performance

Echomine is designed for memory efficiency and speed:

  • Memory: O(1) memory usage regardless of file size (streaming-based)
  • Search: <30 seconds for 1.6GB files (10K conversations, 50K messages)
  • Listing: <5 seconds for 10K conversations

See Performance Requirements for benchmarks.
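To see why streaming keeps memory flat, the sketch below compares the peak allocation of a generator-based pass with one that first materializes every record, using the standard-library tracemalloc module. The data is synthetic and all function names are hypothetical; this is not Echomine's benchmark suite.

```python
import tracemalloc
from typing import Callable, Iterator


def fake_records(count: int) -> Iterator[str]:
    """Stand-in for a stream of parsed conversations (synthetic data)."""
    for index in range(count):
        yield f"conversation-{index:08d}-" + "x" * 64


def total_length_streaming(count: int) -> int:
    # Holds only one record in memory at a time.
    return sum(len(record) for record in fake_records(count))


def total_length_eager(count: int) -> int:
    # Materializes every record before processing any of them.
    records = list(fake_records(count))
    return sum(len(record) for record in records)


def peak_bytes(func: Callable[[int], int], count: int) -> int:
    """Return the peak traced allocation while running func(count)."""
    tracemalloc.start()
    try:
        func(count)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak
```

Calling peak_bytes on both functions with the same count shows the eager version's peak growing with the number of records, while the streaming version's peak stays roughly constant.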

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for:

  • Development setup and prerequisites
  • TDD workflow (RED-GREEN-REFACTOR cycle mandatory)
  • Testing guidelines (pytest, mypy --strict, ruff)
  • Code quality standards and conventions
  • Commit message format (conventional commits)
  • Pull request process

License

AGPL-3.0 License - See LICENSE file for details
