llmemory

A high-performance document memory system with vector search capabilities for Python applications.

Overview

llmemory provides intelligent document processing with:

Complete Document Management API – List, retrieve, search, and manage documents without direct database access
Retrieval built on PostgreSQL with pgvector, combining hybrid vector and BM25 search, multi-query expansion, and reranking
Multi-language support with automatic language detection and normalization
Hierarchical chunking & summaries with document-type specific configurations and optional auto-summaries
Production-ready monitoring with Prometheus metrics and searchable diagnostics
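
The idea behind hierarchical chunking can be pictured with a small sketch. This is not llmemory's implementation: the function names, the word-based sizes, and the dictionary layout are all assumptions made for illustration. Each small child chunk keeps a reference to the larger parent passage it came from, so a search hit can be shown with its surrounding context.

```python
from typing import Dict, List


def split_words(words: List[str], size: int, overlap: int = 0) -> List[List[str]]:
    """Split a word list into windows of `size` words that overlap by `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return windows


def hierarchical_chunks(
    text: str,
    parent_size: int = 200,  # hypothetical default, in words
    child_size: int = 50,    # hypothetical default, in words
    overlap: int = 10,       # hypothetical default, in words
) -> List[Dict[str, object]]:
    """Return child chunks, each carrying the text of its parent passage."""
    chunks = []
    for parent_index, parent in enumerate(split_words(text.split(), parent_size)):
        parent_text = " ".join(parent)
        for child in split_words(parent, child_size, overlap):
            chunks.append(
                {
                    "parent_index": parent_index,
                    "parent_text": parent_text,
                    "content": " ".join(child),
                }
            )
    return chunks
```

A production chunker would more likely split on tokens, sentences, or document structure rather than raw word counts; the sketch only shows the parent/child relationship.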

What's New

🔁 Multi-query expansion – Generate semantic + keyword variants automatically and fuse results with reciprocal rank fusion
🎯 Configurable reranking – Plug in OpenAI or cross-encoder rerankers (or use built-in heuristics) for higher precision on the final hit list
🧭 Query routing – Automatic answerability detection routes each query to the best retrieval strategy
🎨 Contextual retrieval – Anthropic-style chunk enrichment with document context for improved semantic matching
⚙️ HNSW presets – Choose fast, balanced, or accurate profiles to tune pgvector index parameters and query-time ef_search
📝 Chunk summaries – Capture short, metadata-aware synopses during ingestion and surface them with every search hit
📈 Richer diagnostics – Search history now records query variants, latency breakdowns, rerank status, and summary usage for easy tuning
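
To make the HNSW presets item more concrete, here is a rough sketch of what a preset could translate to in pgvector. pgvector's HNSW index accepts `m` and `ef_construction` at build time and reads `hnsw.ef_search` at query time. The preset values, the `HnswPreset` class, the helper names, and the `chunks`/`embedding` identifiers below are assumptions for illustration, not llmemory's actual settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HnswPreset:
    m: int                # max connections per graph node (build time)
    ef_construction: int  # candidate list size while building the index
    ef_search: int        # candidate list size while querying


# Hypothetical values, chosen only to show the speed/recall trade-off.
PRESETS = {
    "fast": HnswPreset(m=8, ef_construction=32, ef_search=20),
    "balanced": HnswPreset(m=16, ef_construction=64, ef_search=40),  # pgvector's defaults
    "accurate": HnswPreset(m=32, ef_construction=128, ef_search=100),
}


def index_sql(preset_name: str, table: str = "chunks", column: str = "embedding") -> str:
    """Build the CREATE INDEX statement for a preset (table and column names are assumed)."""
    p = PRESETS[preset_name]
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {p.m}, ef_construction = {p.ef_construction});"
    )


def search_setting_sql(preset_name: str) -> str:
    """Build the session setting that controls query-time recall."""
    return f"SET hnsw.ef_search = {PRESETS[preset_name].ef_search};"


print(index_sql("balanced"))
print(search_setting_sql("accurate"))
```

Larger values generally improve recall at the cost of index build time, memory, and query latency, which is the trade-off a fast/balanced/accurate choice exposes.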

Why llmemory?

Building applications with document search capabilities requires solving complex problems:

Vector embeddings for semantic understanding
Efficient chunking that preserves context
Hybrid search combining vectors and full-text
Multi-tenant isolation for SaaS applications
Performance optimization for large document sets

llmemory provides a production-ready solution for these challenges.

Key Features

🚀 Fast Search – HNSW indexes for sub-100 ms vector searches, with multi-query expansion and optional cross-encoder reranking for harder queries
🌍 Multi-language – Automatic detection and processing for 14+ languages
📊 Smart Chunking – Document-type aware chunking with contextual enrichment, optional inline summaries, and hierarchical parent context
🔍 Hybrid Search – Combines vector and text search with reciprocal rank fusion, query routing, and rerank scores
📈 Observable – Built-in Prometheus metrics and detailed search diagnostics
🏢 Multi-tenant – Owner-based isolation for SaaS applications
🔌 Flexible Embeddings – Support for OpenAI and local embedding models
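
Reciprocal rank fusion, which the Hybrid Search feature relies on, is easy to sketch. The version below is a generic illustration of the technique, not llmemory's code: the function name is made up, and `k = 60` is the constant commonly used in the literature rather than a value confirmed for this library. Each result list contributes `1 / (k + rank)` to a document's score, so items ranked well by both vector and text search rise to the top.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit in a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID to keep the output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # IDs ordered by vector similarity
text_hits = ["b", "d", "a"]    # IDs ordered by full-text relevance
print(reciprocal_rank_fusion([vector_hits, text_hits]))  # ['b', 'a', 'd', 'c']
```

Because the fusion uses only ranks, it needs no score normalization between the vector and full-text result lists.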

Quick Start

import asyncio

from llmemory import LLMemory, DocumentType, SearchType


async def main():
    # Initialize
    memory = LLMemory(
        connection_string="postgresql://localhost/mydb",
        openai_api_key="sk-..."
    )
    await memory.initialize()

    # Add a document - returns detailed results
    result = await memory.add_document(
        owner_id="workspace-123",
        id_at_origin="user-456",
        document_name="project-report.pdf",
        document_type=DocumentType.REPORT,
        content="Your document content here..."
    )
    print(f"Created {result.chunks_created} chunks in {result.processing_time_ms}ms")

    # List documents with filtering
    docs = await memory.list_documents(
        owner_id="workspace-123",
        document_type=DocumentType.REPORT,
        metadata_filter={"status": "active"}
    )

    # Search with document metadata
    results = await memory.search_with_documents(
        owner_id="workspace-123",
        query_text="project timeline",
        search_type=SearchType.HYBRID
    )
    for hit in results.results:
        print(f"Found in: {hit.document_name} - {hit.content[:100]}...")

    # Get statistics
    stats = await memory.get_statistics("workspace-123")
    print(f"Total: {stats.document_count} docs, {stats.chunk_count} chunks")


# `await` is not valid at the top level of a regular script, so run the coroutine explicitly
asyncio.run(main())

Installation

# Install using uv (recommended)
uv add llmemory

# Or using pip
pip install llmemory

Or with optional dependencies:

# Using uv
uv add "llmemory[monitoring]"  # For Prometheus metrics
uv add "llmemory[cache]"       # For Redis caching
uv add "llmemory[local]"       # For local embeddings
uv add "llmemory[reranker-local]"  # For local cross-encoder reranking
uv add "llmemory[bench]"       # For BEIR benchmarking harness

# Using pip
pip install "llmemory[monitoring]"  # For Prometheus metrics
pip install "llmemory[cache]"       # For Redis caching
pip install "llmemory[local]"       # For local embeddings
pip install "llmemory[reranker-local]"  # For local cross-encoder reranking
pip install "llmemory[bench]"       # For BEIR benchmarking harness

Documentation

📖 Installation Guide - Detailed setup instructions
🚀 Quick Start - Get running in 5 minutes
🎯 Usage Patterns - Standalone vs shared pool patterns
📚 API Reference - Complete API documentation
🔧 Integration Guide - Framework integration patterns
🗄️ Migration Guide - How migrations work in each pattern
📊 Monitoring Guide - Production monitoring setup
💡 Examples - Working examples for common use cases
🧪 bench/beir_runner.py - BEIR benchmarking harness (requires llmemory[bench])

Performance

Search latency: < 100 ms (p95) with HNSW indexes in place
Throughput: 1000+ searches/second with caching enabled
Document processing: Handles documents up to 1 MB efficiently
Multi-language: Processes 14+ languages with automatic detection
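
Figures like these depend on hardware, data size, and index settings, so it is worth measuring p95 latency on your own deployment. The helper below is a generic sketch: `measure_p95_ms` and `p95` are hypothetical names, and `search` stands for any async callable you supply, for example a small wrapper around your own search call.

```python
import asyncio
import statistics
import time
from typing import Awaitable, Callable, Iterable, List


def p95(latencies_ms: List[float]) -> float:
    """Return the 95th percentile of a list of latencies (needs at least two samples)."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples to compute a percentile")
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]


async def measure_p95_ms(
    search: Callable[[str], Awaitable[object]],
    queries: Iterable[str],
) -> float:
    """Run each query once, sequentially, and report p95 wall-clock latency in ms."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        await search(query)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return p95(latencies)


async def fake_search(query: str) -> None:
    """Stand-in for a real search call; replace it with your own coroutine."""
    await asyncio.sleep(0.001)


if __name__ == "__main__":
    print(asyncio.run(measure_p95_ms(fake_search, ["q1", "q2", "q3", "q4"])))
```

Queries run one at a time here, so the result reflects per-request latency rather than throughput under concurrent load.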

Requirements

PostgreSQL 14+ with pgvector extension
Python 3.10+
OpenAI API key (or local embedding models)

License

MIT License - see LICENSE file for details.

Release history

0.5.0 - Jan 15, 2026
0.4.0 - Oct 26, 2025 (this version)

Download files

Source distribution: llmemory-0.4.0.tar.gz (230.6 kB), uploaded Oct 26, 2025
Built distribution: llmemory-0.4.0-py3-none-any.whl (83.5 kB), uploaded Oct 26, 2025, Python 3

File details

Details for the file llmemory-0.4.0.tar.gz.

File metadata

Download URL: llmemory-0.4.0.tar.gz
Upload date: Oct 26, 2025
Size: 230.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.4

File hashes

Hashes for llmemory-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`647f91d698fc21cc4f79d7b0646e91c35a8094948860ee8dc60288d80c30a100`
MD5	`51201a9ebb76d9be5a3b725b9d509309`
BLAKE2b-256	`38a7f0e84f3d358dd2e3da967095d75e80b081ae98ff4bc1f440df84af103f56`

File details

Details for the file llmemory-0.4.0-py3-none-any.whl.

File metadata

Download URL: llmemory-0.4.0-py3-none-any.whl
Upload date: Oct 26, 2025
Size: 83.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.4

File hashes

Hashes for llmemory-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`642d7017c4c4c29c0caa5b1b6a8c82030764c57dbff662a16cda18815ffecdc5`
MD5	`4c14722c878474bf7cfa60d44b3e2765`
BLAKE2b-256	`b5cdcbeafd4b6b9d8b9263de211794c7adb958377be41efa396d03b5b8f1e6ce`
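
The SHA256 digests listed above can be used to check a downloaded file before installing it. The sketch below uses only the Python standard library; the function names are illustrative, and the file path in the usage comment assumes the wheel was saved to the current directory.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Example usage (assumes the wheel is in the current directory):
# matches_published_hash(
#     "llmemory-0.4.0-py3-none-any.whl",
#     "642d7017c4c4c29c0caa5b1b6a8c82030764c57dbff662a16cda18815ffecdc5",
# )
```

pip can also enforce hashes itself through `--require-hashes` with a requirements file, which avoids manual checks in automated installs.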
