
LLM Context Compression and Retrieval Engine — zero dependencies, sub-100ms queries, 40-70% token reduction

Mnemosyne

Intelligent code retrieval engine — index, search, and compress any codebase with zero dependencies.



Mnemosyne indexes your codebase into a local SQLite store, scores every chunk with a 6-signal hybrid retriever, compresses results with AST awareness, and returns exactly what you need within a token or result budget. It runs entirely locally — no API keys, no cloud, no runtime dependencies beyond Python 3.11+.
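The budget-constrained selection step can be pictured with a toy greedy packer: rank chunks by relevance per token and take them until the budget is spent. This is a hypothetical sketch of the general idea, not Mnemosyne's actual code; the chunk names, scores, and token counts are invented.

```python
# Hypothetical sketch of budget-constrained, value-per-token selection --
# illustrates the general idea, not Mnemosyne's internals.
def pack_chunks(chunks, budget):
    """Greedily pick chunks by relevance-per-token until the budget is spent.

    chunks: list of (chunk_id, relevance_score, token_count) tuples.
    """
    ranked = sorted(chunks, key=lambda c: c[1] / c[2], reverse=True)
    picked, spent = [], 0
    for chunk_id, score, tokens in ranked:
        if spent + tokens <= budget:
            picked.append(chunk_id)
            spent += tokens
    return picked, spent

chunks = [("auth.py:login", 9.0, 300), ("auth.py:hash", 4.0, 100),
          ("db.py:conn", 6.0, 600)]
print(pack_chunks(chunks, 500))  # → (['auth.py:hash', 'auth.py:login'], 400)
```

Greedy density packing is a standard approximation for this kind of knapsack-shaped problem; the real ranker fuses more signals than a single relevance score.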

Install

pip install mnemosyne-engine

Quick Start

mnemosyne init                                    # create .mnemosyne/ workspace
mnemosyne ingest                                  # index your codebase
mnemosyne query "How does authentication work?"   # search

Performance

| Metric | Result |
| --- | --- |
| Query latency | <20 ms warm, <500 ms cold |
| Token reduction | 73% on an 829-file production repo |
| File retrieval accuracy | 100% across all test sets |
| Ingestion speed | 167 files/sec (~0.5 s for 87 files) |
| Compression | 40-70% per chunk, AST-aware |
| Memory footprint | 10-30 MB total |
| Storage overhead | ~4.2 bytes per indexed token |
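The storage-overhead figure lends itself to a quick back-of-envelope check. The token count below is illustrative, not a measured repo:

```python
# Back-of-envelope index-size estimate from the table's ~4.2 bytes/token
# figure. The 2-million-token codebase is a made-up example.
BYTES_PER_TOKEN = 4.2

def index_size_mb(total_tokens):
    return total_tokens * BYTES_PER_TOKEN / 1_000_000

print(f"{index_size_mb(2_000_000):.1f} MB")  # → 8.4 MB
```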

Features

  • Hybrid 6-signal search — BM25, TF-IDF, symbol matching, usage frequency, predictive prefetch, and optional dense embeddings fused via Reciprocal Rank Fusion
  • Cost-model ranking — results ranked by value per token, not just relevance; think of it as a query optimizer for code retrieval
  • AST-aware compression — four-stage pipeline preserves signatures, docstrings, and control flow while collapsing boilerplate (20-60% reduction)
  • Self-tuning ARC cache — adapts between recency and frequency patterns automatically, persisted across sessions
  • Delta-aware tracking — detects file and chunk-level changes, delivers diffs instead of full content (80-95% savings on incremental queries)
  • Content deduplication — SHA-256 addressed storage eliminates duplicate chunks across files
  • 7-language structural chunking — Python (AST), JavaScript/TypeScript, Go, C#, Rust, Java, Kotlin, plus Markdown and plain text
  • Daemon mode — JSON-RPC over Unix socket keeps indexes warm for sub-20ms queries
  • Full audit trail — append-only JSON-lines log of every operation
  • Zero runtime dependencies — pure Python 3.11+ stdlib. One pip install, no conflicts
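Reciprocal Rank Fusion itself is simple to sketch: each signal contributes 1/(k + rank) for every document it ranks, and the summed scores decide the final order. A minimal illustration with made-up signal outputs, not Mnemosyne's internals:

```python
# Minimal Reciprocal Rank Fusion: fuse several ranked lists by summing
# 1 / (k + rank) per document. Signal names and results are illustrative.
def rrf_fuse(rankings, k=60):
    """rankings: dict of signal name -> ordered list of doc ids (best first)."""
    scores = {}
    for ranked in rankings.values():
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse({
    "bm25":   ["auth.py", "db.py", "util.py"],
    "tfidf":  ["db.py", "auth.py"],
    "symbol": ["auth.py"],
})
print(fused)  # → ['auth.py', 'db.py', 'util.py']
```

RRF's appeal is that it needs no score normalization across signals, only ranks, which is why it is a common choice for fusing lexical and dense retrievers.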

Use Cases

Code search and navigation — Natural language queries return ranked, deduplicated results with function-level precision. Symbol-aware search finds implementations directly, not just string matches.

LLM context optimization — Feed Claude, GPT, Cursor, or any LLM agent the right tokens from a 100K+ codebase. Drop-in integration via instruction files cuts API spend by 70%+ on context-heavy workflows.

Developer onboarding — New team members query "how does X work?" and get ranked results spanning models, middleware, and routes — complete function signatures with context, not random line hits.

PR review and CI/CD — Delta tracking identifies which functions changed and pulls their callers and tests into a review bundle. Pipe query output into automated review pipelines.
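Function-level change detection of this sort can be approximated with the stdlib alone: parse two versions of a module and report functions whose AST differs. A hedged sketch of the idea, not Mnemosyne's chunk-level delta tracker:

```python
import ast

# Hypothetical sketch: list functions whose body changed between two
# versions of a Python module, by comparing AST dumps per function name.
def changed_functions(old_src, new_src):
    def func_dumps(src):
        tree = ast.parse(src)
        return {n.name: ast.dump(n) for n in ast.walk(tree)
                if isinstance(n, ast.FunctionDef)}
    old, new = func_dumps(old_src), func_dumps(new_src)
    return sorted(name for name in new
                  if name not in old or old[name] != new[name])

old = "def pay(x):\n    return x\n\ndef log(m):\n    print(m)\n"
new = "def pay(x):\n    return x * 2\n\ndef log(m):\n    print(m)\n"
print(changed_functions(old, new))  # → ['pay']
```

A review bundle would then fetch the callers and tests of each changed function from the index, which is the part plain `difflib` line diffs cannot do.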

Legacy codebase archaeology — Before a rewrite or migration, index a large monolith to answer "what calls this table?" or "which modules depend on this API?" Hybrid search beats grep for cross-cutting queries.

Security audit surface mapping — Query for patterns like `exec(`, `eval(`, or `subprocess.call` with usage-frequency ranking to prioritize the most-called dangerous patterns. The audit log provides an evidence trail for compliance.
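A frequency-ranked pattern scan can be sketched with the stdlib `ast` module: count calls to risky builtins and sort by count. This stands in for the idea only; Mnemosyne's search is text- and symbol-based, and the source string below is invented.

```python
import ast
from collections import Counter

# Illustrative stand-in for a usage-frequency-ranked pattern scan:
# count direct calls to risky builtins in a source string.
RISKY = {"exec", "eval"}

def risky_calls(src):
    counts = Counter()
    for node in ast.walk(ast.parse(src)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in RISKY):
            counts[node.func.id] += 1
    return counts.most_common()

src = "eval(a)\neval(b)\nexec(c)\n"
print(risky_calls(src))  # → [('eval', 2), ('exec', 1)]
```

Attribute calls such as `subprocess.call` would need an extra `ast.Attribute` check; the hybrid retriever handles those lexically.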

Incident response — On-call engineer searches "payment timeout retry" at 3am. Gets ranked, compressed results across the codebase instead of grepping blindly.

Migration impact analysis — Planning a framework upgrade or library swap? Query every usage of the old API, ranked by call frequency, to estimate effort and prioritize high-traffic paths.

LLM Agent Integration

Add to your CLAUDE.md, .cursorrules, or equivalent instruction file:

Before answering questions about this codebase, run:
  ! mnemosyne query "<question>" --budget 8000
Use the returned chunks as primary context. Only read additional files if needed.

Works with Claude Code, Cursor, Aider, Copilot, and any agent that can run shell commands.

CLI

| Command | Purpose |
| --- | --- |
| `init` | Create workspace and config |
| `ingest` | Index files (incremental; `--full` to rebuild) |
| `query` | Search with a token budget |
| `stats` | Index and cache statistics |
| `compress` | Preview compression for a file |
| `delta` | Show changes since the last index |
| `cache` | Manage the ARC cache (`show`, `clear`, `warm`) |
| `daemon` | Persistent server for warm-start queries |
| `analytics` | Precision metrics and usage patterns |
| `audit` | Operation log |
| `health` | Index integrity checks |
| `gc` | Garbage-collect stale data |
| `benchmark` | Run precision benchmarks |

Documentation

| Document | Contents |
| --- | --- |
| REFERENCE.md | Full CLI reference, configuration, architecture, integration guides |
| ALGORITHMS.md | Algorithm details with academic paper references |
| TUNING.md | Precision tuning guide |
| CHANGELOG.md | Version history |

License

Dual-licensed: AGPL-3.0 for open-source use | Commercial license for proprietary embedding.

Copyright 2026 Cast Rock Innovation L.L.C. (DBA: Cast Net Technology)

Download files

Source Distribution

mnemosyne_engine-1.0.2.tar.gz (176.7 kB)

Built Distribution

mnemosyne_engine-1.0.2-py3-none-any.whl (207.5 kB)
File details for mnemosyne_engine-1.0.2.tar.gz:

  • Size: 176.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 3605ab43a4e0039cf1d992de7727d7609d6e7b1356ffd50c6f1f48d34c4fec34 |
| MD5 | 1d0483c48125c1d9087cfbd0ef4d122f |
| BLAKE2b-256 | 18edb73911163ba480abae244738f59698d06ecff627875281578b74c0586b8d |

File details for mnemosyne_engine-1.0.2-py3-none-any.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 128bd68dbdd374eda9fca2c402b025240d55b4428400b3d8ed145c018d904488 |
| MD5 | 22c6eba0925c4b66998b9fc11a4ec198 |
| BLAKE2b-256 | 11032788ba12e8efd9025fdb70a550d24e1e44baf6088200f309b886991ac24b |
