Markdown-first memory infrastructure for AI agents with hybrid search

memtomem

🚧 Alpha — APIs, defaults, and on-disk config surfaces may still change between 0.1.x releases. Feedback and issue reports are especially welcome at github.com/memtomem/memtomem/issues.

Markdown-first long-term memory infrastructure for AI agents. Hybrid keyword + semantic search across your notes, docs, and code via the Model Context Protocol.

Core philosophy: .md files are the source of truth and the vector database is a derived cache. Manage memories as plain text files — memtomem makes them instantly searchable.

Built for:

  • AI agents (Claude Code, Cursor, Windsurf, Claude Desktop) that need to remember between sessions
  • Developers who want a searchable knowledge base built from their existing markdown notes — no proprietary database, no vendor lock-in
  • Multilingual content (English, Korean, Japanese, Chinese) via bge-m3 embeddings

Quick Start

# 1. Install memtomem (requires Python 3.12+)
uv tool install memtomem        # or: pipx install memtomem

# 2. Run the setup (preset picker → memory_dir + MCP)
mm init    # on PATH after `uv tool install` — no `uv run` needed

The picker offers three presets and an Advanced fallback:

| Preset | Embedding | Reranker | Tokenizer |
|---|---|---|---|
| Minimal | BM25 only (no download) | | unicode61 |
| English (Recommended) | ONNX bge-small-en-v1.5 (~33 MB, 384d) | English (ms-marco-MiniLM-L-6-v2) | unicode61 |
| Korean-optimized | ONNX bge-m3 (~1.2 GB, 1024d) | Multilingual (jina-reranker-v2) | kiwipiepy |
| Advanced | — | (full 10-step wizard, all options) | |

Pick a preset interactively, or use mm init -y (minimal), mm init --preset korean -y, or mm init --advanced for scripted runs. See Embeddings for the full model matrix.

Then in your AI editor, ask:

"Call the mem_status tool"   →  confirms the server is connected
"Index my notes folder"      →  mem_index(path="~/notes")
"Search for deployment"      →  mem_search(query="deployment checklist")
"Remember this insight"      →  mem_add(content="...", tags=["ops"])

That's it. Your agent now has a long-term memory built from plain markdown files.
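
As a rough illustration of what the editor sends for those prompts, the sketch below builds MCP `tools/call` requests in Python. The tool names and arguments are the ones listed above; the `tool_call` helper is hypothetical, the initialization handshake and transport are omitted, and in the default core mode the server may route some of these actions through mem_do rather than expose them directly.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request IDs


def tool_call(name, **arguments):
    """Build an MCP `tools/call` JSON-RPC request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


requests = [
    tool_call("mem_index", path="~/notes"),
    tool_call("mem_search", query="deployment checklist"),
    tool_call("mem_add", content="...", tags=["ops"]),
]
for request in requests:
    print(json.dumps(request))
```

In practice the MCP client inside the editor constructs and sends these messages, so nothing here needs to be written by hand.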

For full setup, OpenAI configuration, and troubleshooting, see the Getting Started guide.

Prefer no install? (uvx direct, MCP only)

If you'd rather skip the CLI install, uvx downloads and runs memtomem on demand. ~/.memtomem/memories is always indexed. To index AI tool memory folders (Claude Code per-project memory, Claude plans, Codex memories), run mm init once and pick the surfaces you want; nothing is added silently. To add custom paths, set MEMTOMEM_INDEXING__MEMORY_DIRS.

claude mcp add memtomem -s user -- uvx --from memtomem memtomem-server

Or add the following to your MCP client config file — the path depends on the editor: ~/.cursor/mcp.json (Cursor), ~/.codeium/windsurf/mcp_config.json (Windsurf), ~/Library/Application Support/Claude/claude_desktop_config.json (Claude Desktop), or ~/.gemini/settings.json (Gemini CLI):

{
  "mcpServers": {
    "memtomem": {
      "command": "uvx",
      "args": ["--from", "memtomem", "memtomem-server"],
      "env": {
        "MEMTOMEM_INDEXING__MEMORY_DIRS": "[\"/path/to/your/notes\"]"
      }
    }
  }
}
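
The MEMTOMEM_INDEXING__MEMORY_DIRS value in the config above is a JSON list stored as a string, which is why its inner quotes are escaped. A minimal Python sketch that generates the same entry follows; `build_memtomem_entry` is a hypothetical helper for illustration, not part of memtomem.

```python
import json


def build_memtomem_entry(memory_dirs):
    """Return an mcpServers entry like the one shown above.

    Hypothetical helper for illustration; not part of memtomem.
    """
    return {
        "command": "uvx",
        "args": ["--from", "memtomem", "memtomem-server"],
        "env": {
            # The value is a JSON list serialized to a string, so it gets
            # escaped a second time when the whole config is dumped.
            "MEMTOMEM_INDEXING__MEMORY_DIRS": json.dumps(memory_dirs),
        },
    }


config = {"mcpServers": {"memtomem": build_memtomem_entry(["/path/to/your/notes"])}}
print(json.dumps(config, indent=2))
```

Generating the value with `json.dumps` avoids hand-escaping mistakes when more than one directory is listed.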

Key Features

  • 🔍 Hybrid search — BM25 (FTS5) + dense vectors (sqlite-vec) merged via Reciprocal Rank Fusion. Keyword search matches exact terms, semantic search matches meaning, and a single query uses both.
  • 📦 Semantic chunking — heading-aware Markdown, AST-based Python, tree-sitter JS/TS, structure-aware JSON/YAML/TOML
  • ♻️ Incremental indexing — chunk-level SHA-256 diff means only changed chunks get re-embedded
  • 🏷️ Namespaces — scope memories into groups (work / personal / project) with optional auto-derivation from folder names
  • 🧹 Maintenance — near-duplicate detection with merge, time-based score decay, TTL expiration, auto-tagging
  • 🔄 Export / import — JSON bundle backup and restore with re-embedding
  • 🌐 Web UI — single-page dashboard for search, sources, indexing, tags, and timeline (mm web --dev unlocks the full maintainer surface, including Sessions, Working Memory, and Health Report)
  • 🛠️ 74 MCP tools — the full feature surface is exposed as MCP tools; in core mode (the default), the mem_do meta-tool routes all registered actions to keep context usage minimal
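
The Reciprocal Rank Fusion step in the hybrid search bullet can be sketched in a few lines of Python. This is a generic illustration, not memtomem's code: the constant k=60 is the value commonly used for RRF, and the chunk IDs are made up; the constant and any per-retriever weights memtomem actually uses are not stated here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs; each list contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["deploy.md#2", "ops.md#1", "notes.md#7"]      # keyword ranking
vector_hits = ["ops.md#1", "runbook.md#3", "deploy.md#2"]  # semantic ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Because RRF only looks at ranks, BM25 scores and vector distances never need to be normalized onto a common scale, and a chunk that ranks well in both lists rises above one that appears in only one.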

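The chunk-level SHA-256 diff behind incremental indexing can be illustrated the same way. The sketch below assumes the index keeps a set of digests from the previous run; how memtomem actually stores digests and handles deleted chunks is not described here, and both function names are hypothetical.

```python
import hashlib


def chunk_digest(text):
    """SHA-256 hex digest of a chunk's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunks_to_embed(chunks, known_digests):
    """Return only the chunks whose digest is not already indexed."""
    return [c for c in chunks if chunk_digest(c) not in known_digests]


old_chunks = ["# Deploy\nRun the checklist.", "# Rollback\nRevert the tag."]
index = {chunk_digest(c) for c in old_chunks}  # digests kept from the last run

new_chunks = [
    "# Deploy\nRun the checklist.",                  # unchanged
    "# Rollback\nRevert the tag, then notify ops.",  # edited
]
changed = chunks_to_embed(new_chunks, index)
print(len(changed))
```

Only the edited chunk is returned, so only that chunk would need a new embedding.
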
Documentation

Full documentation lives in the memtomem GitHub repo:

| Guide | Topic |
|---|---|
| Getting Started | Start here — install, setup wizard, first use |
| Reference | Complete feature reference — all tools and patterns |
| Configuration | All MEMTOMEM_* environment variables |
| Embeddings | ONNX, Ollama, and OpenAI providers, model dimensions, switching models |
| MCP Client Setup | Editor-specific configuration |
| memtomem-stm | Optional STM proxy for proactive memory surfacing (separate package) |

License

Apache License 2.0 — see LICENSE for details.
