
Local, token-free file digestion → knowledge-graph memory for Claude. Convert any attachment to Markdown and digest it into a graph + exportable memory, entirely on-device. Tuned for Apple silicon.



Memorised them All

Local, token-free document memory for Claude — turn any folder of files into a knowledge graph you can recall, for ~0 context tokens.

An MCP server & plugin for Claude Desktop and Claude Code that converts PDFs, Office docs, images, and audio to Markdown on your machine, then digests them into a searchable knowledge graph + mind map — privately, with no cloud and no API keys.


Quickstart · Why token-free · Features · How it works · Tools · Use cases · Comparison · Config · Platforms · Privacy · FAQ

100% local · free & open-source · auto-installing · Apple-silicon optimised · by GRU-953


The idea in one line: Claude tokens are expensive; your computer's compute is free. So every heavy step — converting documents, extracting knowledge, embedding, summarising — runs locally, and Claude only ever gets back a tiny tool result. Digesting a 500-page folder costs roughly zero context tokens.

Point Memorised them All at a folder. It converts every attachment to Markdown locally, then builds a layered knowledge graph — a global synopsis, per-theme summaries, per-document notes, an exportable Markdown bundle, and an offline interactive mind map — and lets Claude recall from it for next to nothing.

"Memorise everything in ~/Documents/research."
"What did my documents say about the Q3 budget?"
"Open the mind map."

🚀 Quickstart

Claude Desktop

Download memorised-them-all.mcpb from the latest release and double-click it (Settings → Extensions). It bootstraps the local stack on first launch.

Claude Code

/plugin marketplace add GRU-953/memorised-them-all
/plugin install memorised-them-all

pip (any OS)

pip install memorised-them-all
mta status
mta digest ~/Documents/research

Homebrew (macOS/Linux)

brew install GRU-953/memorised-them-all/mta

Requirements: Python ≥ 3.10. The installer fetches everything else (Ollama, Tesseract, ffmpeg, the latest MarkItDown, and local models) — see Platform support.

💡 Why token-free

Most "chat with your documents" tools stream document text into the model — you pay tokens to ingest and to recall. Memorised them All never does that:

| Step | Where it runs | Tokens to Claude |
|---|---|---|
| Convert PDF / Office / image / audio → Markdown | your machine (MarkItDown · Tesseract · Whisper · Ollama) | 0 |
| Extract entities · relations · facts | your machine (local LLM, classical fallback) | 0 |
| Embed · resolve · build graph · summarise | your machine (Ollama · NetworkX) | 0 |
| digest result | | counts & paths only (~140 tokens) |
| recall result | | a small, citable slice, never the documents |

Tool results are hard-capped in size, so the guarantee holds even on the high-accuracy path.
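A hard cap of this kind can be sketched in a few lines. This is only an illustration of the idea, not the project's code: the function name `cap_result`, the `truncated` flag, and the 600-character limit are all assumptions made for the example.

```python
import json

MAX_RESULT_CHARS = 600  # hypothetical cap; the real limit is not stated in this README


def cap_result(summary: dict, items: list[str], max_chars: int = MAX_RESULT_CHARS) -> dict:
    """Add items to a tool result until its JSON encoding would exceed max_chars.

    Assumes the summary on its own already fits under the cap.
    """
    result = {**summary, "items": [], "truncated": False}
    for item in items:
        candidate = {**result, "items": result["items"] + [item]}
        if len(json.dumps(candidate, ensure_ascii=False)) > max_chars:
            # Stop before the item that would break the cap and flag the cut.
            result["truncated"] = True
            break
        result = candidate
    return result
```

Measuring the encoded result rather than counting items means the cap holds no matter how long an individual fact or summary is.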

✨ Features

  • 📄 Universal local conversion — PDF, Word, Excel, PowerPoint, HTML, EPub, Outlook .msg, CSV/JSON/XML, images (OCR + vision captioning), and audio (on-device transcription) → clean Markdown. Scanned PDFs are OCR'd; OCR in 100+ languages when the matching Tesseract language packs are installed (e.g. eng+ben).
  • 🕸️ Layered knowledge graph (GraphRAG-style) — entities, typed relations and atomic facts, grouped into themes by community detection, with a global synopsis and per-theme summaries — all built by local models.
  • 🧭 Offline interactive mind map — a single self-contained mindmap.html (Cytoscape inlined, zero network).
  • 📝 Exportable, portable memory: graph.json, memory.md, and per-document notes you can copy to any machine and reuse.
  • Two modes: high-accuracy (local LLM) and fast (--fast). Fast mode is deterministic and skips the per-chunk LLM, so its speed-up scales with corpus size and model: benchmarked at ≈25–100× with the default 7B extractor (≈98× on a 5-file set, ≈26× on 12 files). It suits large or frequently refreshed corpora.
  • 🔁 Reusable named projects: accumulate many folders into one memory; use forget to delete a project.
  • 🍎 Apple-silicon first — performance-core parallelism, GPU Whisper via MLX, unified-memory-aware concurrency. Runs on Intel macOS, Linux, and Windows too.
  • ⚙️ Auto-installing & auto-updating — installs a pinned MarkItDown from PyPI so the first run works offline, and refreshes it on a throttled daily check (import-checked, with rollback); the latest upstream MarkItDown is opt-in (MTA_MARKITDOWN_UPSTREAM=on, pinned to a commit). Starts the model server on demand and stops it after 5 minutes idle.
  • 🌍 Multilingual — Unicode-aware entity resolution (Bengali, CJK, Cyrillic, accented Latin) and OCR in many languages.
  • 🛟 Crash-safe & reusable — memory is written atomically, so an interrupted digest never corrupts an existing project; recall reports a low_confidence signal so Claude can decline when the answer isn't in your docs.
  • 🔒 Private by design — no cloud, no API keys, no telemetry. Your files never leave your computer.
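The crash-safe claim rests on atomic writes. A common way to get that behaviour, shown here as a minimal sketch rather than the project's actual code, is to write to a temporary file in the same directory and then swap it into place with `os.replace`. The function name `write_json_atomically` is an assumption for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, payload: dict) -> None:
    """Write payload so readers see either the old file or the complete new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, ensure_ascii=False)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_name, path)  # replaces the target in a single step
    except BaseException:
        # Remove the partial temp file; the existing target was never touched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If the process dies mid-write, only the temporary file is incomplete, so an existing graph.json stays readable.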

🧠 How it works

attachments ─► CONVERT ─► SEGMENT ─► EMBED ─► EXTRACT ─► RESOLVE ─► GRAPH + COMMUNITIES ─► MATERIALISE
  pdf/docx/   MarkItDown  structure  nomic-   local LLM  canonical  NetworkX +             graph.json
  xlsx/img/   +Tesseract  +semantic  embed-   triples +  entities   Leiden / Louvain        memory.md
  audio/...   +Whisper    chunking   text     facts      (embed +   community summaries     memory/<doc>.md
              +Ollama                          (+class-   fuzzy +    (local LLM)             mindmap.html
               vision                           ical      acronym)                           vectors store
                                                fallback)
                                                                              │
   recall("…")  ◄── embed query locally ──  return ONLY a small, citable slice (themes + facts + provenance)

Everything between attachments and recall happens on your machine. The local-LLM step has a dependency-free classical fallback, so a digest always succeeds — even offline, even before any model is downloaded — and gets sharper once models are present.
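The fallback pattern described above can be sketched as follows. Everything here is illustrative: the `co_occurs_with` relation, the capitalised-term heuristic, and the function names are assumptions standing in for whatever the real classical extractor does.

```python
import re
from typing import Callable, Optional

Triple = tuple[str, str, str]


def classical_extract(text: str) -> list[Triple]:
    """Dependency-free stand-in: link neighbouring capitalised terms within a sentence."""
    triples: list[Triple] = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        names = re.findall(r"\b[A-Z][\w-]+(?:\s+[A-Z][\w-]+)*", sentence)
        for left, right in zip(names, names[1:]):
            triples.append((left, "co_occurs_with", right))
    return triples


def extract(text: str, llm_extract: Optional[Callable[[str], list[Triple]]] = None):
    """Prefer the local LLM; fall back to the classical path if it is absent or fails."""
    if llm_extract is not None:
        try:
            return llm_extract(text), "llm"
        except Exception:
            pass  # model missing, server down, malformed output, and so on
    return classical_extract(text), "classical"
```

Returning the path taken alongside the triples lets a caller report which extractor produced a given digest.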

🛠 Tools

Eight token-free MCP tools (plus the mta CLI). Every result is metadata or a small slice — document contents never return to the conversation.

| Tool | What it does |
|---|---|
| digest(paths, project?, reset?, fast?) | convert + digest files/dirs/globs; accumulates into the project (reset starts fresh, fast skips the LLM) |
| recall(query, project?, k?) | answer from memory: a small, citable slice (+ top_score & low_confidence relevance signal) |
| memory_overview(project?) | synopsis + themes |
| export_memory(dest, project?) | export portable Markdown + graph + mind map |
| list_digestible(directory) | list convertible files (paths/sizes only) |
| memory_status() | local stack health (Ollama, models, Tesseract, MarkItDown version) |
| open_mindmap(project?) | path to the offline interactive mind map |
| forget(project?) | delete a project's memory |
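MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the digest tool. The argument names come from the signature listed above; passing `paths` as a list and the helper name `tool_call` are assumptions made for the example.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, **arguments) -> str:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    # Optional arguments left as None are omitted so the server applies its defaults.
    args = {key: value for key, value in arguments.items() if value is not None}
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": args},
    })


request = tool_call("digest", paths=["~/Documents/research"], project=None, fast=True)
```

In normal use Claude Desktop or Claude Code sends these requests for you; the sketch is only meant to show what crosses the wire.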

CLI: mta digest <paths> [--fast] [--reset] · mta recall "<q>" · mta overview · mta export <dir> · mta mindmap --open · mta forget · mta status · mta update · mta doctor. In Claude Code, the slash commands /memorise, /recall, /memory-map, /memory-status, /export-memory are also available.

🎯 Use cases

  • Private RAG / "chat with your documents" — locally, with no cloud and no per-query token cost.
  • Research & literature memory — digest a folder of papers/PDFs and ask grounded questions.
  • Knowledge base for an agent — give Claude durable, reusable memory across sessions.
  • Meeting & audio notes — transcribe recordings on-device and fold them into the graph.
  • Scanned documents & receipts — OCR image-only PDFs and images into searchable memory.
  • Visual exploration — open the offline mind map to see how entities and themes connect.

⚖️ Comparison

Memorised them All Cloud "chat with docs" / hosted RAG Stock markitdown-mcp
Runs fully locally ✅ (conversion only)
Context-token cost to ingest and recall ~0 high high (returns text)
Knowledge graph + themes sometimes
Offline interactive mind map
Works offline / no API keys
Reusable, exportable memory files varies
Free & open-source (MIT) varies

⚙️ Configuration

All optional, sensible defaults; set via environment (CLI) or the extension settings (Desktop).

| Variable | Default | Meaning |
|---|---|---|
| MTA_HOME | ~/.memorised-them-all | where memories are stored |
| MTA_FAST | off | fast mode: skip the LLM (deterministic; benchmarked ≈25–100× faster, scales with corpus/model) |
| MTA_EXTRACT_MODEL | qwen2.5:7b | local LLM for extraction & summaries |
| MTA_EMBED_MODEL | nomic-embed-text | local embedding model |
| MTA_VISION_MODEL | moondream | image captioning |
| MTA_OCR_LANG | eng | Tesseract languages, e.g. eng+ben |
| MTA_WHISPER_MODEL | base | on-device transcription model |
| MTA_IDLE | 300 | seconds idle before Ollama is stopped |
| MTA_WORKERS / MTA_EXTRACT_WORKERS | 0 (auto) | parallel conversion / extraction workers |
| MTA_MAX_CHUNKS / MTA_MAX_FILE_MB | 1500 / 200 | workload & input-size caps (reported) |
| MTA_RECALL_MIN_SCORE | 0 (off) | drop recall hits below this cosine score (stricter grounding) |
| MTA_AUTO_UPDATE | on | daily update check: on (PyPI, default) · off · upstream (also pull the pinned upstream MarkItDown) |
| MTA_MARKITDOWN_UPSTREAM | off | pull the latest upstream MarkItDown commit (pinned to a SHA) instead of the PyPI build |
| MTA_NO_OLLAMA | unset | hard offline switch (classical + hashing) |
| MTA_PROFILE | unset | tuning profile: laptop · workstation · server · offline (an explicit MTA_* variable always wins) |
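The precedence rule for profiles (explicit variable, then profile, then default) could look like the sketch below. The defaults are copied from the configuration table; the contents of the `laptop` profile and the `resolve` helper are invented for illustration, since the README does not list what each profile sets.

```python
import os
from typing import Mapping, Optional

# Defaults as documented in the configuration table.
DEFAULTS = {"MTA_FAST": "off", "MTA_OCR_LANG": "eng", "MTA_IDLE": "300", "MTA_WORKERS": "0"}

# Hypothetical profile values, not taken from the project.
PROFILES = {"laptop": {"MTA_WORKERS": "2", "MTA_IDLE": "120"}}


def resolve(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve one setting: explicit MTA_* variable, then profile, then default."""
    env = os.environ if env is None else env
    if name in env:
        return env[name]  # an explicit variable always wins
    profile = PROFILES.get(env.get("MTA_PROFILE", ""), {})
    return profile.get(name, DEFAULTS[name])
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.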

💻 Platform support

Apple M-series is the primary, most-optimised target; other platforms use portable fallbacks.

| Platform | Status | Notes |
|---|---|---|
| macOS (Apple silicon) | ✅ optimised | performance-core pool, MLX GPU Whisper, unified-memory-aware |
| macOS (Intel) | ✅ supported | physical-core sizing via psutil, CPU Whisper |
| Linux | ✅ supported | apt/dnf/pacman install paths, CUDA Whisper if a GPU is present |
| Windows | 🧪 experimental | pip install memorised-them-all + mta serve (or python launch.py). The .mcpb bundle is macOS/Linux only. |

CI runs the test suite across Ubuntu, macOS, and Windows on Python 3.10 & 3.12.

📦 Generated files & reuse

Each project under MTA_HOME/projects/<name>/ is self-contained and portable:

| File | What it is |
|---|---|
| graph.json | source of truth: nodes, edges, communities, layered summaries (version-stamped, no absolute paths) |
| memory.md | compact, layered digest for reading / pasting |
| memory/<doc>.md | one note per source document |
| mindmap.html | offline interactive graph (Cytoscape inlined) |
| vectors.npz + vectors.json | local embeddings for recall |

A memory built once can be copied to another machine and reused read-only. export_memory bundles all of the above (including the vector store) into a folder you choose.

Versioning & migration: graph.json is a versioned schema (the project follows SemVer and Keep a Changelog). On upgrade, an older store is migrated in place so existing memories stay recall-readable; a store written by a newer build is backed up under projects/<name>/backups/ before anything overwrites it, so a downgrade never loses data. Public CLI flags and MCP tool signatures are preserved across minor versions.
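The upgrade and downgrade behaviour amounts to a three-way decision on the store's schema version. A minimal sketch, assuming dotted numeric version strings and using made-up action names:

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def plan_upgrade(store_version: str, build_version: str) -> str:
    """Decide what to do with an existing store before this build writes to it."""
    store, build = parse_version(store_version), parse_version(build_version)
    if store < build:
        return "migrate_in_place"   # older store: upgrade it so recall keeps working
    if store > build:
        return "backup_then_write"  # newer store: copy to backups/ before overwriting
    return "use_as_is"
```

Comparing tuples rather than raw strings matters once a component reaches two digits: as text, "1.10.0" would sort before "1.4.0".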

🔒 Privacy & security

100% local — no cloud APIs, no telemetry, no API keys. Your documents, the graph, the embeddings, and the memory files never leave your machine. The only network access is (a) downloading open-source dependencies/models on install and (b) a throttled once-a-day dependency-update check (disable with MTA_AUTO_UPDATE=off).

Hardened for processing untrusted files: argv-only subprocesses (no curl | sh), path-safe and collision-free output names, per-file size + decompression-bomb caps, prompt-injection data-delimiting, allow_pickle=False, and hard-capped recall results. See the full threat model in the docs.
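As one example of these measures, path-safe and collision-free output names can be produced by flattening the source path to a slug and appending a short hash of the full path. This is a sketch under assumptions: the function name, the 60-character slug limit, and the 8-character hash are choices made for the example, not the project's actual scheme.

```python
import hashlib
import re
from pathlib import PurePosixPath


def safe_note_name(source_path: str) -> str:
    """Map a source path to a flat note name that cannot escape the output folder."""
    # Only the final path component survives, so "../" segments are discarded.
    stem = PurePosixPath(source_path.replace("\\", "/")).stem
    # Keep word characters, dots and hyphens; every other run becomes "_".
    slug = re.sub(r"[^\w.-]+", "_", stem).strip("._") or "document"
    # A short hash of the full path keeps same-named files in different folders apart.
    digest = hashlib.sha256(source_path.encode("utf-8")).hexdigest()[:8]
    return f"{slug[:60]}-{digest}.md"
```

Two files called report.pdf in different folders therefore get different notes, and a hostile name such as ../../etc/passwd collapses to a harmless flat filename.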

❓ FAQ

Does it really cost no tokens? Conversion and digestion cost zero Claude tokens. recall returns a small, hard-capped slice (a few summaries/facts), so answers are far cheaper than pasting documents into chat.

Do I need a GPU or downloaded models? No. The classical extractor and hashing embeddings keep the pipeline working with no models and offline; quality improves once Ollama and the models are present.
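Hashing embeddings need no model at all: each token is hashed into a fixed-size vector. The sketch below shows the general feature-hashing technique in pure Python. The dimension, the tokeniser, and the use of BLAKE2b are assumptions for the example and may differ from what the project does.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Embed text by hashing each token into one of `dim` signed buckets."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.casefold()):
        raw = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(raw, "big")
        index = value % dim                      # which bucket the token lands in
        sign = 1.0 if (value >> 63) == 0 else -1.0  # top bit picks the sign
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    # Unit-normalise so a dot product between two embeddings is a cosine score.
    return [x / norm for x in vec] if norm else vec
```

Because the hash is deterministic, the same text always yields the same vector on any machine, which is what allows recall to work before any embedding model is downloaded. The trade-off is that only shared tokens register as similar, so quality is lower than with a learned model.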

Where are my files? Under ~/.memorised-them-all/projects/<project>/. export_memory copies them anywhere.

Is my existing Ollama affected? No — a running Ollama is reused and left alone. Only an instance this tool starts is stopped on idle.

What's "fast mode"? --fast (or MTA_FAST=on) skips the local LLM for a fully deterministic, much faster digest that still builds the graph and keeps semantic recall. Benchmarks show ≈25–100× (≈98× on a 5-file set, ≈26× on 12 files), and the factor scales with corpus and model. It suits large or frequently updated corpora.

What if the answer isn't in my documents? Each recall result includes top_score and a low_confidence flag (and you can set MTA_RECALL_MIN_SCORE to drop weak hits), so Claude can say "that's not in your memory" instead of inventing an answer.
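The relevance signal can be pictured as a cosine ranking with a floor. In this sketch the field names top_score and low_confidence follow the README, while the result layout, the `confident_at` threshold, and the function names are assumptions for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec, memory, k=3, min_score=0.0, confident_at=0.5):
    """Rank (text, vector) pairs by cosine score and flag weak matches.

    min_score plays the role of MTA_RECALL_MIN_SCORE (0 disables the filter);
    confident_at is a hypothetical threshold for the low_confidence flag.
    """
    scored = sorted(((cosine(query_vec, vec), text) for text, vec in memory), reverse=True)
    hits = [(s, t) for s, t in scored[:k] if min_score <= 0 or s >= min_score]
    top_score = hits[0][0] if hits else 0.0
    return {
        "hits": [text for _, text in hits],
        "top_score": round(top_score, 3),
        "low_confidence": top_score < confident_at,
    }
```

When nothing clears the floor, the result carries no hits and low_confidence is true, which gives Claude a clear cue to decline.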

Does it work with non-English documents? Yes — entity resolution is Unicode-aware (Bengali, CJK, Cyrillic, accented Latin) and OCR supports many languages via MTA_OCR_LANG (e.g. eng+ben).
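One ingredient of Unicode-aware entity resolution is normalising names before comparing them. A minimal sketch using only the standard library follows; the function name `entity_key` is an assumption, and the real resolver also uses embeddings, fuzzy matching, and acronym handling, which are not shown.

```python
import unicodedata


def entity_key(name: str) -> str:
    """Normalise an entity's surface form so equivalent spellings share one key."""
    # NFKC folds compatibility forms (such as full-width letters) and composes accents.
    text = unicodedata.normalize("NFKC", name)
    # casefold() is a stronger, language-neutral lower(); split/join collapses whitespace.
    return " ".join(text.casefold().split())
```

With this key, "Café Noir" typed with a combining accent and "café noir" typed with a precomposed é resolve to the same entity, and scripts without case, such as CJK or Bengali, pass through unchanged.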

Which file types are supported? PDF (incl. scanned), DOCX, XLSX/XLS, PPTX, HTML, EPub, Outlook .msg, RTF, CSV/TSV, JSON/XML, Markdown/text, images (PNG/JPG/…), and audio (WAV/MP3/M4A/…).

✅ Quality & testing

Exercised hard: a multi-format corpus (Office, PDF, scanned PDF, OCR images, audio), a growing regression suite, green CI on three OSes, and repeated multi-agent + GitHub Copilot reviews covering accuracy, reliability, token-safety, reusability, cross-platform, and security. The token-free guarantee is enforced (recall slices are hard-capped) and the digest never returns document contents to the model. A committed offline eval harness (eval/run_eval.py) digests a reference corpus and gates retrieval recall@k in CI, so quality regressions fail the build.
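Gating on recall@k can be sketched like this. It is not the contents of eval/run_eval.py: the function names, the case format, and the threshold are assumptions for illustration.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k retrieved results."""
    if not relevant:
        raise ValueError("relevant must not be empty")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def gate(cases: list[tuple[list[str], set[str]]], k: int, threshold: float):
    """Average recall@k over all cases and report whether it meets the threshold."""
    mean = sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in cases) / len(cases)
    return mean, mean >= threshold
```

A CI job would exit with a non-zero status when the second value is false, which is how a drop in retrieval quality fails the build.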

🙏 Acknowledgements

Built on the shoulders of excellent open-source work — see ACKNOWLEDGEMENTS.md. In particular: Microsoft MarkItDown, Ollama, Tesseract, OpenAI Whisper / faster-whisper / Apple MLX, NetworkX, Leiden / igraph, and Cytoscape.js. Design inspiration from graphify and the author's own markitdown-mcp and mnemo-mcp.

📄 License

MIT © 2026 Aninda Sundar Howlader (GRU-953).


Keywords: Claude · MCP · Model Context Protocol · MCP server · Claude Desktop · Claude Code · local RAG · knowledge graph · GraphRAG · document memory · chat with your documents · token-free · offline · privacy · Ollama · MarkItDown · OCR · PDF to Markdown · Word/Excel/PowerPoint to Markdown · vector search · mind map · Apple silicon



If this is useful, a ⭐ helps others find it.
