Local, token-free file digestion → knowledge-graph memory for Claude. Convert any attachment to Markdown and digest it into a graph + exportable memory, entirely on-device. Tuned for Apple silicon.
Memorised them All
Local, token-free document memory for Claude — turn any folder of files into a knowledge graph you can recall, for ~0 context tokens.
An MCP server & plugin for Claude Desktop and Claude Code that converts PDFs, Office docs, images, and audio to Markdown on your machine, then digests them into a searchable knowledge graph + mind map — privately, with no cloud and no API keys.
Quickstart · Why token-free · Features · How it works · Tools · Use cases · Comparison · Config · Platforms · Privacy · FAQ
100% local · free & open-source · auto-installing · Apple-silicon optimised · by GRU-953
The idea in one line: Claude tokens are expensive; your computer's compute is free. So every heavy step — converting documents, extracting knowledge, embedding, summarising — runs locally, and Claude only ever gets back a tiny tool result. Digesting a 500-page folder costs roughly zero context tokens.
Point Memorised them All at a folder. It converts every attachment to Markdown locally, then builds a layered knowledge graph — a global synopsis, per-theme summaries, per-document notes, an exportable Markdown bundle, and an offline interactive mind map — and lets Claude recall from it for next to nothing.
"Memorise everything in ~/Documents/research."
"What did my documents say about the Q3 budget?"
"Open the mind map."
🚀 Quickstart
| Claude Desktop | Claude Code |
|---|---|
| Download | `/plugin marketplace add GRU-953/memorised-them-all`<br>`/plugin install memorised-them-all` |

| pip (any OS) | Homebrew (macOS/Linux) |
|---|---|
| `pip install memorised-them-all`<br>`mta status`<br>`mta digest ~/Documents/research` | `brew install GRU-953/memorised-them-all/mta` |
Requirements: Python ≥ 3.10. The installer fetches everything else (Ollama, Tesseract, ffmpeg, the latest MarkItDown, and local models) — see Platform support.
💡 Why token-free
Most "chat with your documents" tools stream document text into the model — you pay tokens to ingest and to recall. Memorised them All never does that:
| Step | Where it runs | Tokens to Claude |
|---|---|---|
| Convert PDF / Office / image / audio → Markdown | your machine (MarkItDown · Tesseract · Whisper · Ollama) | 0 |
| Extract entities · relations · facts | your machine (local LLM, classical fallback) | 0 |
| Embed · resolve · build graph · summarise | your machine (Ollama · NetworkX) | 0 |
| `digest` result | - | counts & paths only (~140 tokens) |
| `recall` result | - | a small, citable slice, never the documents |
Tool results are hard-capped in size, so the guarantee holds even on the high-accuracy path.
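A minimal sketch of what a hard cap on a tool result could look like. This is not the project's actual code: `cap_result`, the `max_chars` budget, and the use of characters as a stand-in for tokens are all assumptions made for illustration.

```python
def cap_result(items, max_chars=600):
    """Keep whole items, in order, until the character budget is spent.

    Hypothetical helper: characters are used as a rough proxy for tokens.
    """
    kept, used = [], 0
    for text in items:
        if used + len(text) > max_chars:
            break  # never return a partial item, and never exceed the budget
        kept.append(text)
        used += len(text)
    # Report how much was left out so the caller knows the slice is partial.
    return {"items": kept, "omitted": len(items) - len(kept)}


facts = ["x" * 100 for _ in range(10)]
result = cap_result(facts, max_chars=350)
print(len(result["items"]), "kept,", result["omitted"], "omitted")
```

Because the budget is applied to the result itself, the size of the response does not depend on how large the digested corpus is.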
✨ Features
- 📄 Universal local conversion: PDF, Word, Excel, PowerPoint, HTML, EPub, Outlook `.msg`, CSV/JSON/XML, images (OCR + vision captioning), and audio (on-device transcription) → clean Markdown. Scanned PDFs are OCR'd; 163 OCR languages.
- 🕸️ Layered knowledge graph (GraphRAG-style): entities, typed relations and atomic facts, grouped into themes by community detection, with a global synopsis and per-theme summaries, all built by local models.
- 🧭 Offline interactive mind map: a single self-contained `mindmap.html` (Cytoscape inlined, zero network).
- 📝 Exportable, portable memory: `graph.json`, `memory.md`, and per-document notes you can copy to any machine and reuse.
- ⚡ Two modes: high-accuracy (local LLM) and fast mode (`--fast`), which is deterministic and ~100× faster for large or frequently-refreshed corpora.
- 🔁 Reusable named projects: accumulate many folders into one memory; `forget` to delete one.
- 🍎 Apple-silicon first: performance-core parallelism, GPU Whisper via MLX, unified-memory-aware concurrency. Runs on Intel macOS, Linux, and Windows too.
- ⚙️ Auto-installing & auto-updating: pulls the latest MarkItDown from upstream; starts the model server on demand and stops it after 5 minutes idle.
- 🔒 Private by design: no cloud, no API keys, no telemetry. Your files never leave your computer.
🧠 How it works
attachments ─► CONVERT ─► SEGMENT ─► EMBED ─► EXTRACT ─► RESOLVE ─► GRAPH + COMMUNITIES ─► MATERIALISE

| Stage | What it uses / produces |
|---|---|
| attachments | pdf/docx/xlsx/img/audio/... |
| CONVERT | MarkItDown + Tesseract + Whisper + Ollama vision |
| SEGMENT | structure + semantic chunking |
| EMBED | nomic-embed-text |
| EXTRACT | local LLM triples + facts (+ classical fallback) |
| RESOLVE | canonical entities (embed + fuzzy + acronym) |
| GRAPH + COMMUNITIES | NetworkX + Leiden / Louvain; community summaries (local LLM) |
| MATERIALISE | `graph.json`, `memory.md`, `memory/<doc>.md`, `mindmap.html`, vectors store |

recall("…") ◄── embed query locally ── return ONLY a small, citable slice (themes + facts + provenance)
Everything between attachments and recall happens on your machine. The local-LLM step has a dependency-free classical fallback, so a digest always succeeds — even offline, even before any model is downloaded — and gets sharper once models are present.
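The RESOLVE stage is described as combining embeddings, fuzzy matching, and acronym matching to merge mentions into canonical entities. The sketch below illustrates only the fuzzy and acronym parts, using the standard library's `difflib`. The function names, the 0.88 threshold, and the greedy first-match strategy are assumptions, not the project's implementation.

```python
from difflib import SequenceMatcher


def is_acronym_of(short, long):
    """True if `short` equals the initials of the multi-word name `long`."""
    words = long.replace("-", " ").split()
    return len(words) > 1 and short.upper() == "".join(w[0] for w in words).upper()


def same_entity(a, b, threshold=0.88):
    """Exact, acronym, or fuzzy match between two mentions (threshold is a guess)."""
    x, y = a.strip().lower(), b.strip().lower()
    if x == y or is_acronym_of(a.strip(), b) or is_acronym_of(b.strip(), a):
        return True
    return SequenceMatcher(None, x, y).ratio() >= threshold


def resolve(mentions, threshold=0.88):
    """Greedily map each mention to the first earlier canonical name it matches."""
    canonical, mapping = [], {}
    for mention in mentions:
        for name in canonical:
            if same_entity(mention, name, threshold):
                mapping[mention] = name
                break
        else:
            canonical.append(mention)  # nothing matched: this is a new entity
            mapping[mention] = mention
    return mapping


print(resolve(["World Health Organization", "WHO", "World Health Organisation", "Ollama"]))
```

A greedy pass like this depends on the order of mentions; an embedding-based check, as the pipeline description suggests, would catch paraphrases that string similarity misses.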
🛠 Tools
Eight token-free MCP tools (plus the mta CLI). Every result is metadata or a small slice; document contents never return to the conversation.
| Tool | What it does |
|---|---|
| `digest(paths, project?, reset?, fast?)` | convert + digest files/dirs/globs; accumulates into the project (`reset` starts fresh, `fast` skips the LLM) |
| `recall(query, project?, k?)` | answer from memory: a small, citable slice |
| `memory_overview(project?)` | synopsis + themes |
| `export_memory(dest, project?)` | export portable Markdown + graph + mind map |
| `list_digestible(directory)` | list convertible files (paths/sizes only) |
| `memory_status()` | local stack health (Ollama, models, Tesseract, MarkItDown version) |
| `open_mindmap(project?)` | path to the offline interactive mind map |
| `forget(project?)` | delete a project's memory |
CLI: mta digest <paths> [--fast] [--reset] · mta recall "<q>" · mta overview · mta export <dir> · mta mindmap --open · mta forget · mta status · mta update. In Claude Code, the slash commands /memorise, /recall, /memory-map, /memory-status, /export-memory are also available.
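For readers curious what a tool invocation looks like on the wire, here is a rough sketch of the JSON-RPC 2.0 `tools/call` request that an MCP client sends. The tool and argument names come from the table above; the argument value types (a list of path strings, a boolean `fast`) and the project name `research` are assumptions for illustration.

```python
import json


def tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical argument values; only the parameter names are from the README.
request = tool_call(
    "digest",
    {"paths": ["~/Documents/research"], "project": "research", "fast": True},
)
print(json.dumps(request, indent=2))
```

In practice Claude Desktop or Claude Code builds this message for you; the sketch only shows that the request carries paths and flags, not document contents.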
🎯 Use cases
- Private RAG / "chat with your documents" — locally, with no cloud and no per-query token cost.
- Research & literature memory — digest a folder of papers/PDFs and ask grounded questions.
- Knowledge base for an agent — give Claude durable, reusable memory across sessions.
- Meeting & audio notes — transcribe recordings on-device and fold them into the graph.
- Scanned documents & receipts — OCR image-only PDFs and images into searchable memory.
- Visual exploration — open the offline mind map to see how entities and themes connect.
⚖️ Comparison
| | Memorised them All | Cloud "chat with docs" / hosted RAG | Stock markitdown-mcp |
|---|---|---|---|
| Runs fully locally | ✅ | ❌ | ✅ (conversion only) |
| Context-token cost to ingest and recall | ~0 | high | high (returns text) |
| Knowledge graph + themes | ✅ | sometimes | ❌ |
| Offline interactive mind map | ✅ | ❌ | ❌ |
| Works offline / no API keys | ✅ | ❌ | ✅ |
| Reusable, exportable memory files | ✅ | varies | ❌ |
| Free & open-source (MIT) | ✅ | varies | ✅ |
⚙️ Configuration
All optional, sensible defaults; set via environment (CLI) or the extension settings (Desktop).
| Variable | Default | Meaning |
|---|---|---|
| `MTA_HOME` | `~/.memorised-them-all` | where memories are stored |
| `MTA_FAST` | off | fast mode: skip the LLM (deterministic, ~100× faster) |
| `MTA_EXTRACT_MODEL` | `qwen2.5:7b` | local LLM for extraction & summaries |
| `MTA_EMBED_MODEL` | `nomic-embed-text` | local embedding model |
| `MTA_VISION_MODEL` | `moondream` | image captioning |
| `MTA_OCR_LANG` | `eng` | Tesseract languages, e.g. `eng+ben` |
| `MTA_WHISPER_MODEL` | `base` | on-device transcription model |
| `MTA_IDLE` | `300` | seconds idle before Ollama is stopped |
| `MTA_WORKERS` / `MTA_EXTRACT_WORKERS` | `0` (auto) | parallel conversion / extraction workers |
| `MTA_MAX_CHUNKS` / `MTA_MAX_FILE_MB` | `1500` / `200` | workload & input-size caps (reported) |
| `MTA_AUTO_UPDATE` | on | auto-update MarkItDown & dependencies |
| `MTA_NO_OLLAMA` | unset | hard offline switch (classical + hashing) |
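As an illustration of how environment-based settings with defaults are typically read, here is a small sketch covering a few of the variables above. The variable names and default values are taken from the table; the `Settings` class, the `load_settings` function, and the set of accepted "on" spellings are assumptions, not the project's code.

```python
import os
from dataclasses import dataclass


def _flag(value):
    """Interpret on/off style values; the accepted spellings are an assumption."""
    return value.strip().lower() in {"1", "on", "true", "yes"}


@dataclass(frozen=True)
class Settings:
    home: str
    fast: bool
    extract_model: str
    embed_model: str
    ocr_lang: str
    idle_seconds: int


def load_settings(env=None):
    """Read a subset of the MTA_* variables, falling back to documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        home=os.path.expanduser(env.get("MTA_HOME", "~/.memorised-them-all")),
        fast=_flag(env.get("MTA_FAST", "off")),
        extract_model=env.get("MTA_EXTRACT_MODEL", "qwen2.5:7b"),
        embed_model=env.get("MTA_EMBED_MODEL", "nomic-embed-text"),
        ocr_lang=env.get("MTA_OCR_LANG", "eng"),
        idle_seconds=int(env.get("MTA_IDLE", "300")),
    )


print(load_settings({"MTA_FAST": "on", "MTA_OCR_LANG": "eng+ben"}))
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.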
💻 Platform support
Apple M-series is the primary, most-optimised target; other platforms use portable fallbacks.
| Platform | Status | Notes |
|---|---|---|
| macOS (Apple silicon) | ✅ optimised | performance-core pool, MLX GPU Whisper, unified-memory-aware |
| macOS (Intel) | ✅ supported | physical-core sizing via psutil, CPU Whisper |
| Linux | ✅ supported | apt/dnf/pacman install paths, CUDA Whisper if a GPU is present |
| Windows | 🧪 experimental | pip install memorised-them-all + mta serve (or python launch.py). The .mcpb bundle is macOS/Linux only. |
CI runs the test suite across Ubuntu, macOS, and Windows on Python 3.10 & 3.12.
📦 Generated files & reuse
Each project under MTA_HOME/projects/<name>/ is self-contained and portable:
| File | What it is |
|---|---|
| `graph.json` | source of truth: nodes, edges, communities, layered summaries (version-stamped, no absolute paths) |
| `memory.md` | compact, layered digest for reading / pasting |
| `memory/<doc>.md` | one note per source document |
| `mindmap.html` | offline interactive graph (Cytoscape inlined) |
| `vectors.npz` + `vectors.json` | local embeddings for recall |
A memory built once can be copied to another machine and reused read-only. export_memory bundles all of the above (including the vector store) into a folder you choose.
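After copying a project folder to another machine, a quick completeness check can be useful. The sketch below only verifies that the files listed above are present; it makes no assumption about their internal schema. The helper name `missing_files` and the example project path are hypothetical.

```python
from pathlib import Path

# File names as listed in the table above.
EXPECTED = ["graph.json", "memory.md", "mindmap.html", "vectors.npz", "vectors.json"]


def missing_files(project_dir):
    """Return the expected memory files that are absent from a project folder."""
    root = Path(project_dir)
    missing = [name for name in EXPECTED if not (root / name).is_file()]
    if not (root / "memory").is_dir():
        missing.append("memory/")  # per-document notes live in this folder
    return missing


# Hypothetical project location under the default MTA_HOME.
project = Path.home() / ".memorised-them-all" / "projects" / "research"
print(missing_files(project) or "bundle looks complete")
```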
🔒 Privacy & security
100% local — no cloud APIs, no telemetry, no API keys. Your documents, the graph, the embeddings, and the memory files never leave your machine. The only network access is (a) downloading open-source dependencies/models on install and (b) a throttled once-a-day dependency-update check (disable with MTA_AUTO_UPDATE=off).
Hardened for processing untrusted files: argv-only subprocesses (no curl | sh), path-safe and collision-free output names, per-file size + decompression-bomb caps, prompt-injection data-delimiting, allow_pickle=False, and hard-capped recall results. See the full threat model in the docs.
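To make "path-safe and collision-free output names" concrete, here is one common way to derive a note name from an untrusted source path: keep only a sanitised stem and append a short hash of the full path. This is a sketch under assumptions (the function name, the 60-character limit, and the 8-hex-digit hash are invented here), and a truncated hash makes collisions very unlikely rather than impossible.

```python
import hashlib
import re
from pathlib import PurePosixPath


def safe_output_name(source_path, suffix=".md"):
    """Derive a flat, filesystem-safe file name from an untrusted source path."""
    # Drop all directory components so "../" cannot escape the output folder.
    stem = PurePosixPath(source_path.replace("\\", "/")).stem
    # Replace anything outside a conservative character set.
    slug = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_") or "document"
    # Hash the full original path so same-named files in different folders differ.
    digest = hashlib.sha256(source_path.encode("utf-8")).hexdigest()[:8]
    return f"{slug[:60]}-{digest}{suffix}"


print(safe_output_name("../../etc/passwd"))
print(safe_output_name("q3/report.pdf"), safe_output_name("q4/report.pdf"))
```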
❓ FAQ
Does it really cost no tokens? Conversion and digestion cost zero Claude tokens. recall returns a small, hard-capped slice (a few summaries/facts), so answers are far cheaper than pasting documents into chat.
Do I need a GPU or downloaded models? No. The classical extractor and hashing embeddings keep the pipeline working with no models and offline; quality improves once Ollama and the models are present.
Where are my files? Under ~/.memorised-them-all/projects/<project>/. export_memory copies them anywhere.
Is my existing Ollama affected? No — a running Ollama is reused and left alone. Only an instance this tool starts is stopped on idle.
What's "fast mode"? --fast (or MTA_FAST=on) skips the local LLM for a fully deterministic, ~100× faster digest that still builds the graph and keeps semantic recall — ideal for large or frequently-updated corpora.
Which file types are supported? PDF (incl. scanned), DOCX, XLSX/XLS, PPTX, HTML, EPub, Outlook .msg, RTF, CSV/TSV, JSON/XML, Markdown/text, images (PNG/JPG/…), and audio (WAV/MP3/M4A/…).
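The "hashing embeddings" mentioned in the answer about running without models can be pictured as feature hashing: each token is hashed into a fixed-size vector, so no model download is needed. The sketch below is a generic illustration of that technique, not the project's implementation; the dimension, tokeniser, and choice of hash function are assumptions.

```python
import hashlib
import math
import re


def hash_embed(text, dim=512):
    """Feature-hashing bag of words, L2-normalised. Deterministic and model-free."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        h = int.from_bytes(hashlib.blake2b(token.encode(), digest_size=8).digest(), "big")
        sign = 1.0 if (h >> 63) == 0 else -1.0  # signed hashing reduces collision bias
        vec[h % dim] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two already-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))


query = hash_embed("Q3 budget")
docs = ["The Q3 budget was approved", "Whisper transcribes audio on-device"]
ranked = sorted(docs, key=lambda d: cosine(query, hash_embed(d)), reverse=True)
print(ranked[0])
```

Vectors like these only capture shared vocabulary, which is consistent with the note that quality improves once Ollama and the embedding model are present.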
✅ Quality & testing
Exercised hard: a multi-format corpus (Office, PDF, scanned PDF, OCR images, audio), a growing regression suite, green CI on three OSes, and repeated multi-agent + GitHub Copilot reviews covering accuracy, reliability, token-safety, reusability, cross-platform, and security. The token-free guarantee is enforced (recall slices are hard-capped) and the digest never returns document contents to the model.
🙏 Acknowledgements
Built on the shoulders of excellent open-source work — see ACKNOWLEDGEMENTS.md. In particular: Microsoft MarkItDown, Ollama, Tesseract, OpenAI Whisper / faster-whisper / Apple MLX, NetworkX, Leiden / igraph, and Cytoscape.js. Design inspiration from graphify and the author's own markitdown-mcp and mnemo-mcp.
📄 License
MIT © 2026 Aninda Sundar Howlader (GRU-953).
Keywords: Claude · MCP · Model Context Protocol · MCP server · Claude Desktop · Claude Code · local RAG · knowledge graph · GraphRAG · document memory · chat with your documents · token-free · offline · privacy · Ollama · MarkItDown · OCR · PDF to Markdown · Word/Excel/PowerPoint to Markdown · vector search · mind map · Apple silicon
If this is useful, a ⭐ helps others find it.