Local, token-free file digestion → knowledge-graph memory for Claude. Convert any attachment to Markdown and digest it into a graph + exportable memory, entirely on-device. Tuned for Apple silicon.
Memorised them All
Convert any attachment to Markdown and digest it into token-free knowledge-graph memory for Claude — 100% locally.
by GRU-953 · Claude Desktop + Claude Code · free & open-source · runs entirely on your machine
The idea. Claude tokens are expensive; your Mac's compute is free. So every heavy step — converting documents, extracting knowledge, building the graph, embedding, summarising — runs locally. Claude only ever issues a tiny tool call and gets back compact metadata or a small, relevant slice — never whole documents. Digesting a 500-page folder costs roughly zero context tokens.
Point it at a folder. It converts every attachment to Markdown, then digests it into a layered knowledge graph — a global synopsis, per-theme summaries, per-document notes, an exportable Markdown bundle, and an offline interactive mind map — and lets Claude recall from it for next to nothing.
Contents
Why it's token-free · What you get · How it works · Install · Use it · Tools · Configuration · Apple silicon · Privacy · FAQ · Acknowledgements
Why it's token-free
Most "chat with your docs" tools stream document text into the model — you pay tokens to ingest and to recall. Memorised them All never does that:
| Step | Where it runs | Tokens to Claude |
|---|---|---|
| Convert PDF/Office/image/audio → Markdown | your Mac (MarkItDown, Tesseract, Whisper, Ollama) | 0 |
| Extract entities · relations · facts | your Mac (local LLM, classical fallback) | 0 |
| Embed · resolve · build graph · summarise themes | your Mac (Ollama + NetworkX) | 0 |
| digest tool result | n/a | only counts & paths |
| recall tool result | n/a | a small, relevant slice (not documents) |
What you get
- 📄 Universal local conversion: PDF, Word, Excel, PowerPoint, HTML, EPUB, Outlook .msg, CSV/JSON/XML, images (OCR + vision captioning), audio (on-device transcription) → clean Markdown.
- 🕸️ A layered knowledge graph: entities, typed relations and atomic facts, grouped into communities (themes) with local summaries. Three memory layers: global synopsis → theme summaries → provenance-tracked facts.
- 📝 Exportable Markdown memory: memory.md, one note per source document, and graph.json. Copy them anywhere.
- 🧭 Offline interactive mind map: a single self-contained mindmap.html (Cytoscape inlined, no network) you can open in any browser.
- 🔁 Reusable, named projects: keep separate memories per body of work.
- ⚙️ Auto-installing & auto-updating: pulls the latest MarkItDown from upstream and keeps dependencies current. Starts the model server on demand and stops it after 5 minutes idle.
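The on-demand start and idle stop mentioned in the last bullet can be pictured as a small idle timer. This is a minimal sketch, not the project's actual code: the `IdleTracker` class and its method names are hypothetical, and only the 300-second default mirrors the documented five-minute idle window. The `started_by_us` flag reflects the stated rule that a server the tool did not start is left alone.

```python
import time
from typing import Callable


class IdleTracker:
    """Hypothetical helper: decides when a tool-started server may be stopped."""

    def __init__(self, idle_seconds: float = 300.0, started_by_us: bool = True,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_seconds = idle_seconds
        self.started_by_us = started_by_us  # a pre-existing server is never stopped
        self._clock = clock
        self._last_used = clock()

    def touch(self) -> None:
        """Record activity, e.g. at the start of every model request."""
        self._last_used = self._clock()

    def should_stop(self) -> bool:
        """True only for a server we started that has been idle long enough."""
        if not self.started_by_us:
            return False
        return self._clock() - self._last_used >= self.idle_seconds
```

The clock is injectable so the timing logic can be exercised without waiting five minutes.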
How it works
attachments ─► CONVERT ─►  SEGMENT ─►  EMBED ─►  EXTRACT ─►  RESOLVE ─►  GRAPH + COMMUNITIES ─►  MATERIALISE
pdf/docx/      MarkItDown  structure   nomic-    local LLM   canonical   NetworkX +              graph.json
xlsx/img/      +Tesseract  +semantic   embed-    triples +   entities    Leiden / Louvain        memory.md
audio/...      +Whisper    chunking    text      facts       (embed +    community summaries     memory/<doc>.md
               +Ollama                           (+class-    fuzzy)      (local LLM)             mindmap.html
               vision                            ical                                            vectors store
                                                 fallback)
                                                                                                 │
recall("…") ◄── embed query locally ── return ONLY a small relevant slice (themes + facts + provenance)
Everything between attachments and recall happens on your machine. The local LLM step has a dependency-free classical fallback, so a digest always succeeds — even offline, even before any model is downloaded — and gets sharper once the models are present.
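The fallback behaviour described above amounts to "try the local LLM, otherwise use a dependency-free extractor". The sketch below illustrates that control flow only. The function names, the triple format, and the toy regular expression are all assumptions made for illustration; the project's real classical extractor is not documented here and is certainly more capable.

```python
import re
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Deliberately naive pattern: "<Capitalised name> <verb> <Capitalised name>".
_PATTERN = re.compile(
    r"\b([A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*) (leads|funds|joins|owns) "
    r"([A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*)"
)


def classical_extract(text: str) -> List[Triple]:
    """Dependency-free stand-in extractor based on a fixed regular expression."""
    return [(m.group(1), m.group(2), m.group(3)) for m in _PATTERN.finditer(text)]


def extract(text: str,
            llm_extract: Optional[Callable[[str], List[Triple]]] = None) -> List[Triple]:
    """Prefer the local LLM when one is supplied; fall back if it is absent or fails."""
    if llm_extract is not None:
        try:
            return llm_extract(text)
        except Exception:
            pass  # model missing, server down, malformed output, ...
    return classical_extract(text)
```

Because the fallback has no external dependencies, `extract` returns a result even when no model callable is available, which is the property the paragraph above relies on.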
Install
Requirements: macOS (Apple silicon recommended) · Homebrew · Python ≥ 3.10. The installer fetches everything else for you.
Claude Desktop (one click)
- Download memorised-them-all.mcpb from the latest release.
- Double-click it (or Settings → Extensions → Install).
- On first launch it bootstraps the local stack automatically.
Claude Code (CLI)
/plugin marketplace add GRU-953/memorised-them-all
/plugin install memorised-them-all
As a plain CLI / from PyPI
pip install memorised-them-all # installs the `mta` command
mta status # check the local stack
mta digest ~/Documents/research # build memory from a folder
From source / Homebrew
git clone https://github.com/GRU-953/memorised-them-all
cd memorised-them-all && ./install.sh # idempotent: brew apps + venv + models
# or via the tap
brew install GRU-953/memorised-them-all/mta
The installer adds (only what's missing): Ollama, Tesseract (+ all OCR
languages), ffmpeg, a Python virtualenv with the latest MarkItDown from
upstream, and the local models qwen2.5:7b, nomic-embed-text, moondream
(~6–8 GB, configurable — pulled in the background).
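The "only what's missing" behaviour can be pictured as a PATH check before anything is installed. The real installer is a shell script, so the Python below is only an illustration: `missing_binaries` is a hypothetical helper, and the assumption is that the three Homebrew tools expose executables named `ollama`, `tesseract`, and `ffmpeg`.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names assumed for the Homebrew-installed tools listed above.
REQUIRED_BINARIES = ("ollama", "tesseract", "ffmpeg")


def missing_binaries(names: Iterable[str] = REQUIRED_BINARIES,
                     which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return the names that cannot be found on PATH, preserving input order."""
    return [name for name in names if which(name) is None]
```

Calling `missing_binaries()` with no arguments checks the current PATH; an empty list means there is nothing left to install, which is what makes a rerun idempotent.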
Use it
In Claude (Desktop or Code), just ask:
"Memorise everything in ~/Documents/grant-proposals."
"What did my documents say about the Aurora timeline?"
"Open the mind map." · "Export the memory to ~/Desktop/aurora-memory."
Or use the slash commands: /memorise, /recall, /memory-map,
/memory-status, /export-memory.
From the terminal:
mta digest ~/Documents/research --project aurora
mta recall "who leads the project and who are the partners?" --project aurora
mta overview --project aurora
mta mindmap --project aurora --open
mta export ~/Desktop/aurora-memory --project aurora
mta update # pull the latest MarkItDown + dependencies
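For scripting, the terminal commands above can be driven from Python. This is a minimal sketch: `mta_argv` and `run_mta` are hypothetical wrappers, not part of the package, and they only use the subcommands and the `--project` flag shown in this README.

```python
import subprocess
from typing import List, Optional

# Subcommands that appear in the terminal examples of this README.
KNOWN_COMMANDS = {"status", "digest", "recall", "overview", "mindmap", "export", "update"}


def mta_argv(command: str, *args: str, project: Optional[str] = None) -> List[str]:
    """Build the argument list for one `mta` invocation."""
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown mta command: {command}")
    argv = ["mta", command, *args]
    if project is not None:
        argv += ["--project", project]
    return argv


def run_mta(command: str, *args: str, project: Optional[str] = None) -> str:
    """Run `mta` and return its standard output; raises if the command fails."""
    result = subprocess.run(mta_argv(command, *args, project=project),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Note that `subprocess` does not expand `~` when given an argument list, so paths should be passed through `os.path.expanduser` first.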
MCP tools
| Tool | What it does | Returns |
|---|---|---|
| digest(paths, project?, reset?) | convert + digest files/dirs/globs (accumulates into the project; reset=true starts fresh) | counts, paths, graph stats |
| recall(query, project?, k?) | answer from memory | a small, citable slice |
| memory_overview(project?) | synopsis + themes | compact overview |
| export_memory(dest, project?) | export portable Markdown | files written |
| list_digestible(directory) | list convertible files | paths + sizes |
| memory_status() | local stack health | versions, models, projects |
| open_mindmap(project?) | offline mind map | file path |
Every result is metadata or a small slice — document contents never return to the conversation.
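To make "metadata or a small slice" concrete, the sketch below shows one possible shape for a digest result and one way to cap a recall slice. Everything here is an assumption for illustration: the `DigestResult` fields, the `cap_slice` helper, and both budgets are hypothetical and do not describe the tools' actual payloads.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DigestResult:
    """Hypothetical digest payload: counts and paths, never document text."""
    files_converted: int
    chunks: int
    graph_stats: Dict[str, int]
    output_paths: List[str] = field(default_factory=list)


def cap_slice(facts: List[str], max_items: int = 8, max_chars: int = 2000) -> List[str]:
    """Keep leading facts until either the item or the character budget is spent."""
    kept: List[str] = []
    used = 0
    for fact in facts[:max_items]:
        if used + len(fact) > max_chars:
            break
        kept.append(fact)
        used += len(fact)
    return kept
```

A fixed cap like this bounds the size of a recall response regardless of how large the underlying memory grows.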
Configuration
All optional; sensible defaults. Set via environment (CLI) or the extension settings (Desktop).
| Variable | Default | Meaning |
|---|---|---|
| MTA_HOME | ~/.memorised-them-all | where memories are stored |
| MTA_EXTRACT_MODEL | qwen2.5:7b | local LLM for extraction & summaries |
| MTA_EMBED_MODEL | nomic-embed-text | local embedding model |
| MTA_VISION_MODEL | moondream | image captioning |
| MTA_OCR_LANG | eng | Tesseract languages, e.g. eng+ben |
| MTA_WHISPER_MODEL | base | on-device transcription model |
| MTA_IDLE | 300 | seconds of idle before Ollama is stopped |
| MTA_WORKERS | 0 (auto) | parallel conversion workers |
| MTA_EXTRACT_WORKERS | 0 (auto) | parallel extraction workers (memory-aware: 1–3 by RAM) |
| MTA_MAX_CHUNKS | 1500 | safety cap on chunks per digest (truncation is reported) |
| MTA_MAX_FILE_MB | 200 | skip files larger than this before reading (0 disables) |
| MTA_COMMUNITY_ALGO | auto | leiden · louvain · greedy |
| MTA_AUTO_UPDATE | on | auto-update MarkItDown & dependencies |
| MTA_FAST | off | fast mode: skip the LLM (classical extraction, deterministic, keeps embeddings) |
| MTA_NO_OLLAMA | unset | hard offline switch (classical + hashing) |
Accuracy vs speed. The default path uses the local LLM for the highest extraction accuracy. Fast mode (MTA_FAST=on, mta digest --fast, or the fast=true tool arg) skips the LLM for a fully deterministic, ~100× faster digest that still builds the graph and keeps semantic recall, which makes it ideal for large or frequently updated corpora.
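The defaults in the table above can be pictured as a small environment reader. This is a sketch under stated assumptions, not the project's loader: `load_settings` is a hypothetical name, and the README does not say which spellings count as "on" or how MTA_NO_OLLAMA is interpreted, so both rules below are guesses.

```python
import os
from pathlib import Path
from typing import Any, Dict, Mapping

_TRUTHY = {"1", "on", "true", "yes"}


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Read a few MTA_* variables, applying the defaults from the table above."""
    home = Path(env.get("MTA_HOME", "~/.memorised-them-all")).expanduser()
    return {
        "home": home,
        "extract_model": env.get("MTA_EXTRACT_MODEL", "qwen2.5:7b"),
        "embed_model": env.get("MTA_EMBED_MODEL", "nomic-embed-text"),
        "idle_seconds": int(env.get("MTA_IDLE", "300")),
        "max_chunks": int(env.get("MTA_MAX_CHUNKS", "1500")),
        "max_file_mb": int(env.get("MTA_MAX_FILE_MB", "200")),
        # Assumption: "on", "true", "1" and "yes" are all accepted as enabled.
        "fast": env.get("MTA_FAST", "off").strip().lower() in _TRUTHY,
        # Assumption: the hard offline switch is enabled by any non-empty value.
        "no_ollama": bool(env.get("MTA_NO_OLLAMA", "")),
    }
```

Passing a plain dictionary instead of `os.environ` makes it easy to check how a given combination of variables would be read.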
Apple silicon first
- Conversion fans out across performance cores (hw.perflevel0.physicalcpu), with each worker's native math libraries pinned to one thread to avoid oversubscription on the unified-memory architecture.
- GPU-accelerated Whisper via Apple MLX (mlx-whisper), with a faster-whisper CPU fallback.
- Worker count is unified-memory-aware so it won't thrash a 16 GB Mac.
It runs on Intel Macs and Linux too — those paths simply use portable defaults.
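The sizing ideas in this section can be sketched in a few lines. This is an illustration, not the project's implementation: the function names are hypothetical, the RAM thresholds are invented to show the documented 1–3 worker range, and the thread-limit variables are the conventional ones for OpenMP, OpenBLAS, MKL, and Apple's Accelerate rather than a confirmed list.

```python
import os
import subprocess


def performance_cores() -> int:
    """Performance-core count on Apple silicon, else a portable CPU count."""
    try:
        out = subprocess.run(
            ["sysctl", "-n", "hw.perflevel0.physicalcpu"],
            capture_output=True, text=True, check=True, timeout=5,
        ).stdout.strip()
        return max(1, int(out))
    except (OSError, subprocess.SubprocessError, ValueError):
        # Not macOS, not Apple silicon, or sysctl unavailable.
        return max(1, os.cpu_count() or 1)


def extraction_workers(ram_gb: float) -> int:
    """Map unified-memory size to 1-3 workers (thresholds are illustrative)."""
    if ram_gb < 16:
        return 1
    if ram_gb < 32:
        return 2
    return 3


def single_thread_env() -> dict:
    """Environment for a worker whose native math libraries use one thread."""
    env = dict(os.environ)
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS",
                "MKL_NUM_THREADS", "VECLIB_MAXIMUM_THREADS"):
        env[var] = "1"
    return env
```

Pinning each worker to one native thread keeps the total thread count close to the number of workers, which is the oversubscription concern the first bullet describes.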
Privacy
100% local. No cloud APIs, no telemetry, no API keys. Your documents, the graph,
the embeddings, and the memory files never leave your machine. The only network
access is (a) downloading open-source dependencies/models on install and
(b) the once-a-day dependency update check (disable with MTA_AUTO_UPDATE=off).
FAQ
Does it really cost no tokens? Conversion and digestion cost zero Claude
tokens. recall returns a small slice (a handful of summaries/facts), so answers
are cheap — far cheaper than pasting documents into the chat.
What if I have no GPU / no models / I'm offline? It still works. The classical extractor and hashing embeddings keep the pipeline running; quality improves once Ollama and the models are available.
Is my existing Ollama affected? No. If Ollama is already running, it's reused and left alone. Only an instance this tool starts is stopped on idle.
Where are my files? Under MTA_HOME/projects/<project>/ — graph.json,
memory.md, memory/, mindmap.html. export_memory copies them anywhere.
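The "hashing embeddings" mentioned in the offline answer above are not specified further. One common dependency-free technique that fits the description is signed feature hashing, sketched below. The function name, the 256-dimension default, and the choice of BLAKE2b are assumptions, so treat this as an example of the general idea rather than the tool's actual fallback.

```python
import hashlib
import math
import re
from typing import List


def hashing_embed(text: str, dim: int = 256) -> List[float]:
    """Map text to a unit-length vector by hashing each token into a bucket."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        index = value % dim
        sign = 1.0 if (value >> 63) & 1 else -1.0  # signed hashing reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: List[float], b: List[float]) -> float:
    """Dot product; equals cosine similarity because vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))
```

Such vectors capture only word overlap, not meaning, which is consistent with the note that quality improves once a real embedding model is available.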
Modes & performance
Two digest modes — the default favours accuracy & consistency, fast mode favours speed & determinism:
| | Default (accurate) | Fast (--fast / MTA_FAST=on) |
|---|---|---|
| Extraction | local LLM (qwen2.5) | classical (deterministic) |
| Theme summaries | local LLM | deterministic fact-join |
| Embeddings / recall | local (nomic) | local (nomic) |
| Reproducible | per-model | byte-identical across runs |
| Relative speed | baseline | ~100× faster |
| Best for | highest fidelity | large or frequently-refreshed corpora |
Both are token-free and fully local. Digestion is incremental — pointing digest at another folder extends the same project; reset=true starts fresh. Degenerate/repetitive content is de-duplicated and a reported MTA_MAX_CHUNKS cap keeps even pathological inputs bounded.
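The de-duplication and chunk cap described above can be sketched as follows. This is a minimal illustration under assumptions: `dedupe_and_cap` is a hypothetical helper, exact-match hashing after whitespace and case normalisation is only one possible definition of "repetitive", and just the default of 1500 comes from the documented MTA_MAX_CHUNKS setting.

```python
import hashlib
from typing import Iterable, List, Tuple


def dedupe_and_cap(chunks: Iterable[str], max_chunks: int = 1500) -> Tuple[List[str], int]:
    """Drop repeated chunks, then cap the total; returns (kept, dropped_by_cap)."""
    seen = set()
    unique: List[str] = []
    for chunk in chunks:
        # Normalise case and whitespace so trivially different repeats collapse.
        key = hashlib.sha256(" ".join(chunk.lower().split()).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    kept = unique[:max_chunks]
    return kept, len(unique) - len(kept)
```

Returning the number of chunks dropped by the cap is what allows truncation to be reported rather than happening silently.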
Platform support
Apple M-series is the primary, most-optimised target. Other platforms are supported with portable fallbacks:
| Platform | Status | Notes |
|---|---|---|
| macOS (Apple silicon) | ✅ optimised | performance-core pool, MLX GPU Whisper, unified-memory-aware |
| macOS (Intel) | ✅ supported | physical-core sizing via psutil, CPU Whisper |
| Linux | ✅ supported | apt/dnf/pacman install paths, CUDA Whisper if a GPU is present |
| Windows | 🧪 experimental | pip install memorised-them-all + run mta serve; psutil process management & PATH healing |
CI runs the offline test suite across Ubuntu, macOS, and Windows on Python 3.10 & 3.12.
Generated files & reuse
Each project under MTA_HOME/projects/<name>/ is self-contained and portable:
| File | What it is |
|---|---|
| graph.json | source of truth: nodes, edges, communities, layered summaries, stats (version-stamped; stores basenames, no absolute paths) |
| memory.md | compact, layered digest for reading / pasting |
| memory/<doc>.md | one note per source document |
| mindmap.html | offline interactive graph (Cytoscape inlined) |
| vectors.npz + vectors.json | local embeddings for recall |
A memory built once can be copied to another machine and reused read-only — recall and the mind map work with no rebuild. export_memory bundles all of the above (including the vector store) into a folder you choose.
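Read-only recall over a copied memory comes down to comparing a query vector with the stored vectors. The sketch below shows that ranking step with plain Python lists. It is an assumption-laden illustration: `top_k` is a hypothetical function, the in-memory dictionary stands in for the vector store, and the on-disk layout of vectors.npz and vectors.json is not described in this README.

```python
import math
from typing import Dict, List, Sequence, Tuple


def top_k(query: Sequence[float], store: Dict[str, Sequence[float]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Rank stored items by cosine similarity to the query vector."""
    def cosine(a: Sequence[float], b: Sequence[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    scored = [(item_id, cosine(query, vec)) for item_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Only the identifiers and scores of the best matches are returned, so the caller can look up a handful of facts instead of loading whole documents.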
Quality & testing
This project is exercised hard: a multi-format corpus (Office, PDF, scanned PDF, OCR images, audio), 14 regression tests (determinism, token-safety, fact attribution, accumulation, OCR, lifecycle, cross-platform), green CI on three OSes, and a multi-agent review pass covering accuracy, reliability, token-safety, reusability, cross-platform, and security. The token-free guarantee is enforced (recall slices are hard-capped) and the digest never returns document contents to the model.
Acknowledgements
Built on the shoulders of excellent open-source work — see ACKNOWLEDGEMENTS.md. In particular: Microsoft MarkItDown, Ollama, Tesseract, OpenAI Whisper / faster-whisper / Apple MLX, NetworkX, Leiden / igraph, and Cytoscape.js. Design inspiration from graphify and the author's own markitdown-mcp and mnemo-mcp.
License