Local, token-free file digestion → knowledge-graph memory for Claude. Convert any attachment to Markdown and digest it into a graph + exportable memory, entirely on-device. Tuned for Apple silicon.
Memorised them All
Give Claude a private memory of your files — without paying for it in tokens.
Point it at a folder of PDFs, Word/Excel files, images, or audio. It reads and remembers them entirely on your own computer, so you can just ask Claude about them later — no copy-pasting, no uploads, no API keys, no surprise token bills.
🧠 What is this?
Imagine you could hand Claude a filing cabinet of your documents and say "remember all of this." Later you just ask questions, and Claude answers from what it remembers — citing which document each fact came from.
That's Memorised them All. It's a small add-on (an MCP server) for Claude Desktop and Claude Code that:
- Reads your files — PDFs, Word/Excel/PowerPoint, web pages, images (with OCR), even audio — and converts them to clean text on your computer.
- Builds a memory — a searchable map of the people, topics, and facts inside them (a "knowledge graph"), plus a tidy summary and an interactive mind map.
- Lets you ask — Claude recalls just the relevant bits when you ask, instead of you pasting whole files into the chat.
The one-line idea: Claude tokens cost money; your computer's effort is free. So all the heavy lifting happens locally, and Claude only ever receives a tiny answer. Memorising a 500-page folder costs roughly zero chat tokens.
💬 See it in action
Once it's installed, you just talk to Claude normally:
You: Memorise everything in ~/Documents/contracts.
Claude: ✅ Digested 38 files → 421 facts across 7 themes. (took ~30s, all local)
You: Which contracts mention an auto-renewal clause, and when do they renew?
Claude: Three do — the Globex MSA (renews 1 Jan, 60-day notice), … [cites each source]
You: Open the mind map.
Claude: Here's your interactive map: /…/mindmap.html
Nothing left your machine. Claude never saw the 38 files — only the small answers.
🚀 Get started in about a minute
You need Python 3.10 or newer (most Macs and Linux PCs already have it; Windows users can install it from python.org — tick "Add to PATH"). Everything else installs automatically the first time you use it.
Pick whichever matches how you use Claude:
▶ Claude Desktop (easiest — no terminal)
- Download memorised-them-all.mcpb from the latest release.
- Double-click it. Claude Desktop opens and offers to install the extension; click Install.
- Done. Start a chat and say "Memorise my Documents folder."
▶ Claude Code
claude
# then, inside Claude Code:
/plugin marketplace add GRU-953/memorised-them-all
/plugin install memorised-them-all
▶ Any other setup (pip)
pip install memorised-them-all
Then register it with Claude — easiest is to let it configure itself:
mta setup-claude # writes the MCP server into Claude Desktop (and Claude Code) config
(The install.sh installer runs this for you automatically.) Or add it by hand — it just runs mta serve:
{
"mcpServers": {
"memorised-them-all": { "command": "mta", "args": ["serve"] }
}
}
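If you would rather script the by-hand step, the merge can be sketched in a few lines of Python. This is a minimal sketch, not what mta setup-claude actually does: the function name register_server is hypothetical, and the location of your Claude config file is left for you to supply.

```python
import json
from pathlib import Path


def register_server(config_path: Path) -> dict:
    """Merge the memorised-them-all entry into an MCP config file, keeping other servers."""
    # Load the existing config if there is one; otherwise start from an empty object.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as the hand-written snippet: the server is just `mta serve`.
    servers["memorised-them-all"] = {"command": "mta", "args": ["serve"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Call it with the path to your own config file. Existing entries under mcpServers are left untouched.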
💡 Prefer Homebrew or Docker?
brew install GRU-953/memorised-them-all/mta, or see Run it in Docker. All paths give you the same thing.
Do I need to install AI models?
No — it works the moment it's installed. Out of the box it uses fast, built-in techniques (no downloads, fully offline).
For sharper summaries and search, it can optionally use a free local AI model via Ollama (still 100% on your machine). If Ollama is present it's used automatically; if not, you're never blocked. To check what you have and get one-line setup tips, run:
mta doctor
📁 Your first memory
- Tell Claude what to remember. Point it at a folder, a file, or a pattern:
  "Memorise everything in ~/Documents/research."
  (Behind the scenes Claude calls the digest tool. The first run may take a little longer while it sets things up.)
- Ask away, in plain language:
  "What do my documents say about the Q3 budget?"
  "Summarise everything about Project Apollo."
  "Who is mentioned most often, and in which files?"
- Explore visually (optional):
  "Open the mind map." opens an interactive, offline map of how everything connects.
- Keep it tidy by separating memories per topic with projects:
  "Memorise ~/work/clientA into a project called clientA."
  "Using the clientA project, what were the agreed deliverables?"
Your memory lives in a folder on your computer (~/.memorised-them-all by default) and persists between chats. Re-running "memorise" updates it.
🎯 What can I use it for?
- 📚 Research & study — digest a pile of papers or a textbook, then ask for explanations, comparisons, and citations.
- 📑 Contracts & policies — load all your agreements and ask "which ones auto-renew?" or "what are the termination clauses?"
- 🗂️ Personal knowledge base — point it at years of notes, receipts, or manuals and actually find things.
- 🎧 Meetings & lectures — drop in audio recordings; it transcribes locally and remembers the content.
- 🖼️ Scanned documents & images — it reads text from photos and scans (OCR) so they become searchable.
- 🔒 Sensitive material — legal, medical, financial, or confidential files that must never leave your machine.
✨ Why is it free of token cost?
When you normally share a document with Claude, the whole thing is sent into the conversation, and you pay (in tokens) for every word, every time. A few big PDFs can fill your entire context window.
Memorised them All flips that around:
- Converting, reading, and summarising your files happens on your computer.
- Claude only ever receives a tiny result — a count, a short summary, or a few relevant snippets (capped small).
- So memorising a giant folder, and asking about it again and again, stays near-zero context tokens.
It's the difference between mailing someone an entire library versus asking a librarian a question.
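To put rough numbers on that comparison, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 4-characters-per-token rule of thumb, the page size, and the snippet cap are not measurements of this tool.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb for English text; an assumption


def estimate_tokens(num_chars: int) -> int:
    """Approximate token count by ceiling-dividing characters by CHARS_PER_TOKEN."""
    return -(-num_chars // CHARS_PER_TOKEN)


# Hypothetical sizes: 500 pages at about 3,000 characters each.
pasted = estimate_tokens(500 * 3_000)
# Hypothetical recall result: 5 snippets capped at 400 characters each.
recalled = estimate_tokens(5 * 400)

print(f"pasting the folder: ~{pasted:,} tokens")
print(f"one recall answer:  ~{recalled:,} tokens")
```

Under these assumed sizes, pasting costs hundreds of thousands of tokens each time, while a capped recall answer costs a few hundred.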
🔒 Is my data private?
Yes — that's the whole point. By default:
- ✅ 100% local. Your files are read, converted, and remembered on your own machine. Their contents are never sent to Claude's servers, to us, or to anyone.
- ✅ No telemetry, no tracking, no accounts, no API keys.
- ✅ Works fully offline. Disconnect the internet and it still memorises and answers.
- ✅ Open source (MIT). You (or anyone) can read exactly what it does.
The only times anything touches the network are clearly optional and on your command: installing/updating software, an occasional check for a new version (turn off with MTA_AUTO_UPDATE=off), or if you explicitly choose to point it at a remote AI backend. With the defaults, your documents stay with you. See SECURITY.md for the full threat model.
❓ Questions & troubleshooting
Is it really free?
Yes — the software is free and open-source (MIT), and it runs on your own computer, so there are no per-use fees or token charges. The only "cost" is a little of your computer's time and disk space.
Claude says it doesn't have the tool / it's not showing up.
Make sure the extension/plugin is installed and enabled, then fully restart Claude Desktop (or your Claude Code session). To confirm the engine itself works, run mta status in a terminal — it should print your setup. Still stuck? Run mta doctor.
The first "memorise" was slow.
The first run sets things up (and, if you have Ollama, may load a model). Later runs are much faster, and re-memorising only processes what changed. Add fast ("memorise … in fast mode") to skip the AI step entirely for a quick, fully-deterministic pass.
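One common way to process only what changed is to compare content hashes between runs. The sketch below illustrates that general technique. It is an assumption about how such a check could work, not a description of this tool's internals, and changed_files is a hypothetical name.

```python
import hashlib


def changed_files(current: dict[str, bytes], seen_hashes: dict[str, str]) -> list[str]:
    """Return names whose content hash differs from the last run, updating seen_hashes."""
    changed = []
    for name, data in current.items():
        digest = hashlib.sha256(data).hexdigest()
        if seen_hashes.get(name) != digest:
            changed.append(name)
            seen_hashes[name] = digest  # remember the new content for the next run
    return changed
```

On the first call every file counts as changed. On later calls only new or edited files are returned.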
What files can it read?
PDFs, Word/Excel/PowerPoint, plain text/Markdown, HTML, CSV/JSON/XML, RTF, EPUB, common images (via OCR), and audio (transcribed locally). Beyond those, any other text-based file is digested too (source code, .log, .ini, .tex, …); only genuine binaries are skipped — so a whole folder gets captured. Ask Claude to "list what's digestible in this folder" to see.
What languages does it understand?
Text in any language works (it's Unicode throughout). For scanned documents and images, OCR runs English + Bangla by default (eng+ben); set MTA_OCR_LANG to other Tesseract codes (e.g. eng+hin+ara). Any language pack you don't have installed is dropped automatically, so it never errors.
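The "dropped automatically" behaviour can be pictured as a simple filter over the requested codes. This is a sketch under assumptions: usable_ocr_langs is a hypothetical helper, and the set of installed packs is passed in by hand (on a real system it could come from tesseract --list-langs).

```python
def usable_ocr_langs(requested: str, installed: set[str]) -> str:
    """Keep only the requested Tesseract codes that are installed, preserving order."""
    kept = [code for code in requested.split("+") if code and code in installed]
    return "+".join(kept)


# Example: Hindi is requested but its pack is not installed, so it is dropped.
print(usable_ocr_langs("eng+hin+ara", {"eng", "ara"}))
```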
How do I delete a memory?
Tell Claude "forget the clientA project" (it asks you to name the project, on purpose). It deletes that project's memory from your disk — irreversibly.
Does it need an internet connection?
No. It's built to work completely offline. Internet is only used for optional, opt-in things like installing updates.
🧰 The tools Claude gets
Once installed, Claude can use these eight tools on your behalf (you just talk normally — Claude picks the right one):
| Tool | What it does for you |
|---|---|
| digest | Reads files/folders and builds (or updates) the memory. |
| recall | Answers a question from memory with a few relevant, cited snippets. |
| memory_overview | Gives the big picture — a synopsis and the main themes. |
| list_digestible | Shows which files in a folder it can read. |
| open_mindmap | Opens the interactive, offline mind map. |
| export_memory | Saves the memory as portable Markdown notes you can keep or share. |
| memory_status | Reports your local setup (models, tools, projects). |
| forget | Deletes a project's memory (you name it explicitly). |
Every tool returns only small results — never your documents' contents.
🛠️ For power users
You don't need any of this to use the app — but it's here if you want it.
Command line (no Claude needed)
The same engine ships as an mta command:
mta digest ~/Documents/research # build/update memory (--fast to skip the LLM)
mta recall "what about the Q3 budget?" # query it
mta overview # synopsis + themes
mta status # local stack health · mta doctor (fix deps)
mta export ./notes # export portable Markdown
mta mindmap --open # open the mind map
Use it from other AI apps (OpenAI, Gemini, plain HTTP)
The same eight tools can be served beyond Claude:
mta serve --http # MCP over HTTP (loopback + an auto-generated bearer token)
mta serve --rest # plain JSON: POST http://127.0.0.1:8765/tools/<name>
mta export-schema # tool schemas as OpenAI / Gemini / OpenAPI 3.1 (no drift)
mta recipes # copy-paste connection snippets for every client
Both HTTP modes are loopback-only by default and require a bearer token. See mta recipes for ready-to-paste setup.
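As an illustration of the REST mode, the sketch below builds a request with Python's standard library. The URL shape and bearer token come from the text above. The JSON body (a "question" field) and the placeholder token are assumptions, so check mta recipes or mta export-schema for the real argument names.

```python
import json
import urllib.request


def build_tool_request(tool: str, arguments: dict, token: str,
                       base_url: str = "http://127.0.0.1:8765") -> urllib.request.Request:
    """Build a POST to /tools/<name> carrying a bearer token and a JSON body."""
    return urllib.request.Request(
        f"{base_url}/tools/{tool}",
        data=json.dumps(arguments).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# "question" is an assumed argument name; the token is a placeholder.
req = build_tool_request("recall", {"question": "what about the Q3 budget?"},
                         token="PASTE-TOKEN-HERE")

# To send it while `mta serve --rest` is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```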
Run it in Docker
A multi-arch image (amd64 + arm64) is published to GHCR:
docker run -d --name mta -p 127.0.0.1:8765:8765 -v mta-data:/data \
ghcr.io/gru-953/memorised-them-all:latest
docker logs mta # copy the printed bearer token + the `claude mcp add …` line
It serves the tools over MCP HTTP and keeps memory in the /data volume. Mount documents read-only (-v /path/to/docs:/docs:ro) and digest the in-container path.
Use a different (or remote) AI model
By default the optional AI step runs on local Ollama. To use another local server (LM Studio, llama.cpp, vLLM, …) set MTA_BACKEND:
MTA_BACKEND=lmstudio mta digest ~/docs # OpenAI-compatible server on :1234
MTA_BACKEND=openai MTA_BACKEND_URL=http://127.0.0.1:8080/v1 mta digest ~/docs
Set MTA_EXTRACT_MODEL / MTA_EMBED_MODEL to that server's model names. Pointing it at a non-local URL sends content off your machine — that's your explicit choice (you'll get a one-time warning).
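Before setting MTA_BACKEND_URL, you can check for yourself whether a URL points at your own machine. The helper below is a hypothetical sketch of such a loopback check; it is not the tool's own warning logic.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_backend(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other DNS name is treated as remote


print(is_local_backend("http://127.0.0.1:8080/v1"))
print(is_local_backend("https://api.example.com/v1"))
```

Note that an address on your local network (for example 192.168.x.x) is not loopback, so this check treats it as remote.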
Configuration
Everything has sensible defaults. Common knobs (set as environment variables):
| Variable | Default | Meaning |
|---|---|---|
| MTA_HOME | ~/.memorised-them-all | where memory is stored |
| MTA_OCR_LANG | eng+ben | OCR languages (Tesseract codes; missing packs dropped automatically) |
| MTA_NO_OLLAMA | unset | force fully-offline mode (no AI model) |
| MTA_AUTO_UPDATE | on | daily update check (off to disable) |
| MTA_PROFILE | unset | tuning preset: laptop · workstation · server · offline |
| MTA_BACKEND / MTA_BACKEND_URL | auto | use another local model server (see above) |
| MTA_HTTP_* | off | options for the opt-in HTTP/REST servers |
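To make the defaults concrete, here is a small sketch of reading a few of these variables with their documented defaults. The function load_settings is hypothetical. It also assumes that any value of MTA_NO_OLLAMA counts as "set", which the table does not confirm.

```python
import os
from pathlib import Path


def load_settings(env=None) -> dict:
    """Read a few MTA_* variables, falling back to the defaults listed in the table."""
    env = os.environ if env is None else env
    return {
        "home": Path(env.get("MTA_HOME", "~/.memorised-them-all")).expanduser(),
        "ocr_lang": env.get("MTA_OCR_LANG", "eng+ben"),
        "auto_update": env.get("MTA_AUTO_UPDATE", "on").lower() != "off",
        # Assumption: the variable merely being present disables Ollama.
        "use_ollama": "MTA_NO_OLLAMA" not in env,
    }


print(load_settings({"MTA_AUTO_UPDATE": "off"}))
```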
How it works under the hood
convert (files → Markdown, locally) → extract (entities, relations, facts) → graph (build + detect communities/themes) → summarise (layered: per-theme + a global synopsis) → embed (vectors for search) → materialise (memory.md, per-doc notes, mind map). recall embeds your question and returns the closest, capped, cited snippets. With no AI model available it falls back to fast classical techniques, so a digest always succeeds. See CHANGELOG.md and SECURITY.md for details.
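The recall step can be illustrated with a toy version: embed the question, rank stored facts by similarity, and return a few capped snippets with their sources. This sketch swaps in bag-of-words counts and cosine similarity for real embedding vectors, and its facts and file names are made up. Treat it as a picture of the idea, not the project's implementation.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for real vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recall(question: str, facts: list[tuple[str, str]], top_k: int = 2, max_chars: int = 120):
    """Return the top_k closest facts as (source, capped snippet) pairs."""
    q = embed(question)
    ranked = sorted(facts, key=lambda f: cosine(q, embed(f[1])), reverse=True)
    return [(source, text[:max_chars]) for source, text in ranked[:top_k]]


# Made-up facts, each paired with the (hypothetical) file it came from.
facts = [
    ("globex_msa.pdf", "The Globex MSA renews on 1 Jan with 60-day notice."),
    ("budget.xlsx", "The Q3 budget allocates funds to hiring."),
    ("notes.md", "Lunch menu for the offsite."),
]
print(recall("What is the Q3 budget?", facts, top_k=1))
```

The top_k and max_chars caps are what keep the answer small no matter how large the memory grows.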
💻 Platforms
macOS (Apple-silicon optimised), Linux, and Windows · Python 3.10–3.12 · tested on all three in CI.
🙏 Credits & license
Built on the shoulders of MarkItDown, Ollama, NetworkX, and the Model Context Protocol. Optional community-detection extras (python-igraph, leidenalg) are GPL-licensed and not installed by the MIT core. See ACKNOWLEDGEMENTS.md.
MIT licensed · made by GRU-953. Issues and contributions welcome — start with SECURITY.md for the security model.