
Local, token-free file digestion → knowledge-graph memory for Claude. Convert any attachment to Markdown and digest it into a graph + exportable memory, entirely on-device. Tuned for Apple silicon.


Memorised them All


Convert any attachment to Markdown and digest it into token-free knowledge-graph memory for Claude — 100% locally.


by GRU-953 · Claude Desktop + Claude Code · free & open-source · runs entirely on your machine


The idea. Claude tokens are expensive; your Mac's compute is free. So every heavy step — converting documents, extracting knowledge, building the graph, embedding, summarising — runs locally. Claude only ever issues a tiny tool call and gets back compact metadata or a small, relevant slice — never whole documents. Digesting a 500-page folder costs roughly zero context tokens.

Point it at a folder. It converts every attachment to Markdown, then digests it into a layered knowledge graph — a global synopsis, per-theme summaries, per-document notes, an exportable Markdown bundle, and an offline interactive mind map — and lets Claude recall from it for next to nothing.

Contents

Why it's token-free · What you get · How it works · Install · Use it · Tools · Configuration · Apple silicon · Privacy · FAQ · Acknowledgements

Why it's token-free

Most "chat with your docs" tools stream document text into the model — you pay tokens to ingest and to recall. Memorised them All never does that:

| Step | Where it runs | Tokens to Claude |
|---|---|---|
| Convert PDF/Office/image/audio → Markdown | your Mac (MarkItDown, Tesseract, Whisper, Ollama) | 0 |
| Extract entities · relations · facts | your Mac (local LLM, classical fallback) | 0 |
| Embed · resolve · build graph · summarise themes | your Mac (Ollama + NetworkX) | 0 |
| digest tool result | | only counts & paths |
| recall tool result | | a small, relevant slice (not documents) |

What you get

  • 📄 Universal local conversion — PDF, Word, Excel, PowerPoint, HTML, EPub, Outlook .msg, CSV/JSON/XML, images (OCR + vision captioning), audio (on-device transcription) → clean Markdown.
  • 🕸️ A layered knowledge graph — entities, typed relations and atomic facts, grouped into communities (themes) with local summaries. Three memory layers: global synopsis → theme summaries → provenance-tracked facts.
  • 📝 Exportable Markdown memory: memory.md, one note per source document, and graph.json. Copy them anywhere.
  • 🧭 Offline interactive mind map — a single self-contained mindmap.html (Cytoscape inlined, no network) you can open in any browser.
  • 🔁 Reusable, named projects — keep separate memories per body of work.
  • ⚙️ Auto-installing & auto-updating — pulls the latest MarkItDown from upstream and keeps dependencies current. Starts the model server on demand and stops it after 5 minutes idle.
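The three memory layers listed above could be modelled roughly as follows. This is a hypothetical sketch: the class and field names (Fact, Theme, Memory, source) are illustrative assumptions and do not describe the project's actual graph.json schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fact:
    """An atomic fact with provenance (hypothetical shape)."""
    subject: str
    relation: str
    obj: str
    source: str  # basename of the document the fact came from


@dataclass
class Theme:
    """A community of related facts plus a locally generated summary."""
    name: str
    summary: str
    facts: list[Fact] = field(default_factory=list)


@dataclass
class Memory:
    """Global synopsis -> theme summaries -> provenance-tracked facts."""
    synopsis: str
    themes: list[Theme] = field(default_factory=list)

    def facts_from(self, source: str) -> list[Fact]:
        """Return every fact attributed to one source document."""
        return [f for t in self.themes for f in t.facts if f.source == source]
```

Keeping the source on every fact is what makes a recalled slice citable without shipping the document itself.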

How it works

attachments ─► CONVERT ─► SEGMENT ─► EMBED ─► EXTRACT ─► RESOLVE ─► GRAPH + COMMUNITIES ─► MATERIALISE
  pdf/docx/   MarkItDown  structure  nomic-   local LLM  canonical  NetworkX +             graph.json
  xlsx/img/   +Tesseract  +semantic  embed-   triples +  entities   Leiden / Louvain        memory.md
  audio/...   +Whisper    chunking   text     facts      (embed +   community summaries     memory/<doc>.md
              +Ollama                          (+class-   fuzzy)     (local LLM)             mindmap.html
               vision                           ical                                         vectors store
                                                fallback)
                                                                              │
   recall("…")  ◄── embed query locally ──  return ONLY a small relevant slice (themes + facts + provenance)

Everything between attachments and recall happens on your machine. The local LLM step has a dependency-free classical fallback, so a digest always succeeds — even offline, even before any model is downloaded — and gets sharper once the models are present.
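The recall step at the bottom of the diagram can be pictured as a nearest-neighbour lookup over locally stored vectors. The sketch below is a minimal illustration in plain Python, not the project's code: recall_slice, cosine, and the (text, vector) store layout are assumptions made for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_slice(query_vec, store, k=3):
    """Return the k stored items most similar to the query, best first.

    `store` is a hypothetical list of (text, vector) pairs standing in for
    the local vector store; only the short texts are returned.
    """
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

Only the top k short texts leave the function, which is the property that keeps a recall result small regardless of corpus size.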

Install

Requirements: macOS (Apple silicon recommended) · Homebrew · Python ≥ 3.10. The installer fetches everything else for you.

Claude Desktop (one click)

  1. Download memorised-them-all.mcpb from the latest release.
  2. Double-click it (or Settings → Extensions → Install).
  3. On first launch it bootstraps the local stack automatically.

Claude Code (CLI)

/plugin marketplace add GRU-953/memorised-them-all
/plugin install memorised-them-all

As a plain CLI / from PyPI

pip install memorised-them-all      # installs the `mta` command
mta status                          # check the local stack
mta digest ~/Documents/research     # build memory from a folder

From source / Homebrew

git clone https://github.com/GRU-953/memorised-them-all
cd memorised-them-all && ./install.sh        # idempotent: brew apps + venv + models

# or via the tap
brew install GRU-953/memorised-them-all/mta

The installer adds (only what's missing): Ollama, Tesseract (+ all OCR languages), ffmpeg, a Python virtualenv with the latest MarkItDown from upstream, and the local models qwen2.5:7b, nomic-embed-text, moondream (~6–8 GB, configurable — pulled in the background).

Use it

In Claude (Desktop or Code), just ask:

"Memorise everything in ~/Documents/grant-proposals." · "What did my documents say about the Aurora timeline?" · "Open the mind map." · "Export the memory to ~/Desktop/aurora-memory."

Or use the slash commands: /memorise, /recall, /memory-map, /memory-status, /export-memory.

From the terminal:

mta digest ~/Documents/research --project aurora
mta recall "who leads the project and who are the partners?" --project aurora
mta overview --project aurora
mta mindmap --project aurora --open
mta export ~/Desktop/aurora-memory --project aurora
mta update            # pull the latest MarkItDown + dependencies

MCP tools

| Tool | What it does | Returns |
|---|---|---|
| digest(paths, project?, reset?) | convert + digest files/dirs/globs (accumulates into the project; reset=true starts fresh) | counts, paths, graph stats |
| recall(query, project?, k?) | answer from memory | a small, citable slice |
| memory_overview(project?) | synopsis + themes | compact overview |
| export_memory(dest, project?) | export portable Markdown | files written |
| list_digestible(directory) | list convertible files | paths + sizes |
| memory_status() | local stack health | versions, models, projects |
| open_mindmap(project?) | offline mind map | file path |

Every result is metadata or a small slice — document contents never return to the conversation.
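One way to guarantee that a tool result stays small is to enforce a character budget before anything is returned. The following is a sketch under assumptions: cap_slice and the 2000-character default are hypothetical and are not taken from the project.

```python
def cap_slice(items: list[str], max_chars: int = 2000) -> tuple[list[str], bool]:
    """Keep items in order until the character budget is spent.

    Returns the kept items and a flag saying whether anything was dropped.
    An item that would exceed the remaining budget is cut to fit, so the
    total never exceeds max_chars.
    """
    kept, used = [], 0
    for item in items:
        remaining = max_chars - used
        if remaining <= 0:
            return kept, True
        if len(item) > remaining:
            kept.append(item[:remaining])
            return kept, True
        kept.append(item)
        used += len(item)
    return kept, False
```

Returning the truncation flag alongside the slice lets the caller report that a cap was hit instead of silently dropping text.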

Configuration

All optional; sensible defaults. Set via environment (CLI) or the extension settings (Desktop).

| Variable | Default | Meaning |
|---|---|---|
| MTA_HOME | ~/.memorised-them-all | where memories are stored |
| MTA_EXTRACT_MODEL | qwen2.5:7b | local LLM for extraction & summaries |
| MTA_EMBED_MODEL | nomic-embed-text | local embedding model |
| MTA_VISION_MODEL | moondream | image captioning |
| MTA_OCR_LANG | eng | Tesseract languages, e.g. eng+ben |
| MTA_WHISPER_MODEL | base | on-device transcription model |
| MTA_IDLE | 300 | seconds of idle before Ollama is stopped |
| MTA_WORKERS | 0 (auto) | parallel conversion workers |
| MTA_EXTRACT_WORKERS | 0 (auto) | parallel extraction workers (memory-aware: 1–3 by RAM) |
| MTA_MAX_CHUNKS | 1500 | safety cap on chunks per digest (truncation is reported) |
| MTA_MAX_FILE_MB | 200 | skip files larger than this before reading (0 disables) |
| MTA_COMMUNITY_ALGO | auto | leiden · louvain · greedy |
| MTA_AUTO_UPDATE | on | auto-update MarkItDown & dependencies |
| MTA_FAST | off | fast mode: skip the LLM (classical extraction, deterministic, keeps embeddings) |
| MTA_NO_OLLAMA | unset | hard offline switch (classical + hashing) |

Accuracy vs speed. The default path uses the local LLM for the highest extraction accuracy. Fast mode (MTA_FAST=on, mta digest --fast, or the fast=true tool arg) skips the LLM for a fully deterministic, ~100× faster digest that still builds the graph and keeps semantic recall — ideal for large or frequently-updated corpora.
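For scripting around the CLI, the documented variables and defaults can be read as shown below. This is an illustrative sketch, not the project's loader: Settings, load_settings, and the set of accepted truthy spellings are assumptions; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from pathlib import Path


def _flag(value: str) -> bool:
    """Interpret common truthy spellings; the accepted set is an assumption."""
    return value.strip().lower() in {"1", "on", "true", "yes"}


@dataclass(frozen=True)
class Settings:
    home: Path
    extract_model: str
    embed_model: str
    idle_seconds: int
    max_chunks: int
    max_file_mb: int
    fast: bool
    no_ollama: bool


def load_settings(env=None) -> Settings:
    """Build settings from MTA_* variables, using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        home=Path(os.path.expanduser(env.get("MTA_HOME", "~/.memorised-them-all"))),
        extract_model=env.get("MTA_EXTRACT_MODEL", "qwen2.5:7b"),
        embed_model=env.get("MTA_EMBED_MODEL", "nomic-embed-text"),
        idle_seconds=int(env.get("MTA_IDLE", "300")),
        max_chunks=int(env.get("MTA_MAX_CHUNKS", "1500")),
        max_file_mb=int(env.get("MTA_MAX_FILE_MB", "200")),
        fast=_flag(env.get("MTA_FAST", "off")),
        # "unset" by default: any non-empty value counts as set here (assumption).
        no_ollama=bool(env.get("MTA_NO_OLLAMA", "")),
    )
```

Passing a plain dict instead of os.environ makes it easy to check a configuration without touching the real environment, for example load_settings({"MTA_FAST": "on"}).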

Apple silicon first

  • Conversion fans out across performance cores (hw.perflevel0.physicalcpu), with each worker's native math libraries pinned to one thread to avoid oversubscription on the unified-memory architecture.
  • GPU-accelerated Whisper via Apple MLX (mlx-whisper), with a faster-whisper CPU fallback.
  • Worker count is unified-memory-aware so it won't thrash a 16 GB Mac.

It runs on Intel Macs and Linux too — those paths simply use portable defaults.
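The performance-core sizing and thread pinning described above might look like the sketch below. It assumes the sysctl binary is on PATH on macOS; the function names and the exact list of thread-count variables are illustrative choices, not confirmed details of the project.

```python
import os
import platform
import subprocess


def performance_cores() -> int:
    """Return the performance-core count on Apple silicon, else a portable guess."""
    if platform.system() == "Darwin":
        try:
            out = subprocess.run(
                ["sysctl", "-n", "hw.perflevel0.physicalcpu"],
                capture_output=True, text=True, check=True, timeout=5,
            ).stdout.strip()
            return max(1, int(out))
        except (OSError, ValueError, subprocess.SubprocessError):
            pass  # key not available (e.g. Intel Macs); fall through
    return max(1, os.cpu_count() or 1)


def worker_env(threads: int = 1) -> dict[str, str]:
    """Environment for a worker with native math libraries pinned to `threads`."""
    env = dict(os.environ)
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS",
                "MKL_NUM_THREADS", "VECLIB_MAXIMUM_THREADS"):
        env[var] = str(threads)
    return env
```

Pinning each worker to one native thread means N workers use about N cores, rather than N workers each spawning a full thread pool.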

Privacy

100% local. No cloud APIs, no telemetry, no API keys. Your documents, the graph, the embeddings, and the memory files never leave your machine. The only network access is (a) downloading open-source dependencies/models on install and (b) the once-a-day dependency update check (disable with MTA_AUTO_UPDATE=off).

FAQ

Does it really cost no tokens? Conversion and digestion cost zero Claude tokens. recall returns a small slice (a handful of summaries/facts), so answers are cheap — far cheaper than pasting documents into the chat.

What if I have no GPU / no models / I'm offline? It still works. The classical extractor and hashing embeddings keep the pipeline running; quality improves once Ollama and the models are available.
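To give a feel for what a model-free "hashing embedding" is, here is a minimal sketch of the general hashing trick. It is not the project's implementation: the function name, the 256-dimension default, and the choice of BLAKE2b are assumptions for illustration.

```python
import hashlib
import math
import re


def hashing_embed(text: str, dim: int = 256) -> list[float]:
    """Deterministic bag-of-words embedding via the hashing trick."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        index = h % dim
        sign = 1.0 if (h >> 63) & 1 else -1.0  # signed hashing reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec
```

Vectors like this capture word overlap rather than meaning, which is consistent with quality improving once a real embedding model is available.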

Is my existing Ollama affected? No. If Ollama is already running, it's reused and left alone. Only an instance this tool starts is stopped on idle.
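The "reuse if running, stop only what we started" rule can be expressed as a small piece of state. This sketch is hypothetical: the OllamaLifecycle class and its method names are invented for illustration, and actually starting or stopping a server is left out.

```python
import time


class OllamaLifecycle:
    """Tracks whether this process started the server and when it was last used."""

    def __init__(self, idle_seconds: float = 300, clock=time.monotonic):
        self.idle_seconds = idle_seconds
        self.clock = clock
        self.owned = False  # True only if we launched the server ourselves
        self.last_used = clock()

    def attach(self, already_running: bool) -> None:
        """Reuse a running server, or record that we started our own."""
        self.owned = not already_running
        self.touch()

    def touch(self) -> None:
        """Record activity so the idle timer restarts."""
        self.last_used = self.clock()

    def should_stop(self) -> bool:
        """Stop only a server we own, and only after the idle window has passed."""
        return self.owned and (self.clock() - self.last_used) >= self.idle_seconds
```

Injecting the clock keeps the idle logic testable without waiting five minutes.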

Where are my files? Under MTA_HOME/projects/<project>/, which holds graph.json, memory.md, memory/, and mindmap.html. export_memory copies them anywhere.

Modes & performance

Two digest modes — the default favours accuracy & consistency, fast mode favours speed & determinism:

| | Default (accurate) | Fast (--fast / MTA_FAST=on) |
|---|---|---|
| Extraction | local LLM (qwen2.5) | classical (deterministic) |
| Theme summaries | local LLM | deterministic fact-join |
| Embeddings / recall | local (nomic) | local (nomic) |
| Reproducible | per-model | byte-identical across runs |
| Relative speed | baseline | ~100× faster |
| Best for | highest fidelity | large or frequently-refreshed corpora |

Both are token-free and fully local. Digestion is incremental — pointing digest at another folder extends the same project; reset=true starts fresh. Degenerate/repetitive content is de-duplicated and a reported MTA_MAX_CHUNKS cap keeps even pathological inputs bounded.
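De-duplication followed by a reported cap can be sketched in a few lines. The function name, the normalisation rule (case and whitespace), and the report keys below are assumptions for illustration; only the default of 1500 comes from MTA_MAX_CHUNKS.

```python
import hashlib


def bound_chunks(chunks: list[str], max_chunks: int = 1500) -> tuple[list[str], dict]:
    """Drop repeated chunks, then cap the total and report what happened."""
    seen, unique = set(), []
    for chunk in chunks:
        # Normalise case and whitespace so trivially repeated text collapses.
        key = hashlib.sha256(" ".join(chunk.lower().split()).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    kept = unique[:max_chunks]
    report = {
        "duplicates_removed": len(chunks) - len(unique),
        "truncated": len(unique) > max_chunks,
        "dropped_by_cap": len(unique) - len(kept),
    }
    return kept, report
```

De-duplicating before applying the cap means repetitive input does not crowd out distinct content.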

Platform support

Apple M-series is the primary, most-optimised target. Other platforms are supported with portable fallbacks:

| Platform | Status | Notes |
|---|---|---|
| macOS (Apple silicon) | ✅ optimised | performance-core pool, MLX GPU Whisper, unified-memory-aware |
| macOS (Intel) | ✅ supported | physical-core sizing via psutil, CPU Whisper |
| Linux | ✅ supported | apt/dnf/pacman install paths, CUDA Whisper if a GPU is present |
| Windows | 🧪 experimental | pip install memorised-them-all then mta serve (or python launch.py from a clone). The one-click .mcpb bundle is macOS/Linux only (its launcher is bash); on Windows use pip. |

CI runs the offline test suite across Ubuntu, macOS, and Windows on Python 3.10 & 3.12.

Generated files & reuse

Each project under MTA_HOME/projects/<name>/ is self-contained and portable:

| File | What it is |
|---|---|
| graph.json | source of truth: nodes, edges, communities, layered summaries, stats (version-stamped; stores basenames, no absolute paths) |
| memory.md | compact, layered digest for reading / pasting |
| memory/<doc>.md | one note per source document |
| mindmap.html | offline interactive graph (Cytoscape inlined) |
| vectors.npz + vectors.json | local embeddings for recall |

A memory built once can be copied to another machine and reused read-only — recall and the mind map work with no rebuild. export_memory bundles all of the above (including the vector store) into a folder you choose.
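After copying a project to another machine, a quick check that nothing was left behind can save a confusing recall failure. The helper below is a hypothetical convenience, not part of the mta CLI; the file names are the ones listed in the table above.

```python
from pathlib import Path

# File names taken from the table above; memory/ is the per-document notes folder.
EXPECTED = ("graph.json", "memory.md", "memory", "mindmap.html", "vectors.npz", "vectors.json")


def missing_artifacts(project_dir) -> list[str]:
    """List the documented artifacts that are absent from a project folder."""
    root = Path(project_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

An empty list means every documented file is present; it says nothing about whether the contents are valid.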

Quality & testing

This project is exercised hard: a multi-format corpus (Office, PDF, scanned PDF, OCR images, audio), 14 regression tests (determinism, token-safety, fact attribution, accumulation, OCR, lifecycle, cross-platform), green CI on three OSes, and a multi-agent review pass covering accuracy, reliability, token-safety, reusability, cross-platform, and security. The token-free guarantee is enforced (recall slices are hard-capped) and the digest never returns document contents to the model.

Security & threat model

Memorised them All processes files you point it at — including, potentially, untrusted documents. It is hardened accordingly:

  • No shell injection / no curl | sh: all subprocesses use argv lists; the optional installer downloads to a temp file before executing.
  • Path-safe outputs: converted filenames are sanitised; same-named files in different folders get unique names (no silent overwrite); exports write only the memory artifacts to the destination you choose.
  • Bounded inputs: per-file size cap (MTA_MAX_FILE_MB), a reported chunk cap, and a decompression-bomb guard for archives (size, ratio, and nested-archive rejection).
  • Prompt-injection aware: extracted document text is wrapped as data in the local-LLM prompt. Note that theme/synopsis summaries are model output over your documents — treat them as you would any generated text. Recall results are hard-capped in size so a verbose or adversarial summary cannot bloat context.
  • No deserialization risk: JSON only; numpy loads with allow_pickle=False.
  • Local-only egress: the only network calls are localhost Ollama, dependency installs, and a throttled once-a-day GitHub update check (opt out with MTA_AUTO_UPDATE=off).
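As an illustration of the decompression-bomb guard mentioned in the list above (size, ratio, and nested-archive rejection), here is a minimal sketch for zip files using only the standard library. The limits, the suffix list, and the check_zip name are illustrative assumptions, not the project's actual values or code.

```python
import io
import zipfile

ARCHIVE_SUFFIXES = (".zip", ".tar", ".gz", ".tgz", ".bz2", ".xz", ".7z")


def check_zip(data: bytes, max_total: int = 500 * 1024 * 1024, max_ratio: float = 100.0) -> None:
    """Raise ValueError if a zip looks like a decompression bomb.

    Sizes come from the archive's own headers, so nothing is extracted here.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        total = 0
        for info in archive.infolist():
            if info.filename.lower().endswith(ARCHIVE_SUFFIXES):
                raise ValueError(f"nested archive rejected: {info.filename}")
            total += info.file_size
            if total > max_total:
                raise ValueError("uncompressed size exceeds limit")
            if info.file_size / max(info.compress_size, 1) > max_ratio:
                raise ValueError(f"compression ratio too high: {info.filename}")
```

Because header sizes can be forged, a real extractor should also count bytes while reading and abort once a limit is crossed; the header check is only a cheap first filter.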

To delete a memory, run mta forget --project <name> (or call the forget tool). The one-click .mcpb bundle targets macOS and Linux; on Windows, use pip install followed by mta serve.

Acknowledgements

Built on the shoulders of excellent open-source work — see ACKNOWLEDGEMENTS.md. In particular: Microsoft MarkItDown, Ollama, Tesseract, OpenAI Whisper / faster-whisper / Apple MLX, NetworkX, Leiden / igraph, and Cytoscape.js. Design inspiration from graphify and the author's own markitdown-mcp and mnemo-mcp.

License

MIT © 2026 Aninda Sundar Howlader (GRU-953).

