Local, token-free file digestion → knowledge-graph memory for Claude. Convert any attachment to Markdown and digest it into a graph + exportable memory, entirely on-device. Tuned for Apple silicon.

Memorised them All

Convert any attachment to Markdown and digest it into token-free knowledge-graph memory for Claude — 100% locally.

by GRU-953 · Claude Desktop + Claude Code · free & open-source · runs entirely on your machine


The idea. Claude tokens are expensive; your Mac's compute is free. So every heavy step — converting documents, extracting knowledge, building the graph, embedding, summarising — runs locally. Claude only ever issues a tiny tool call and gets back compact metadata or a small, relevant slice — never whole documents. Digesting a 500-page folder costs roughly zero context tokens.

Point it at a folder. It converts every attachment to Markdown, then digests it into a layered knowledge graph — a global synopsis, per-theme summaries, per-document notes, an exportable Markdown bundle, and an offline interactive mind map — and lets Claude recall from it for next to nothing.

Contents

Why it's token-free · What you get · How it works · Install · Use it · Tools · Configuration · Apple silicon · Privacy · FAQ · Acknowledgements

Why it's token-free

Most "chat with your docs" tools stream document text into the model — you pay tokens to ingest and to recall. Memorised them All never does that:

| Step | Where it runs | Tokens to Claude |
|---|---|---|
| Convert PDF/Office/image/audio → Markdown | your Mac (MarkItDown, Tesseract, Whisper, Ollama) | 0 |
| Extract entities · relations · facts | your Mac (local LLM, classical fallback) | 0 |
| Embed · resolve · build graph · summarise themes | your Mac (Ollama + NetworkX) | 0 |
| digest tool result | | only counts & paths |
| recall tool result | | a small, relevant slice (not documents) |

What you get

  • 📄 Universal local conversion — PDF, Word, Excel, PowerPoint, HTML, EPub, Outlook .msg, CSV/JSON/XML, images (OCR + vision captioning), audio (on-device transcription) → clean Markdown.
  • 🕸️ A layered knowledge graph — entities, typed relations and atomic facts, grouped into communities (themes) with local summaries. Three memory layers: global synopsis → theme summaries → provenance-tracked facts.
  • 📝 Exportable Markdown memory: memory.md, one note per source document, and graph.json. Copy them anywhere.
  • 🧭 Offline interactive mind map — a single self-contained mindmap.html (Cytoscape inlined, no network) you can open in any browser.
  • 🔁 Reusable, named projects — keep separate memories per body of work.
  • ⚙️ Auto-installing & auto-updating — pulls the latest MarkItDown from upstream and keeps dependencies current. Starts the model server on demand and stops it after 5 minutes idle.
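The three memory layers can be pictured as nested records. The sketch below is only an illustration of that shape: the class names, field names, and sample data are hypothetical and are not taken from the project's graph.json schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fact:
    """An atomic statement plus its provenance (hypothetical fields)."""
    text: str
    source: str                      # path of the document the fact came from
    entities: tuple[str, ...] = ()   # entities the fact mentions


@dataclass
class Theme:
    """A community of related entities with a locally written summary."""
    name: str
    summary: str
    facts: list[Fact] = field(default_factory=list)


@dataclass
class Memory:
    """Top layer: one global synopsis sitting above all themes."""
    synopsis: str
    themes: list[Theme] = field(default_factory=list)

    def drill_down(self, theme_name: str) -> list[Fact]:
        """Walk synopsis -> theme -> facts for a single theme."""
        for theme in self.themes:
            if theme.name == theme_name:
                return theme.facts
        return []


# Made-up sample data, for illustration only.
memory = Memory(
    synopsis="Planning documents for one project.",
    themes=[
        Theme(
            name="timeline",
            summary="Milestones and deadlines.",
            facts=[Fact("Kickoff is planned for Q3.", "plan.pdf", ("Aurora",))],
        )
    ],
)
```

Keeping the source path on every fact is what lets a recalled slice stay citable without carrying the document text along.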

How it works

attachments ─► CONVERT ─► SEGMENT ─► EMBED ─► EXTRACT ─► RESOLVE ─► GRAPH + COMMUNITIES ─► MATERIALISE
  pdf/docx/   MarkItDown  structure  nomic-   local LLM  canonical  NetworkX +             graph.json
  xlsx/img/   +Tesseract  +semantic  embed-   triples +  entities   Leiden / Louvain        memory.md
  audio/...   +Whisper    chunking   text     facts      (embed +   community summaries     memory/<doc>.md
              +Ollama                          (+class-   fuzzy)     (local LLM)             mindmap.html
               vision                           ical                                         vectors store
                                                fallback)
                                                                              │
   recall("…")  ◄── embed query locally ──  return ONLY a small relevant slice (themes + facts + provenance)

Everything between attachments and recall happens on your machine. The local LLM step has a dependency-free classical fallback, so a digest always succeeds — even offline, even before any model is downloaded — and gets sharper once the models are present.
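The fallback behaviour described above can be sketched as a try-then-degrade wrapper. This is a minimal illustration, not the project's extractor: `classical_extract`, `extract`, and the `co_occurs_with` relation are hypothetical names, and the "classical" step here is a plain regex co-occurrence heuristic chosen for brevity.

```python
import re
from itertools import combinations
from typing import Callable, Optional

Triple = tuple[str, str, str]

# Runs of capitalised words, e.g. "Project Aurora" or "Alice".
# Crude on purpose: sentence-initial words such as "The" also match.
_ENTITY = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def classical_extract(text: str) -> list[Triple]:
    """Dependency-free fallback: link entities that share a sentence."""
    triples: list[Triple] = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = sorted(set(_ENTITY.findall(sentence)))
        for a, b in combinations(entities, 2):
            triples.append((a, "co_occurs_with", b))
    return triples


def extract(
    text: str,
    llm_extract: Optional[Callable[[str], list[Triple]]] = None,
) -> list[Triple]:
    """Prefer the local LLM; fall back so extraction never fails outright."""
    if llm_extract is not None:
        try:
            return llm_extract(text)
        except Exception:
            pass  # model missing or server unreachable: use the fallback
    return classical_extract(text)
```

Passing the LLM call in as an argument keeps the fallback path testable without any model installed.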

Install

Requirements: macOS (Apple silicon recommended) · Homebrew · Python ≥ 3.10. The installer fetches everything else for you.

Claude Desktop (one click)

  1. Download memorised-them-all.mcpb from the latest release.
  2. Double-click it (or Settings → Extensions → Install).
  3. On first launch it bootstraps the local stack automatically.

Claude Code (CLI)

/plugin marketplace add GRU-953/memorised-them-all
/plugin install memorised-them-all

As a plain CLI / from PyPI

pip install memorised-them-all      # installs the `mta` command
mta status                          # check the local stack
mta digest ~/Documents/research     # build memory from a folder

From source / Homebrew

git clone https://github.com/GRU-953/memorised-them-all
cd memorised-them-all && ./install.sh        # idempotent: brew apps + venv + models

# or via the tap
brew install GRU-953/memorised-them-all/mta

The installer adds (only what's missing): Ollama, Tesseract (+ all OCR languages), ffmpeg, a Python virtualenv with the latest MarkItDown from upstream, and the local models qwen2.5:7b, nomic-embed-text, moondream (~6–8 GB, configurable — pulled in the background).
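The "only what's missing" behaviour amounts to probing for each tool before installing it. A rough sketch of that check, assuming the three tools are discoverable on PATH under the command names `ollama`, `tesseract`, and `ffmpeg`; the function name `missing_tools` is hypothetical and install.sh may work differently.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed command names for the tools the installer manages.
REQUIRED_TOOLS = ("ollama", "tesseract", "ffmpeg")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools not found on PATH, preserving input order."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # An empty list means a second run would have nothing to install.
    print(missing_tools())
```

Because the check is a pure lookup, running it repeatedly changes nothing, which is the property that makes an installer idempotent.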

Use it

In Claude (Desktop or Code), just ask:

"Memorise everything in ~/Documents/grant-proposals." "What did my documents say about the Aurora timeline?" "Open the mind map." · "Export the memory to ~/Desktop/aurora-memory."

Or use the slash commands: /memorise, /recall, /memory-map, /memory-status, /export-memory.

From the terminal:

mta digest ~/Documents/research --project aurora
mta recall "who leads the project and who are the partners?" --project aurora
mta overview --project aurora
mta mindmap --project aurora --open
mta export ~/Desktop/aurora-memory --project aurora
mta update            # pull the latest MarkItDown + dependencies

MCP tools

| Tool | What it does | Returns |
|---|---|---|
| digest(paths, project?, reset?) | convert + digest files/dirs/globs | counts, paths, graph stats |
| recall(query, project?, k?) | answer from memory | a small, citable slice |
| memory_overview(project?) | synopsis + themes | compact overview |
| export_memory(dest, project?) | export portable Markdown | files written |
| list_digestible(directory) | list convertible files | paths + sizes |
| memory_status() | local stack health | versions, models, projects |
| open_mindmap(project?) | offline mind map | file path |

Every result is metadata or a small slice — document contents never return to the conversation.
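One way to picture "a small slice" is a top-k ranking over stored facts that returns only text and provenance. The sketch below assumes facts are kept as dicts with `text`, `source`, and `vector` keys; those keys, the function names, and the toy vectors are illustrative assumptions, not the tool's storage format.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_slice(query_vec: Sequence[float], facts: list[dict], k: int = 3) -> list[dict]:
    """Rank facts by similarity to the query and keep only the top k.

    Vectors are dropped from the result so that only short text and its
    source path would be handed back to the conversation.
    """
    ranked = sorted(facts, key=lambda f: cosine(query_vec, f["vector"]), reverse=True)
    return [{"text": f["text"], "source": f["source"]} for f in ranked[:k]]


# Toy two-dimensional vectors standing in for real embeddings.
facts = [
    {"text": "fact A", "source": "a.pdf", "vector": [1.0, 0.0]},
    {"text": "fact B", "source": "b.pdf", "vector": [0.0, 1.0]},
    {"text": "fact C", "source": "c.pdf", "vector": [0.9, 0.1]},
]
top = recall_slice([1.0, 0.0], facts, k=2)
```

The `k` cap is what bounds the token cost of a recall: the result size depends on `k`, not on how many documents were digested.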

Configuration

All optional; sensible defaults. Set via environment (CLI) or the extension settings (Desktop).

| Variable | Default | Meaning |
|---|---|---|
| MTA_HOME | ~/.memorised-them-all | where memories are stored |
| MTA_EXTRACT_MODEL | qwen2.5:7b | local LLM for extraction & summaries |
| MTA_EMBED_MODEL | nomic-embed-text | local embedding model |
| MTA_VISION_MODEL | moondream | image captioning |
| MTA_OCR_LANG | eng | Tesseract languages, e.g. eng+ben |
| MTA_WHISPER_MODEL | base | on-device transcription model |
| MTA_IDLE | 300 | seconds of idle before Ollama is stopped |
| MTA_WORKERS | 0 (auto) | parallel conversion workers |
| MTA_COMMUNITY_ALGO | auto | leiden · louvain · greedy |
| MTA_AUTO_UPDATE | on | auto-update MarkItDown & dependencies |
| MTA_NO_OLLAMA | unset | hard offline switch (classical + hashing) |
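For scripting around the CLI, the variables above can be read with their documented defaults. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical, only a subset of variables is shown, and treating any non-empty MTA_NO_OLLAMA value as "on" is a guess at how the switch is parsed.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    home: Path
    extract_model: str
    embed_model: str
    ocr_lang: str
    idle_seconds: int
    workers: int        # 0 means "choose automatically"
    auto_update: bool
    no_ollama: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read MTA_* variables, falling back to the documented defaults."""
    return Settings(
        home=Path(env.get("MTA_HOME", "~/.memorised-them-all")).expanduser(),
        extract_model=env.get("MTA_EXTRACT_MODEL", "qwen2.5:7b"),
        embed_model=env.get("MTA_EMBED_MODEL", "nomic-embed-text"),
        ocr_lang=env.get("MTA_OCR_LANG", "eng"),
        idle_seconds=int(env.get("MTA_IDLE", "300")),
        workers=int(env.get("MTA_WORKERS", "0")),
        auto_update=env.get("MTA_AUTO_UPDATE", "on").lower() != "off",
        # Assumption: any non-empty value enables the offline switch.
        no_ollama=bool(env.get("MTA_NO_OLLAMA")),
    )
```

Taking the environment as a parameter means the same function can be exercised with a plain dict, for example `load_settings({"MTA_IDLE": "60"})`.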

Apple silicon first

  • Conversion fans out across performance cores (hw.perflevel0.physicalcpu), with each worker's native math libraries pinned to one thread to avoid oversubscription on the unified-memory architecture.
  • GPU-accelerated Whisper via Apple MLX (mlx-whisper), with a faster-whisper CPU fallback.
  • Worker count is unified-memory-aware so it won't thrash a 16 GB Mac.
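The three points above can be approximated in a few lines. In this sketch the sysctl key comes from the text, while the function names and the 2 GB-per-worker budget are assumptions made for illustration; the thread-count variables are the standard ones read by OpenMP, OpenBLAS, MKL, and Apple's Accelerate.

```python
import os
import subprocess
from typing import MutableMapping


def performance_cores() -> int:
    """Physical performance cores on Apple silicon, else all visible CPUs."""
    try:
        out = subprocess.run(
            ["sysctl", "-n", "hw.perflevel0.physicalcpu"],
            capture_output=True, text=True, check=True, timeout=5,
        ).stdout
        return max(1, int(out.strip()))
    except (OSError, subprocess.SubprocessError, ValueError):
        # Key or command unavailable (Intel Mac, Linux): portable default.
        return os.cpu_count() or 1


def worker_count(cores: int, memory_gb: float, gb_per_worker: float = 2.0) -> int:
    """Cap workers by both core count and an assumed per-worker memory budget."""
    by_memory = int(memory_gb // gb_per_worker)
    return max(1, min(cores, by_memory))


def pin_math_threads(env: MutableMapping[str, str] = os.environ) -> None:
    """Limit native math libraries to one thread per worker process.

    Must run before the libraries are imported for the setting to apply.
    """
    for var in (
        "OMP_NUM_THREADS",
        "OPENBLAS_NUM_THREADS",
        "MKL_NUM_THREADS",
        "VECLIB_MAXIMUM_THREADS",
    ):
        env[var] = "1"
```

With one math thread per worker, total thread count tracks the worker count instead of multiplying by it, which is the oversubscription the first bullet refers to.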

It runs on Intel Macs and Linux too — those paths simply use portable defaults.

Privacy

100% local. No cloud APIs, no telemetry, no API keys. Your documents, the graph, the embeddings, and the memory files never leave your machine. The only network access is (a) downloading open-source dependencies/models on install and (b) the once-a-day dependency update check (disable with MTA_AUTO_UPDATE=off).

FAQ

Does it really cost no tokens? Conversion and digestion cost zero Claude tokens. recall returns a small slice (a handful of summaries/facts), so answers are cheap — far cheaper than pasting documents into the chat.

What if I have no GPU / no models / I'm offline? It still works. The classical extractor and hashing embeddings keep the pipeline running; quality improves once Ollama and the models are available.
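"Hashing embeddings" usually refers to the feature-hashing trick: each token is hashed to a slot in a fixed-size vector, so no model is needed. The sketch below shows that general technique with a signed bag-of-words hash; the function name, dimension, and tokeniser are assumptions, and the project's own scheme may differ.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Signed feature hashing of lowercase word tokens, L2-normalised."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        index = value % dim                          # which slot the token lands in
        sign = 1.0 if (value >> 63) & 1 else -1.0    # top bit picks the sign
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def dot(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for normalised vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Texts that share words land in the same slots and so score higher under `dot`, which is enough to keep recall working offline. The trade-off is that the vectors capture word overlap only, not meaning, which fits the note that quality improves once the embedding model is available.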

Is my existing Ollama affected? No. If Ollama is already running, it's reused and left alone. Only an instance this tool starts is stopped on idle.
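The rule "only stop what this tool started" can be expressed as an ownership flag plus an idle timer. A minimal sketch, assuming injected callables for probing, starting, and stopping the server; the `ModelServer` class and its method names are hypothetical and do not describe the project's internals.

```python
import time
from typing import Callable


class ModelServer:
    """Tracks whether we started the server, and stops it only if we did."""

    def __init__(
        self,
        is_running: Callable[[], bool],
        start: Callable[[], object],
        stop: Callable[[], object],
        idle_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._is_running = is_running
        self._start = start
        self._stop = stop
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._owned = False          # True only if this object started the server
        self._last_used = clock()

    def ensure_running(self) -> None:
        """Start the server on demand and record the time of use."""
        if not self._is_running():
            self._start()
            self._owned = True
        self._last_used = self._clock()

    def stop_if_idle(self) -> bool:
        """Stop an owned server after the idle window; never touch others."""
        idle = self._clock() - self._last_used
        if self._owned and idle >= self._idle_seconds:
            self._stop()
            self._owned = False
            return True
        return False
```

A server that was already running before `ensure_running` is called never sets the ownership flag, so `stop_if_idle` leaves it alone no matter how long it sits unused.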

Where are my files? Under MTA_HOME/projects/<project>/, which holds graph.json, memory.md, memory/, and mindmap.html. export_memory copies them anywhere.

Acknowledgements

Built on the shoulders of excellent open-source work — see ACKNOWLEDGEMENTS.md. In particular: Microsoft MarkItDown, Ollama, Tesseract, OpenAI Whisper / faster-whisper / Apple MLX, NetworkX, Leiden / igraph, and Cytoscape.js. Design inspiration from graphify and the author's own markitdown-mcp and mnemo-mcp.

License

MIT © 2026 Aninda Sundar Howlader (GRU-953).
