mneme - CLI tool that turns documents into a searchable second brain. Ingest once, query forever.
A CLI tool that turns your documents into a searchable second brain. Drop files in, get a structured knowledge layer out -- browsable by humans in Obsidian, queryable by machines in under 5ms.
pip install mneme-cli
mneme new ~/projects/my-research --name "My Research" --client acme-corp
cd ~/projects/my-research
mneme ingest proposal.pdf acme-corp
mneme search "delivery timeline"
One installed mneme CLI can serve many independent workspaces. Switch between them by cd-ing, exporting MNEME_HOME, or passing --workspace /path/to/ws.
That's it. Your knowledge compounds instead of decaying.
Why
You're building a medical device. You have a risk analysis in a PDF, user needs in a spreadsheet, meeting notes in markdown, and 47 requirements in a CSV. An auditor asks "show me the trace from hazard HAZ-001 to the test that verifies its mitigation." You spend two hours searching folders.
Mneme fixes this:
# Import everything
mneme ingest risk-analysis.pdf cardio-monitor
mneme ingest-csv user-needs.csv cardio-monitor --mapping user-needs
mneme ingest-csv risk-register.csv cardio-monitor --mapping risk-register
# Answer the auditor in 2 seconds
mneme trace show cardio-monitor/haz-001 --direction forward
# haz-001 (Electrical Shock)
# mitigated-by -> rma-003 (Insulation Barrier)
# implemented-by -> req-007 (Double Insulation)
# verified-by -> test-042 (Dielectric Strength Test)
# Find gaps before the auditor does
mneme trace gaps cardio-monitor
# Requirements with no verification: req-011, req-023
# Hazards with no mitigation: haz-009
Every document ingested once. Every trace link tracked. Every vocabulary term harmonized. Every gap found automatically.
No databases. No servers. No infrastructure. Plain markdown files + JSON schemas that any system can read.
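To make the trace idea concrete, here is a minimal Python sketch of a forward walk and a gap check over link triples. The in-memory `LINKS` and `NODES` structures and the function names are illustrative assumptions; they are not mneme's on-disk format or its implementation.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str, str]  # (source id, relation, target id)

# Hypothetical data mirroring the example output above.
LINKS: List[Link] = [
    ("haz-001", "mitigated-by", "rma-003"),
    ("rma-003", "implemented-by", "req-007"),
    ("req-007", "verified-by", "test-042"),
]
NODES: Dict[str, str] = {
    "haz-001": "hazard",
    "haz-009": "hazard",
    "rma-003": "mitigation",
    "req-007": "requirement",
    "req-011": "requirement",
    "test-042": "test",
}


def forward_trace(start: str, links: List[Link]) -> List[Tuple[int, str, str]]:
    """Walk outgoing links depth-first; return (depth, relation, target) rows."""
    outgoing: Dict[str, List[Tuple[str, str]]] = {}
    for src, rel, dst in links:
        outgoing.setdefault(src, []).append((rel, dst))
    rows, stack, seen = [], [(start, 0)], {start}
    while stack:
        node, depth = stack.pop()
        for rel, dst in outgoing.get(node, []):
            if dst not in seen:  # guard against cycles
                seen.add(dst)
                rows.append((depth + 1, rel, dst))
                stack.append((dst, depth + 1))
    return rows


def find_gaps(nodes: Dict[str, str], links: List[Link],
              node_type: str, required_relation: str) -> List[str]:
    """Return ids of node_type that have no outgoing required_relation link."""
    covered = {src for src, rel, _ in links if rel == required_relation}
    return sorted(n for n, t in nodes.items()
                  if t == node_type and n not in covered)
```

Calling `find_gaps(NODES, LINKS, "hazard", "mitigated-by")` on this sample data reports the hazard that has no mitigation link, which is the kind of answer `mneme trace gaps` is described as giving.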
Install
pip install mneme-cli
Or from source:
git clone https://github.com/tolism/mneme.git
cd mneme
pip install -e .
You now have the mneme command globally. Verify with mneme --help.
Optional: For PDF support, pip install "mneme-cli[pdf]". For everything, pip install "mneme-cli[all]".
Requirements: Python 3.9+. Works on macOS, Linux, Windows.
Quick Start
# Scaffold a new workspace (from anywhere)
mneme new ~/projects/my-project --name "My Project" --client client-a
cd ~/projects/my-project
# Ingest some documents
mneme ingest report.pdf client-a
mneme ingest meeting-notes.md client-a
# Search across everything
mneme search "quarterly budget"
# Check health
mneme stats
# Launch the web dashboard
python -m mneme.server # http://localhost:3141
Run mneme against any workspace
mneme --workspace ~/projects/parkiwatch stats # one-shot
export MNEME_HOME=~/projects/parkiwatch # sticky for the shell
mneme stats
One installed CLI serves many projects — each workspace is just a directory.
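The three ways of selecting a workspace can be pictured as a simple lookup chain. The sketch below assumes the precedence is `--workspace` first, then `MNEME_HOME`, then the current directory; that ordering and the function name `resolve_workspace` are assumptions, not something the README states.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_workspace(flag: Optional[str],
                      env: Mapping[str, str],
                      cwd: Path) -> Path:
    """Pick a workspace directory (assumed order: flag, MNEME_HOME, cwd)."""
    if flag:
        return Path(flag).expanduser()
    home = env.get("MNEME_HOME")
    if home:
        return Path(home).expanduser()
    return cwd
```

In a real CLI this would be called with the parsed flag value, `os.environ`, and `Path.cwd()`.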
CLI
| Command | What It Does |
|---|---|
| mneme new <dir> | Scaffold a new workspace from the bundled template |
| mneme init | Scaffold a workspace in cwd (legacy) |
| mneme --workspace <dir> | Run any command against a specific workspace |
| mneme ingest <file> <client> | Ingest a source document |
| mneme resync <file> <client> | Re-ingest an updated source via 3-way merge, preserving hand edits |
| mneme resync-resolve <client/page> | Finalize a conflicted resync after editing out markers |
| mneme search "<query>" | Search across all layers |
| mneme draft --doc-type <t> --section <s> --client <c> | Build a write packet for an LLM agent to produce one section |
| mneme validate writing-style <page> | Build a review packet for an LLM agent to grade a page |
| mneme tags suggest <page> | Build a tag packet for an LLM agent to choose tags |
| mneme tags apply <page> --add t1,t2 --remove t3 | Atomic tag update (frontmatter + schema + search index) |
| mneme tags bulk-suggest --client X --filter req- --limit 50 | Build one bulk packet covering many pages |
| mneme tags bulk-apply response.json | Apply tag changes from an agent JSON response |
| mneme entity suggest --client X | Build an entity-classification packet for an LLM agent |
| mneme entity apply --id <id> --type <type> | Set one entity's type atomically |
| mneme entity bulk-apply classifications.json | Bulk classify many entities |
| mneme home --client X / --all-clients | Generate a HOME.md navigation hub (Dataview + fallback) |
| mneme ingest-dir --recursive --preserve-structure | Mirror source directory hierarchy into the wiki |
| mneme agent plan --goal "..." --doc-type <t> --client <c> | Generate a deterministic TODO plan from the active profile |
| mneme agent next-task | Return the next ready task in the active plan |
| mneme agent task-done <id> | Mark a task as done |
| mneme sync | Sync wiki pages to FTS5 search index |
| mneme reindex | Rebuild search index from wiki pages |
| mneme drift | Detect layer desynchronization |
| mneme stats | Health overview |
| mneme repair | Fix corrupted archives |
Formats: .md, .txt, .pdf (with pip install "mneme-cli[pdf]"), .xlsx (with pip install "mneme-cli[xlsx]"). CSV files go through mneme ingest-csv.
For LLM agents
If you are an LLM agent driving mneme on a user's behalf — read AGENTS.md first. It is the canonical contract for the agent loop, the standard task templates (DVR, CER, risk file, resync, migration, pre-submission), the sub-agent spawning patterns, and the hard rules you must never violate.
The 30-second version of the agent loop:
# 1. Generate a plan from the active profile
mneme agent plan --goal "Produce a Design Validation Report" \
--doc-type design-validation-report \
--client tda
# 2. Walk the plan one task at a time
mneme agent next-task # returns a self-contained task envelope
# (do the work the envelope describes -- usually `mneme draft` or
# `mneme validate writing-style`, then write or grade prose)
mneme agent task-done section-context
# 3. Repeat until done
mneme agent next-task
# ...
# 4. Inspect progress at any time
mneme agent show
mneme agent list
Mneme generates the plan deterministically from the active profile's section_notes. Tasks have a dependency graph; next-task only returns ones whose dependencies are satisfied. The plan and per-task state are persisted under <workspace>/.mneme/agent-plans/ (gitignored). Mneme does not call any LLM — you (the agent) do the writing. Mneme assembles the contracts.
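The "only return tasks whose dependencies are satisfied" rule can be sketched in a few lines of Python. The plan shape (a dict of task id to `deps` and `status`), the task id `section-results`, and the helper names are hypothetical; the files mneme actually keeps under `.mneme/agent-plans/` may look quite different.

```python
from typing import Dict, Optional


def example_plan() -> Dict[str, dict]:
    """A tiny hypothetical plan: two sections, then assemble, then review."""
    return {
        "section-context": {"deps": [], "status": "pending"},
        "section-results": {"deps": [], "status": "pending"},
        "assemble": {"deps": ["section-context", "section-results"],
                     "status": "pending"},
        "review-page": {"deps": ["assemble"], "status": "pending"},
    }


def next_task(plan: Dict[str, dict]) -> Optional[str]:
    """Return the first pending task whose dependencies are all done."""
    for task_id, task in plan.items():
        if task["status"] != "pending":
            continue
        if all(plan[dep]["status"] == "done" for dep in task["deps"]):
            return task_id
    return None  # nothing ready: plan finished or blocked


def task_done(plan: Dict[str, dict], task_id: str) -> None:
    """Mark one task as done so its dependents can become ready."""
    plan[task_id]["status"] = "done"
```

Because Python dicts preserve insertion order, the same plan always yields tasks in the same sequence, which is one simple way to get the deterministic behaviour described above.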
End-to-end example: from raw documents to a tagged, searchable, validated knowledge base
A realistic walkthrough showing how the human, the CLI, and the LLM agent collaborate. Suppose you're building a knowledge base for Parkiwatch, a medical device for Parkinson's monitoring.
Step 1 — Scaffold a workspace (human, one-time)
mneme new ~/projects/parkiwatch --name Parkiwatch --client parkiwatch --profile eu-mdr
cd ~/projects/parkiwatch
Creates the workspace tree, sets the EU MDR writing-style profile, and initializes empty schema files.
Step 2 — Ingest source material (human)
# Drop a folder of source documents into inbox/, then bulk-process
cp -r ~/Downloads/parkinson-research/* inbox/
mneme tornado --client parkiwatch
# Or ingest individual files
mneme ingest research-paper.pdf parkiwatch
mneme ingest-csv risk-register.csv parkiwatch --mapping risk-register
mneme ingest spec-table.xlsx parkiwatch # .xlsx renders sheets as markdown tables
mneme ingest-dir docs/ parkiwatch --recursive # walk subdirectories
What happens per ingest: source file → wiki page in wiki/parkiwatch/ → frontmatter with auto-extracted entities → entry in index.md → row in the FTS5 search DB → log entry.
Step 3 — Tag the new pages (LLM agent)
The new pages have only the auto-applied parkiwatch client tag. The agent now adds meaningful tags:
# For each new page, the agent runs:
mneme tags suggest parkiwatch/research-paper > /tmp/packet.md
The packet contains the page body, the current tag taxonomy (every tag in the workspace + usage counts), and a ready-to-paste prompt. The LLM reads the packet — it understands the content and decides on tags, preferring existing taxonomy entries when they fit. The LLM's response is JSON:
{"tags": ["clinical-trial", "iso-13485"], "new_tags": ["bradykinesia-detection"]}
The agent then runs:
mneme tags apply parkiwatch/research-paper \
--add clinical-trial,iso-13485,bradykinesia-detection
Atomic operation: rewrites the wiki page frontmatter, updates schema/tags.json, re-indexes the page in FTS5 (so search picks up the new tags immediately), appends a log entry. Repeat for every page — the taxonomy grows, and subsequent pages tend to reuse existing tags (consistency).
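An agent wrapper might turn the LLM's JSON reply into the `mneme tags apply` invocation like this. The helper name `tags_apply_argv` is hypothetical; the JSON keys (`tags`, `new_tags`) and the `--add` flag are the ones shown above.

```python
import json
from typing import List


def tags_apply_argv(page: str, response_text: str) -> List[str]:
    """Build an argv list for `mneme tags apply` from the agent's JSON reply."""
    reply = json.loads(response_text)
    # Existing taxonomy tags first, then new ones; drop duplicates, keep order.
    tags = list(dict.fromkeys(reply.get("tags", []) + reply.get("new_tags", [])))
    if not tags:
        raise ValueError("agent reply contained no tags")
    return ["mneme", "tags", "apply", page, "--add", ",".join(tags)]
```

Passing the resulting list to `subprocess.run(argv, check=True)` avoids shell quoting issues and surfaces a non-zero exit code as an exception.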
Step 4 — Search the knowledge base (anyone)
mneme search "bradykinesia" # BM25 + Porter stemming
mneme search "clinical evaluation" --client parkiwatch # client-scoped
Sub-millisecond. Returns the page title, snippet (with <b>highlights</b>), tags, and BM25 score.
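For readers unfamiliar with FTS5, the following sketch shows BM25 ranking, Porter stemming, and `<b>` snippet highlighting using only Python's `sqlite3` module. The table name `pages`, its columns, and the sample rows are assumptions for illustration; mneme's real index schema is not shown in this README. It also assumes the Python build's SQLite has FTS5 enabled.

```python
import sqlite3

# Hypothetical sample pages: (title, body).
PAGES = [
    ("Tremor study", "The wearable monitors tremor amplitude during sleep."),
    ("Budget", "Quarterly budget review for the delivery timeline."),
    ("Validation", "Patients were monitored for bradykinesia over six weeks."),
]


def build_index(pages):
    """Create an in-memory FTS5 table with the Porter stemming tokenizer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE pages USING fts5(title, body, tokenize='porter')"
    )
    conn.executemany("INSERT INTO pages (title, body) VALUES (?, ?)", pages)
    return conn


def search(conn, query, limit=5):
    """Return (title, highlighted snippet, bm25 score) rows, best match first."""
    sql = """
        SELECT title,
               snippet(pages, 1, '<b>', '</b>', '...', 8),
               bm25(pages)
        FROM pages
        WHERE pages MATCH ?
        ORDER BY rank
        LIMIT ?
    """
    return conn.execute(sql, (query, limit)).fetchall()
```

Because of stemming, a query for "monitoring" matches pages containing "monitors" or "monitored". FTS5's `bm25()` returns lower values for better matches, so ordering by `rank` puts the most relevant rows first.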
Step 5 — Produce a regulatory deliverable (LLM agent driving the agent loop)
# Generate a deterministic plan from the active profile
mneme agent plan --goal "produce a Design Validation Report" \
--doc-type design-validation-report \
--client parkiwatch
# → 15 tasks: 11 section drafts + assemble + harmonize + review + submission-check
# Walk the plan
mneme agent next-task
# → Task: section-purpose-and-scope
# next_command: mneme draft --doc-type design-validation-report \
# --section purpose-and-scope --client parkiwatch
mneme draft --doc-type design-validation-report \
--section purpose-and-scope --client parkiwatch \
--query "purpose scope intended use" \
--out /tmp/write-packet.md
# The LLM reads /tmp/write-packet.md (which includes wiki search hits as evidence,
# the profile's writing-style rules, and a write prompt) and produces the section.
# The agent writes the section to wiki/parkiwatch/design-validation-report.md.
mneme agent task-done section-purpose-and-scope
# ... repeat for each section ...
# After all sections drafted:
mneme harmonize --client parkiwatch --fix # mechanical vocabulary swap
mneme validate writing-style parkiwatch/design-validation-report > /tmp/review.md
# The LLM reads /tmp/review.md, critiques every section, applies fixes in place
mneme agent task-done review-page
# Submission readiness
mneme validate consistency --client parkiwatch # cross-doc version checks
mneme trace gaps parkiwatch # find broken trace chains
mneme trace matrix parkiwatch --csv --out trace-matrix.csv # for the DHF
mneme snapshot parkiwatch # versioned audit zip
Who does what
| Layer | Responsibility |
|---|---|
| Human | Drops sources, runs commands, reviews diffs, ships the deliverable |
| mneme CLI | Deterministic infrastructure: parses files, builds packets, indexes, traces, harmonizes vocabulary, generates plans, atomic state updates |
| LLM agent | All reasoning: classifying entities, choosing tags, drafting prose, grading writing style, deciding when a chain is complete |
mneme never calls an LLM. The LLM never bypasses mneme's atomic operations. They meet at the packet boundary.
How It Works
Your Document
     |
     v
mneme ingest
     |
     +---> Wiki Layer (markdown, Obsidian-compatible)
     |       Frontmatter, citations, [[wikilinks]]
     |       You read and browse here
     |
     +---> Search Index (SQLite FTS5)
     |       BM25 ranking, Porter stemming
     |       Sub-millisecond queries, zero dependencies
     |
     +---> Schema Layer (JSON)
             entities.json - people, companies, products
             graph.json    - relationships between entities
             tags.json     - taxonomy
Every mneme ingest writes the wiki page and updates the search index atomically. mneme drift catches desync. mneme reindex rebuilds the index from wiki pages.
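The README does not describe how drift is detected. One plausible approach, sketched here purely as an assumption, is to record a content digest for each page at index time and compare it with the current wiki content. The function names and the dict-based inputs are hypothetical.

```python
import hashlib
from typing import Dict, List


def digest(text: str) -> str:
    """SHA-256 of a page body, used as a cheap change detector."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_drift(wiki_pages: Dict[str, str],
                 indexed: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare current page bodies with digests recorded at index time.

    wiki_pages maps page id -> current body; indexed maps page id -> digest.
    """
    return {
        "missing_from_index": sorted(set(wiki_pages) - set(indexed)),
        "orphaned_in_index": sorted(set(indexed) - set(wiki_pages)),
        "stale": sorted(
            page for page, body in wiki_pages.items()
            if page in indexed and indexed[page] != digest(body)
        ),
    }
```

An empty result in all three lists would mean the layers agree; anything else is a candidate for a rebuild with `mneme reindex`.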
Zero external dependencies for search. Search runs on Python's stdlib sqlite3 module, and FTS5 is compiled into the SQLite that ships with most Python builds: no install, no API key, no separate service.
Obsidian Integration
A mneme workspace is an Obsidian vault. The wiki pages use YAML frontmatter and [[wikilinks]], so Obsidian indexes everything natively.
Open a workspace as a vault:
- Open Obsidian → Open folder as vault → select your workspace directory (e.g. ~/projects/parkiwatch)
- Obsidian creates .obsidian/ inside the workspace on first open; this is safe and mneme ignores it
- Browse wiki/ in the file explorer; click any page to render with backlinks, graph view, and tag search
Recommended Obsidian settings:
- Files & Links → Default location for new notes: wiki/{default-client}/
- Files & Links → New link format: Relative path to file
- Files & Links → Use [[Wikilinks]]: ON
- Files & Links → Detect all file extensions: OFF (keeps sources/archive out of the graph)
Useful community plugins:
| Plugin | Why |
|---|---|
| Dataview | Query frontmatter: list all pages with type: hazard, confidence: low, etc. |
| Templater | Paste mneme page frontmatter from a snippet |
| Tag Wrangler | Visualise the same tags mneme tracks in schema/tags.json |
| Graph Analysis | See the entity relationships mneme builds in schema/graph.json |
Workflow:
# Ingest new docs from the CLI
mneme ingest meeting.pdf parkiwatch
# Obsidian auto-detects the new wiki page
# Read, link, and annotate in Obsidian
# mneme lint catches dead links on your next run
mneme lint
Sync the workspace via Dropbox, iCloud, or git and you have multi-device Obsidian + mneme.
Profiles (and custom profiles)
A profile defines the vocabulary and document structure rules for a regulatory framework. mneme ships two bundled profiles:
| Profile | Use when |
|---|---|
| eu-mdr | EU Medical Device Regulation (2017/745) -- 15 vocabulary rules, 6 section templates |
| iso-13485 | ISO 13485:2016 QMS -- 13 vocabulary rules, 6 section templates |
Activate one in any workspace with mneme profile set eu-mdr. From then on, mneme harmonize enforces vocabulary, mneme validate writing-style builds an LLM review packet for prose, and mneme validate consistency checks cross-document standard versions.
Adding your own profile
Profiles are just markdown files (with YAML frontmatter) in <workspace>/profiles/. No reinstall, no rebuild, no PR to mneme. Drop a file in, activate it, you're done.
# 1. mneme new already creates the profiles/ folder for you
mneme new ~/projects/parkiwatch --name Parkiwatch --client parkiwatch
cd ~/projects/parkiwatch
# 2. Drop your profile in (use any text editor or this heredoc).
# Profiles are markdown with YAML frontmatter.
cat > profiles/parkiwatch-qms.md <<'EOF'
---
name: Parkiwatch QMS
description: Internal quality framework for the Parkiwatch product line
version: 1.0
tone: formal
voice: passive-for-procedures
trace_types: [derived-from, implemented-by, verified-by]
requirement_levels:
shall: mandatory
should: recommended
vocabulary:
- use: parking violation
reject: [parking ticket, infraction]
- use: enforcement officer
reject: [meter maid, warden]
---
# Principles
- Be specific. Cite the policy clause.
- Auditable: every claim must trace to a controlled record.
# Terminology
| Use | Instead of | Why |
|---|---|---|
| parking violation | parking ticket, infraction | Internal Parkiwatch convention. |
# Document Type: incident-report
Standard parking incident structure used by all enforcement officers.
## Section: evidence
Photo evidence with timestamp and GPS coordinates is mandatory.
EOF
# 3. Activate and verify
mneme profile set parkiwatch-qms
mneme profile show
# Active profile: Parkiwatch QMS
# 4. Use it
mneme harmonize parkiwatch # flag "parking ticket" -> should be "parking violation"
mneme harmonize parkiwatch --fix # auto-fix vocabulary
mneme validate writing-style parkiwatch/incident-001 > review.md # paste into Claude
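The mechanical vocabulary swap can be approximated with word-boundary regular expressions. This is a minimal sketch assuming rules shaped like the `vocabulary` entries in the frontmatter above (`use` plus a `reject` list); the `harmonize` function here is illustrative and is not mneme's implementation.

```python
import re
from typing import Dict, List, Tuple

# Rules copied from the example profile's frontmatter.
VOCABULARY: List[Dict] = [
    {"use": "parking violation", "reject": ["parking ticket", "infraction"]},
    {"use": "enforcement officer", "reject": ["meter maid", "warden"]},
]


def harmonize(text: str, vocabulary: List[Dict]) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (fixed_text, findings); findings are (rejected, preferred) pairs."""
    findings: List[Tuple[str, str]] = []
    for rule in vocabulary:
        preferred = rule["use"]
        for rejected in rule["reject"]:
            # Whole-phrase, case-insensitive match.
            pattern = re.compile(r"\b" + re.escape(rejected) + r"\b", re.IGNORECASE)
            if pattern.search(text):
                findings.append((rejected, preferred))
                # A function replacement avoids backslash handling in `preferred`.
                text = pattern.sub(lambda _m, p=preferred: p, text)
    return text, findings
```

Reporting only `findings` corresponds to the flag-only run, while returning the rewritten text corresponds to `--fix`. Note that this sketch does not preserve the capitalisation of the replaced phrase.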
How resolution works
When you run mneme profile set <name>, mneme looks in two places, in order:
- First: <workspace>/profiles/<name>.md (your local profile)
- Then: <installed-mneme>/profiles/<name>.md (the bundled eu-mdr / iso-13485)
The first one wins. So you can:
- Add a brand-new framework mneme doesn't ship -- just give it a unique name (e.g. parkiwatch-qms.md, acme-internal.md)
- Override a bundled framework with project-specific tweaks -- create your own eu-mdr.md in the workspace and it shadows the bundled one for that project only
The same shadowing rule applies to CSV column mappings under <workspace>/profiles/mappings/, used by mneme ingest-csv. Mappings are still JSON because they are programmatic, not prose.
If neither file exists, you get a clear error listing both paths it checked.
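The lookup order described in this section can be sketched as follows. The function `resolve_profile`, its parameters, and the `demo` helper are hypothetical names used for illustration; only the two-step order and the error listing both paths come from the text above.

```python
import tempfile
from pathlib import Path


def resolve_profile(name: str, workspace: Path, bundled_dir: Path) -> Path:
    """Return the workspace profile if present, else the bundled one."""
    candidates = [
        workspace / "profiles" / f"{name}.md",  # local profile wins
        bundled_dir / f"{name}.md",             # bundled fallback
    ]
    for path in candidates:
        if path.is_file():
            return path
    checked = "\n".join(f"  - {p}" for p in candidates)
    raise FileNotFoundError(f"profile {name!r} not found; checked:\n{checked}")


def demo() -> list:
    """Build a throwaway workspace that shadows one bundled profile."""
    with tempfile.TemporaryDirectory() as tmp:
        workspace, bundled = Path(tmp) / "ws", Path(tmp) / "bundled"
        (workspace / "profiles").mkdir(parents=True)
        bundled.mkdir()
        (bundled / "eu-mdr.md").write_text("bundled")
        (bundled / "iso-13485.md").write_text("bundled")
        (workspace / "profiles" / "eu-mdr.md").write_text("local override")
        return [
            resolve_profile("eu-mdr", workspace, bundled).read_text(),
            resolve_profile("iso-13485", workspace, bundled).read_text(),
        ]
```

In `demo()`, the workspace copy of eu-mdr.md shadows the bundled one, while iso-13485 falls through to the bundled directory.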
What goes into a profile
A profile is a markdown file with YAML frontmatter. The frontmatter carries the structured fields (vocabulary, trace_types, tone, etc.) and the body carries the writing-style prose under recognized H1 headings.
| Frontmatter field | What it does | Used by |
|---|---|---|
| name, description, version | Display metadata | mneme profile show |
| vocabulary[].use / .reject[] | Terminology swaps | mneme harmonize (mechanical) |
| requirement_levels | Reserved words (shall, should, may) | Documentation |
| trace_types | Allowed relationship types for trace links | Documentation |
| tone, voice, citation_style | Style hints | mneme profile show |
| placeholder_for_missing_refs | Marker token (e.g. [TO ADD REF]) | LLM agent |
| Body H1 heading | What it becomes |
|---|---|
| # Principles | Top-level principles (bullets) |
| # General Rules | Cross-cutting writing rules (bullets) |
| # Terminology | A 3-column markdown table: Use / Instead of / Why |
| # Framing: <context> | One worked example: Wrong: / Correct: / Why: blocks |
| # Document Type: <slug> | A document type description; nested ## Section: <slug> blocks become per-section guidance |
| # Submission Checklist | Pre-submission go/no-go items (bullets) |
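To illustrate the frontmatter-plus-H1 layout, here is a rough stdlib-only sketch that separates the raw frontmatter from the body and groups the body by H1 heading. It is an assumption about how such a file could be read, not mneme's parser: it leaves the YAML unparsed and would mis-split a body that contains `# ` lines inside code samples.

```python
import re
from typing import Dict, Tuple

# Hypothetical miniature profile used only for this example.
SAMPLE = """---
name: Demo
tone: formal
---
# Principles
- Be specific.
# Document Type: incident-report
Standard structure.
## Section: evidence
Photo evidence is mandatory.
"""


def split_profile(text: str) -> Tuple[str, Dict[str, str]]:
    """Return (raw_frontmatter, {h1_heading: section_body})."""
    match = re.match(r"^---\n(.*?)\n---\n?(.*)$", text, re.DOTALL)
    if not match:
        raise ValueError("profile must start with a '---' frontmatter block")
    frontmatter, body = match.group(1), match.group(2)
    sections: Dict[str, str] = {}
    current = None
    for line in body.splitlines():
        if line.startswith("# "):       # H1 opens a new section
            current = line[2:].strip()
            sections[current] = ""
        elif current is not None:       # H2 and prose stay inside the H1 section
            sections[current] += line + "\n"
    return frontmatter, {k: v.strip() for k, v in sections.items()}
```

Nested `## Section: <slug>` lines stay inside their parent `# Document Type:` section, matching the nesting described in the table.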
Important: profiles do NOT enforce a list of required headings. Mechanical heading checks were removed because they don't reflect what regulatory reviewers actually care about. Instead, use mneme validate writing-style <page> to build a review packet that an LLM agent grades against the full style guide.
See EXAMPLES.md Example 13 for a full walkthrough with a real Parkiwatch scenario. The bundled eu-mdr.md and iso-13485.md profiles inside the installed package are good starting templates -- copy one and edit it.
Web Dashboard
python -m mneme.server -- opens at http://localhost:3141
- Dashboard -- stats, per-client counts, activity log
- Search -- dual-layer results with source attribution
- Wiki -- browse all pages with rendered markdown
- Entities -- filterable table of extracted entities
- Health -- drift status, sync state
When You Need This
| Scale | Search performance |
|---|---|
| 5 docs | Sub-millisecond |
| 50 docs | Sub-millisecond |
| 500 docs | Sub-millisecond, BM25 ranked |
| 5,000 docs | A few ms, still ranked by relevance |
| 50,000 docs | Tens of ms |
SQLite FTS5 scales without tuning: as the table shows, query latency grows slowly with corpus size and results stay ranked by relevance.
Project Structure
mneme/
  sources/     Raw documents (immutable, never modified)
  wiki/        Markdown knowledge pages (Obsidian-compatible)
  schema/      entities.json, graph.json, tags.json
  search.db    SQLite FTS5 search index
  core.py      Engine (ingest, search, sync, drift, repair)
  config.py    Configuration
  server.py    Web dashboard
  index.md     Master page catalog
  log.md       Activity timeline
Downstream Use
Mneme outputs plain files -- markdown and JSON. Any system can read them. The CLI is designed to be called programmatically by other applications.
Next up: Mneme as the knowledge backend for a QMS (Quality Management System) -- quality documentation, audit trails, compliance evidence, all searchable.
Releasing (maintainers)
Mneme ships to PyPI as mneme-cli. To cut a new release:
# 1. Bump the version in mneme/__init__.py and pyproject.toml
# 2. Install release tooling
pip install -e ".[release]"
# 3. Dry run to TestPyPI first
scripts/release.sh test # bash (macOS/Linux/WSL)
scripts\release.ps1 test # PowerShell (Windows)
pip install --index-url https://test.pypi.org/simple/ \
--extra-index-url https://pypi.org/simple/ mneme-cli
# 4. Production
scripts/release.sh prod # bash
scripts\release.ps1 prod # PowerShell
The script cleans dist/, runs python -m build, validates with twine check, and uploads.
You'll need a PyPI API token in ~/.pypirc:
[distutils]
index-servers =
pypi
testpypi
[pypi]
username = __token__
password = pypi-AgEI... # from https://pypi.org/manage/account/token/
[testpypi]
repository = https://test.pypi.org/legacy/
username = __token__
password = pypi-AgENd... # from https://test.pypi.org/manage/account/token/
Credits
This project builds on two foundational ideas and one earlier implementation:
- LLM Wiki pattern by Andrej Karpathy -- the insight that LLMs should build and maintain a persistent, compounding wiki instead of re-deriving answers from raw documents on every query
- SQLite FTS5 -- the full-text search extension of SQLite, the world's most-deployed embedded database, with built-in BM25 ranking
- Original implementation -- tashisleepy/knowledge-engine -- the first version that fused both patterns into a dual-layer bridge
License
MIT