Local-first Markdown + SQLite memory for LLM agents, with keyword search, optional embeddings, MCP, and bounded citation tools.

Vault-for-LLM

Local-first, production-minded memory workflows for LLM agents.

Vault-for-LLM turns Markdown project knowledge into a portable SQLite memory vault that agents can search on demand. It is built for the boring parts that make agent memory usable in real projects: retrieval QA, bounded document reads, semantic search, schema migrations, and verified backup/restore.

Why this exists

LLM agents are powerful, but most of them forget the things that matter between sessions: project decisions, repeated mistakes, user preferences, debugging history, and hard-won operational knowledge.

Vault-for-LLM gives an agent a simple local memory layer:

You write knowledge as Markdown.
vault compile stores it in a local SQLite database.
Agents search it only when needed, instead of stuffing everything into every prompt.
MCP-compatible agents can query the vault during a conversation.

The goal is not to replace your notes app or become another hosted vector database. The goal is to make your project knowledge usable, measurable, and recoverable by agents.

What makes it different

Vault-for-LLM is not just another vector store. It is evolving into an agent memory QA layer:

Can the agent find the right memory when it needs it?
Can it read only the relevant section instead of dumping whole documents into context?
Can it tell whether a knowledge entry is complete, stale, duplicated, or under-specified?
Can teams measure search quality before and after changing retrieval logic?
Can reusable agent workflows be shared as skills instead of rediscovered in every project?

In other words: regular RAG focuses on retrieval; Vault-for-LLM focuses on whether memory can be used correctly by agents.

The roadmap is bigger than one agent runtime. Vault-for-LLM is the memory core; agent systems such as Hermes Agent, OpenClaw, Claude Code, Codex, n8n, robots, cars, or smart-home agents are the hands; and models are the compute tools. See the vision document.

It is also designed around the user, not only around documents. A Vault can keep project knowledge, but it can also support a reviewed user profile: durable work preferences, communication boundaries, recent context summaries, and long-term patterns that help agents collaborate without starting from zero every session. Those profile memories should stay governed: raw private interactions stay private, while trusted agents may share short reviewed summaries.

For more automated setups, a user can run one or two dedicated memory agents:

Profile agent — maintains the user's stable profile, preferences, care summaries, and agent-specific boundaries without exposing raw private chats.
Dream / forgetting agent — runs periodic memory cleanup, deduplicates, marks stale memories, suggests promotion or archival, and helps the database forget low-value or expired context.

This makes Vault useful beyond today's chat agents. The same governed memory shape can later support embodied agents, long-running assistants, or world-model workflows that need user context, project state, source-grounded knowledge, and safe forgetting in one inspectable local-first layer.

The design principle is Progressive Memory Disclosure: agents first see a small safe summary, then topic maps, search candidates, bounded source ranges, and only then raw or archived memory when the task and permissions justify it. That is how the vault can stay useful as it grows into a lifelong memory store.

For a broader positioning against Mem0, Letta/MemGPT, Zep, and LangGraph memory, see the memory system comparison. The short version: Vault-for-LLM optimizes for local, inspectable, candidate-first project memory with retrieval QA and bounded citations; hosted or runtime-native memory systems may be better when you need managed personalization, a full stateful-agent runtime, or enterprise temporal graph infrastructure.

For adjacent retrieval and context-budget systems, see the PageIndex and Headroom comparison. The short version: Vault can borrow PageIndex-style document tree navigation and Headroom-style context budgeting while keeping core project memory local, governed, and source-cited.

To see the positioning as local numbers rather than slogans, run the project memory proof demos: agent onboarding recall, candidate-first review, and stale-source bounded-read checks. To compare exported Hermes/Codex-style sessions against governed Vault memory, use the agent onboarding benchmark.

Core principles

Local by default — SQLite is the source of truth. No cloud is required for core usage.
Works without embeddings — keyword search works first; semantic search is optional.
Agent-oriented memory — split always-needed facts from searchable deep knowledge.
Governed reads — shared agents can pass agent_id and sensitivity caps so private/restricted memory is filtered before bounded reads.
Bounded retrieval — Document Map tools help agents read the right section instead of dumping entire files into context.
Optional sync — Supabase support is an optional sync/read target, not required infrastructure.
CLI-first — this is a developer-facing tool. Core local usage is stable; advanced QA, semantic, and sync workflows still evolve.

Works across agent systems

Vault-for-LLM is not tied to one agent runtime. The shared contract is simple: local Markdown + SQLite, exposed through CLI and optional stdio MCP.

System	How to use Vault-for-LLM
Hermes Agent / Nancy	Configure `vault-mcp` for search/read/propose tools; run CLI jobs for dream reports, backups, and onboarding benchmarks.
OpenClaw	Use the bundled adapter in `integrations/openclaw/` to register `vault_search`, `vault_read_range`, `vault_memory_propose`, and `vault_stats`; generic MCP also works.
n8n	Call the `vault` CLI from Execute Command nodes, wrap it behind an internal HTTP service, or bridge to MCP for workflow automation.
Codex	Use the CLI inside the repo/workspace; use MCP on Codex surfaces that support local MCP servers.
OpenCode	Use the same generic local MCP pattern as Claude Code/Codex when MCP is available, or shell out to the CLI.
Claude Code	Configure `vault-mcp` as a local stdio MCP server, or use CLI commands in shell-capable sessions.
Any MCP-compatible agent	Run `vault-mcp --project-dir <project>` and follow `vault_search` → `vault_read_range` → answer with sources.

See Agent Integrations for setup patterns, OpenClaw adapter details, and runtime-specific notes.

Agent-facing install contract

Many Vault-for-LLM installs are performed by agents rather than by humans. For agent-driven setup or repo changes, use:

AGENTS.md — concise operating rules for coding agents.
agent_manifest.json — machine-readable install, scope, safety, runtime, and validation metadata.
docs/agent_install.md — short install runbook for Hermes, Codex, Claude Code, OpenClaw, OpenCode, n8n, and other agents.

Human users do not need to install everything manually. You can ask your agent:

Install Vault-for-LLM for this project. Read AGENTS.md and agent_manifest.json,
ask me whether the vault should be shared or private, ask which optional
features to enable, ask whether to install selected optional dependencies now,
ask whether semantic search should download a local ONNX embedding model, ask
whether I have an existing Obsidian vault to import, configure CLI/MCP, run the
first Obsidian import if requested, ask whether I want automatic Obsidian sync,
ask whether I want Profile / Dream / Forgetting memory-agent guidance,
and run a search/read/propose smoke test.

Agents should read those files before choosing a database scope, configuring MCP, installing optional features, or writing memory.

The common install architecture is the same across Hermes Agent, Codex, OpenCode, Claude Code, OpenClaw, and other MCP-capable agents:

choose projectDir -> choose optional features -> ask about Obsidian -> install vault -> configure CLI/MCP -> first import/sync check -> verify search/read/propose

Runtime-specific adapters should stay thin. The durable contract is the shared projectDir, vault CLI, vault-mcp, and candidate-first memory policy.

Agent installers should also ask about optional capabilities instead of enabling everything by default:

Feature	Default	Install command	Ask when
`core`	yes	`python -m pip install vault-for-llm==0.6.33`	Always: local Markdown, SQLite, keyword search.
`mcp`	yes for MCP-capable agents	`python -m pip install "vault-for-llm[mcp]==0.6.33"`	The runtime can connect local stdio MCP tools.
`obsidian_import`	no	built into core CLI	The user already has an Obsidian vault and wants agents to search those notes through Vault.
`semantic`	no	`python -m pip install "vault-for-llm[semantic]"`	The user wants embedding-backed semantic/hybrid search.
`supabase`	no	`python -m pip install "vault-for-llm[supabase]"`	The user wants optional remote sync/read paths.
`headroom`	no	`python -m pip install headroom-ai`	The agent often reads long logs, terminal output, or large retrieved context and needs optional compression before sending content to the LLM.
`memory_agents`	no	no extra dependency	The user wants Profile / Dream / Forgetting agent guidance with report-only and candidate-only defaults.
`dev`	no	`python -m pip install -e ".[dev]"`	Source checkout, benchmarks, PR work, or release validation.

When optional features are selected, vault setup-agent can install the chosen Python dependencies for the agent. Interactive setup asks before installing. Non-interactive agents should pass --install-optional-deps; semantic setup can also pass --install-embedding-model mix to download and configure the default local ONNX embedding model.

vault setup-agent \
  --non-interactive \
  --agent codex \
  --scope shared \
  --agent-project-dir ~/Vaults/my-project \
  --features core,mcp,semantic,supabase,headroom \
  --language en \
  --install-optional-deps \
  --install-embedding-model mix \
  --supabase-setup simple \
  --supabase-sync cron \
  --json

Do not silently enable semantic, Supabase, or Headroom extras: semantic and Supabase add heavier dependencies, model/provider setup, or remote credentials; Headroom is useful only when context-window or token pressure is a real issue. If Headroom is enabled, keep citations tied to original vault_read_range output, not compressed summaries.

For Obsidian, the agent should ask for the vault path, run a dry-run first, perform the first import only after confirmation, then ask whether to schedule the same vault import obsidian --compile command for ongoing sync.

Choose the Vault project scope

Vault-for-LLM is bound to the project-dir, not to a specific agent runtime:

one project directory = one vault.db

If Hermes, OpenClaw, Codex, Claude Code, and n8n all point to the same --project-dir, they share the same governed project memory. If they point to different directories, they use isolated databases.
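
The sharing rule above can be checked mechanically before wiring up several agents. The following is a minimal sketch, not part of the Vault CLI: `vault_db_path`, `group_by_vault`, and the `agent_dirs` mapping are hypothetical names, and the `<project-dir>/vault.db` layout is taken from the directory structure shown later in this document.

```python
import os
from pathlib import Path


def vault_db_path(project_dir: str) -> Path:
    """Resolve the database path for a project directory.

    Assumes the layout <project-dir>/vault.db described in this README.
    """
    return Path(os.path.expanduser(project_dir)).resolve() / "vault.db"


def group_by_vault(agent_dirs: dict) -> dict:
    """Group agent names by the vault.db they would end up using."""
    groups = {}
    for agent, project_dir in agent_dirs.items():
        groups.setdefault(vault_db_path(project_dir), []).append(agent)
    return groups


# Hypothetical per-agent project-dir settings.
agent_dirs = {
    "hermes": "~/Vaults/my-project",
    "codex": "~/Vaults/my-project/",  # trailing slash resolves to the same path
    "openclaw": "~/.openclaw/workspace/vault-project",
}

for db_path, agents in group_by_vault(agent_dirs).items():
    print(db_path, "->", sorted(agents))
```

Agents listed on the same line share one governed memory; agents on different lines are isolated from each other.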

Scope	Use when	Example project-dir
Shared project vault	Multiple trusted agents collaborate on the same confirmed project knowledge	`~/Vaults/my-project`
Agent-private vault	One agent is experimenting, noisy, or untrusted	`~/.openclaw/workspace/vault-project`
Domain/customer vault	Data boundaries must stay separate	`~/Vaults/clinic-customer-service`
Temporary vault	Demos, tests, and benchmarks	`/tmp/vault-benchmark-*`

/tmp/... paths are disposable test working directories. They are not the package install location and should not be used as long-lived shared memory. For real shared use, choose a stable project directory such as ~/Vaults/my-project and point every trusted agent at that same path. For scheduled jobs, also keep the Python virtualenv in a stable path such as ~/.hermes/venvs/vault-for-llm/; a venv under /tmp/... can disappear after reboot.

For shared vaults, prefer vault_memory_propose over direct writes so multiple agents do not pollute active memory before review.

For agents running on different machines, the local project-dir cannot be shared directly. In that case, optional Supabase sync can act as a remote shared read/sync layer: each host keeps its own local SQLite vault, then syncs approved knowledge, Document Map rows, summaries, hashes, and metadata to the same Supabase project. This lets Hermes on one host, Codex on another host, and n8n on a server read from a common project-memory view without making Supabase a required dependency for local use.

Current Source Status

The current source tree is at version 0.6.33. Core local search is stable, while advanced semantic, rerank, sync, and benchmarking workflows remain optional. See CHANGELOG.md for release details.

What it can do

Area	Capability
Knowledge storage	Markdown `raw/` files compiled into local SQLite
Search	FTS5/BM25 keyword search with fallback, optional vector search, hybrid search, query expansion
Reranking	lightweight zero-dependency reranker (default), optional Cross-Encoder reranker for production-grade relevance
Embeddings	optional ONNX Runtime or Ollama embeddings, provider guard, durable cache workflows
LLM enhancement	optional LLM-powered query rewriting for better retrieval recall
Memory layers	L0 identity, L1 core facts, L2 recent context, L3 deep knowledge
Knowledge graph	inferred entities/edges and graph expansion
Document Map	section/claim navigation and bounded `read_range` citations (policy and demo)
MCP	`vault-mcp` exposes search/add/stats/map/read plus candidate-first memory tools to compatible agents (MCP memory workflow)
Memory curator	`vault remember`, `vault promote`, and MCP propose/promote tools for gated autonomous memory writes
Dream reports	`vault dream` produces report-first memory curation summaries for stale, duplicate, weak, or poorly-described knowledge (dream workflow)
Quality tools	lint, freshness, convergence, cross-validation, dedup, Search QA snapshots (benchmarking guide), semantic smoke/warm workflows
Benchmarking	`benchmarks/search_benchmark.py` for reproducible before/after retrieval quality and latency comparison
Repository governance	source-checkout public-boundary gate, artifact audit, and safe-only cleanup helpers (governance guide)
Agent integrations	CLI/MCP patterns for Hermes Agent, OpenClaw, n8n, Codex, Claude Code, and generic MCP-compatible agents (integration guide)
Future retrieval layers	Design notes for Document Map tree navigation and optional Headroom context-budget integration (tree navigation, Headroom notes)
Optional remote sync	Supabase sync scripts for teams or remote read paths
Local skill registry	experimental `vault skill` commands for sharing reusable workflows inside a local Vault; not a hosted marketplace

Quality tools roadmap

These features exist today, but their maturity differs. Core local commands are the stable path; advanced QA, semantic, sync, and skill-registry workflows are still evolving:

Tool	Purpose	Maturity
Document Map	Navigate sections/claims and read bounded source ranges with citations	usable, still evolving
Search QA	Run fixed query sets and compare before/after retrieval metrics; see the benchmarking guide and source-checkout fixtures under `benchmarks/search_qa/`	usable for deterministic regression checks
Cross-Encoder reranker	Production-grade relevance scoring for search result reranking via cross-encoder models	usable with optional deps
Search benchmark framework	Reproducible before/after comparison of retrieval quality and latency across search strategies	usable
LLM query rewriting	LLM-powered query reformulation for improved retrieval recall	usable with optional deps
Convergence checks	Detect whether a knowledge entry has enough definition, procedure, and edge-case detail	experimental
Cross-validation	Verify extracted claims across different model families	experimental / optional-model dependent
Freshness + dedup	Mark stale entries and detect repeated knowledge	experimental
Local skill registry	Push/search/pull reusable agent workflows in local SQLite	experimental / local-only
Repo hygiene scripts	Audit generated artifacts, clean safe caches, and scan public PR diffs before release	source-checkout helper

The benchmarks/search_qa/ examples are repository fixtures in a source checkout, not files installed by the PyPI wheel. After pip install vault-for-llm, run vault search-qa with your own QA JSON files, or clone/download this repository to use the example fixtures.

The stable path is still the core loop: vault init → vault add/vault remember → vault compile/vault promote → vault search → vault-mcp. For autonomous agents, prefer vault_memory_propose over direct vault_add.

Think of direct vault_add as letting someone walk straight into the archive and put a note on the shelf. It is still available for trusted scripts, but the safer daily path is the candidate desk: propose first, inspect gates, then promote.

Architecture

L0 Identity        → who the user/project is; loaded every session
L1 Core Facts      → stable environment and project facts; loaded every session
L2 Recent Context  → recent decisions, incidents, and working context
L3 Deep Knowledge  → lessons, APIs, architecture, troubleshooting; searched on demand

Markdown raw/  →  vault compile  →  SQLite database  →  vault search / MCP tools

This keeps the agent prompt small while still making deeper memory available when relevant.
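
To make the compile-then-search idea concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It is not Vault's implementation: the `notes` table, `build_index`, and `search` are hypothetical names, and the plain `LIKE` fallback is only one possible reading of "keyword search with fallback".

```python
import sqlite3


def build_index(conn, entries):
    """Store (title, body) pairs; return True if an FTS5 index was created."""
    try:
        conn.execute("CREATE VIRTUAL TABLE notes USING fts5(title, body)")
        has_fts = True
    except sqlite3.OperationalError:
        # This SQLite build has no FTS5 module: fall back to a plain table.
        conn.execute("CREATE TABLE notes (title TEXT, body TEXT)")
        has_fts = False
    conn.executemany("INSERT INTO notes (title, body) VALUES (?, ?)", entries)
    return has_fts


def search(conn, has_fts, term, limit=5):
    """Return matching titles for a single-word term, best match first."""
    if has_fts:
        # bm25() returns lower values for better matches, so sort ascending.
        sql = ("SELECT title FROM notes WHERE notes MATCH ? "
               "ORDER BY bm25(notes) LIMIT ?")
        return [row[0] for row in conn.execute(sql, (term, limit))]
    like = f"%{term}%"
    sql = "SELECT title FROM notes WHERE title LIKE ? OR body LIKE ? LIMIT ?"
    return [row[0] for row in conn.execute(sql, (like, like, limit))]


conn = sqlite3.connect(":memory:")
has_fts = build_index(conn, [
    ("Postgres migration pitfall", "What broke, why it broke, and how to avoid it."),
    ("First lesson", "The bug was caused by X. The fix was Y."),
])
print(search(conn, has_fts, "migration"))
```

The agent only pays for the rows it asks for, which is the point of searching on demand instead of loading every note into the prompt.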

L0-L3 describe memory depth, not permissions. For multi-agent installs, keep the layer model stable and use metadata such as scope, sensitivity, owner_agent, allowed_agents, status, memory_type, and expires_at to decide what stays private, what can be shared, and what may sync to Supabase or Obsidian. See memory governance layers.
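
A governed read can be pictured as a filter that runs before any text is returned. The sketch below is illustrative only: the field names come from the metadata list above, but the `Entry` class, the `can_read` function, and the sensitivity ranking `public < internal < private < restricted` are assumptions, not Vault's actual schema or policy.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Assumed ranking from least to most sensitive.
SENSITIVITY_ORDER = ["public", "internal", "private", "restricted"]


@dataclass
class Entry:
    title: str
    sensitivity: str = "internal"
    owner_agent: str = ""
    allowed_agents: list = field(default_factory=list)
    expires_at: Optional[date] = None


def can_read(entry, agent_id, sensitivity_cap, today):
    """Return True if this agent may read the entry under the given cap."""
    if entry.expires_at is not None and entry.expires_at < today:
        return False  # expired context is never served
    rank = SENSITIVITY_ORDER.index
    if rank(entry.sensitivity) > rank(sensitivity_cap):
        return False  # entry is more sensitive than the caller's cap
    if entry.allowed_agents:
        # An explicit allow-list restricts reads to listed agents and the owner.
        return agent_id in entry.allowed_agents or agent_id == entry.owner_agent
    return True


today = date(2026, 5, 16)
summary = Entry("User care summary", sensitivity="private",
                owner_agent="profile-agent", allowed_agents=["profile-agent"])
print(can_read(summary, "codex", "internal", today))
print(can_read(summary, "profile-agent", "private", today))
```

Note that the layer (L0 to L3) does not appear in the check at all, which matches the statement that layers describe depth rather than permissions.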

User profile memory should be split instead of stored as one large profile: minimal identity belongs in L0, durable work preferences in L1, recent state/care summaries in L2 with expiry, and deep or raw private analysis in a private L3 entry or separate private vault.

Agent memory lifecycle

Conversation / task
  → propose memory candidate
  → privacy + duplicate + metadata + quality gates
  → list / review candidate memories
  → promote reviewed memory
  → raw Markdown + SQLite active knowledge
  → search / map / read_range recall
  → dream report for cleanup and safe metadata fixes

In story form: the agent writes a note, the front desk checks whether it is safe and useful, the librarian shelves it only after review, and later the agent asks the catalog for just the right shelf and paragraph.
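
The key property of this lifecycle is that a proposal never becomes active memory by itself. A minimal in-memory sketch of that state change follows; `MemoryStore` and its methods are hypothetical, and the `mem_` plus 12 hex characters ID shape is only modeled on the `mem_xxxxxxxxxxxx` placeholder used in this README.

```python
import uuid


class MemoryStore:
    """Toy model of the candidate desk: propose first, promote after review."""

    def __init__(self):
        self.candidates = {}  # proposed, not yet searchable
        self.active = {}      # reviewed and promoted

    def propose(self, title, content, reason):
        """Queue a candidate and return its ID; active memory is untouched."""
        candidate_id = "mem_" + uuid.uuid4().hex[:12]
        self.candidates[candidate_id] = {
            "title": title, "content": content, "reason": reason,
        }
        return candidate_id

    def promote(self, candidate_id, confirm=False):
        """Move a reviewed candidate into active memory."""
        if not confirm:
            raise ValueError("promotion requires explicit confirmation")
        self.active[candidate_id] = self.candidates.pop(candidate_id)


store = MemoryStore()
cid = store.propose("Memory title", "Markdown memory content",
                    "Why this is worth remembering")
print(len(store.active))   # still empty before review
store.promote(cid, confirm=True)
print(len(store.active))
```

Requiring `confirm=True` mirrors the `--confirm` flag on `vault promote`: an agent cannot promote by accident.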

Installation

Install from PyPI

Vault-for-LLM 0.6.33 is published on PyPI.

For agent-driven installation, paste this into Hermes Agent, Codex, OpenCode, Claude Code, OpenClaw, or another agent that can run local commands:

Install Vault-for-LLM for this project. Use PyPI package vault-for-llm[mcp]==0.6.33.
Ask whether the vault database should be shared, private, domain-specific, or temporary.
Ask separately about MCP, semantic search, Supabase sync, Headroom context compression,
and dev/benchmark dependencies. If optional features are selected, ask whether to
install their dependencies now. If semantic is selected, ask whether to download
a local ONNX embedding model. Ask whether I have an existing Obsidian vault to
import. Run vault setup-agent, configure CLI/MCP, do an Obsidian dry-run before
importing, and finish with a search/read/propose smoke test.

Manual install:

python3 -m venv .venv
source .venv/bin/activate
pip install "vault-for-llm[mcp]==0.6.33"

vault setup-agent

Optional semantic search

Keyword search works with the base install. For local ONNX embeddings:

pip install "vault-for-llm[semantic]"
vault install-embedding --model mix

Or use an existing Ollama embedding model:

vault config set embedding.provider ollama
vault config set embedding.model nomic-embed-text

Optional MCP server

pip install "vault-for-llm[mcp]"
vault-mcp --project-dir /path/to/your/project --tool-profile core

Security note: vault-mcp is a local stdio MCP server. It does not implement network authentication or user-level access control. Only configure it for agents you trust with read/write access to the selected --project-dir, and prefer a dedicated project directory for shared or experimental agents.

Development install from source

git clone https://github.com/zycaskevin/Vault-for-LLM.git
cd Vault-for-LLM
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Optional Supabase dependency

Supabase sync is optional. Install its dependency only when you want a remote sync/read path:

pip install "vault-for-llm[supabase]"

Quickstart

# 1. Create a vault in your project
vault init

# 2. Add a first knowledge entry
vault add "First lesson" --content "The bug was caused by X. The fix was Y."

# 3. Compile Markdown into the local SQLite vault
vault compile

# 4. Search it later
vault search "what caused the bug"

You can also add Markdown files directly under raw/ and run vault compile.

Candidate-first agent memory

For autonomous agents or unreviewed memories, prefer the safer candidate workflow. This is the recommended path after PR27:

vault remember "Memory title" \
  --content "Markdown memory content" \
  --reason "Why this is worth remembering"

vault candidates --include-gates

# after review
vault promote mem_xxxxxxxxxxxx --confirm

vault candidates lists the review queue without dumping full raw content by default. MCP-compatible agents should use vault_memory_propose, vault_memory_candidates, and vault_memory_promote; see MCP memory workflow.

The gates are intentionally simple and deterministic:

Gate	Plain-language job
Privacy	“Does this look like a secret or private data?”
Duplicate	“Do we already have this memory or a near copy?”
Metadata	“Does it at least have a title/content/reason?”
Quality	“Is this specific enough to be useful and findable later?”
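
To show what "simple and deterministic" can look like, here is a rough sketch of the four gates as pure functions. The regular expression, the exact-title duplicate check, and the eight-word quality threshold are all illustrative assumptions; Vault's real gate rules are not described in this README.

```python
import re

# Assumed pattern: flags strings such as "password = ..." or "api_key: ...".
SECRET_PATTERN = re.compile(r"(api[_-]?key|password|secret|token)\s*[:=]",
                            re.IGNORECASE)


def run_gates(candidate, existing_titles):
    """Return {gate_name: passed} for one candidate dict.

    `existing_titles` is a set of lowercased titles already in the vault.
    """
    title = candidate.get("title", "").strip()
    content = candidate.get("content", "").strip()
    reason = candidate.get("reason", "").strip()
    return {
        "privacy": SECRET_PATTERN.search(content) is None,
        "duplicate": title.lower() not in existing_titles,
        "metadata": bool(title and content and reason),
        "quality": len(content.split()) >= 8,  # assumed minimum length
    }


good = {
    "title": "Postgres migration pitfall",
    "content": "Running the migration without a lock caused duplicate rows; "
               "wrap it in a transaction.",
    "reason": "Repeated mistake",
}
print(run_gates(good, existing_titles={"first lesson"}))
```

Because every gate is a plain function of the candidate and the existing titles, the same input always produces the same verdict, which keeps review results reproducible.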

Search QA: checking whether memory recall is healthy

Search QA is a small exam for your vault. Some questions should find a known note; some hard-negative questions should find nothing. This helps catch both kinds of mistakes: forgetting the right memory and confidently returning the wrong one.

vault search-qa run \
  --qa-file benchmarks/search_qa/basic.en.json \
  --mode keyword \
  --min-score 0.34 \
  --output /tmp/searchqa.json

Fixtures can use expected_no_results: true for “do not return anything” checks. See the Search QA benchmarking guide.
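
The scoring idea can be sketched in a few lines. Only `expected_no_results` is taken from this README; the `query` and `expected_id` keys, the `(entry_id, score)` result shape, and the reading of `--min-score` as a cutoff below which hits are dropped are assumptions made for illustration.

```python
def score_search_qa(cases, search, min_score=0.34):
    """Return (passed, total) for a list of QA cases.

    `search(query)` must return a ranked list of (entry_id, score) pairs.
    """
    passed = 0
    for case in cases:
        hits = [entry_id for entry_id, score in search(case["query"])
                if score >= min_score]
        if case.get("expected_no_results"):
            ok = not hits  # hard negative: anything returned is a failure
        else:
            ok = bool(hits) and hits[0] == case["expected_id"]
        passed += int(ok)
    return passed, len(cases)


# Hypothetical fixture and a canned search function standing in for the vault.
cases = [
    {"query": "what caused the bug", "expected_id": 1},
    {"query": "postgres migration", "expected_id": 2},
    {"query": "kubernetes autoscaling", "expected_no_results": True},
]
canned = {
    "what caused the bug": [(1, 0.9)],
    "postgres migration": [(3, 0.8), (2, 0.7)],
    "kubernetes autoscaling": [(1, 0.1)],
}
passed, total = score_search_qa(cases, lambda q: canned.get(q, []))
print(f"top-1 pass rate: {passed}/{total}")
```

In this toy run the second case fails because the expected entry is ranked second, while the hard negative passes because its only hit falls below the cutoff.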

Dream curation reports

Run a report-first memory curation pass:

vault dream --mode report --limit 50 --write-report

Reports are written under reports/dream/. apply_safe can apply only narrow metadata fixes, and it writes a plan plus backup path so you can roll back if the cleanup was not what you wanted. See dream workflow.

Example Markdown knowledge entry with YAML frontmatter:

---
title: "Postgres migration pitfall"
category: "error"
layer: L3
tags: ["postgres", "migration"]
trust: 0.8
source: "project-notes"
created: "2026-05-16"
---

# Postgres migration pitfall

What broke, why it broke, and how to avoid it next time.

Optional semantic workflow

Semantic search is optional by design. The base install keeps working with keyword search only. After configuring a real embedding provider, the main operator commands are:

vault semantic rebuild --persist-cache
vault search "what caused the bug" --mode semantic
vault search "what caused the bug" --mode hybrid
vault semantic smoke --qa-file benchmarks/search_qa/basic.en.json --mode semantic --pretty
vault semantic cache-stats --pretty

vault search --mode semantic reads stored semantic_vectors directly. --mode hybrid fuses keyword results with the stored semantic index when available, and falls back safely when it is not.
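
This README does not say how hybrid mode fuses the two result lists. As an illustration of the general idea, the sketch below uses reciprocal rank fusion (RRF), a common fusion method that is swapped in here as an assumption; `reciprocal_rank_fusion` and the constant `k=60` are not taken from Vault.

```python
def reciprocal_rank_fusion(keyword_ids, semantic_ids, k=60):
    """Fuse two ranked ID lists; IDs ranked well in both lists rise to the top."""
    scores = {}
    for ranking in (keyword_ids, semantic_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), then by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


print(reciprocal_rank_fusion(["a", "b", "c"], ["b", "d"]))
# With no semantic index available, the keyword order is returned unchanged.
print(reciprocal_rank_fusion(["a", "b", "c"], []))
```

Passing an empty semantic list degrades to plain keyword ranking, which is one way to get the safe fallback behaviour described above.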

Search QA can also run semantic/hybrid snapshots, but the QA command must use the same provider/model/dimension and vector kind used to rebuild semantic_vectors. For deterministic local smoke tests, rebuild with --allow-hash --hash-dim N and pass the same flags to vault search-qa run; hash vectors validate plumbing only and are not a semantic-quality benchmark.

For the full lifecycle — warm, cache-prune, startup, daemon, and the --allow-hash test-only provider — see docs/semantic_search.md.

Directory structure

your-project/
├── L0-identity/              # user or project identity loaded every session
│   └── identity.md
├── L1-core-facts/            # stable facts loaded every session
│   └── current-projects.md
├── L2-context/               # recent context, decisions, incidents
│   └── recent-sessions/
├── L3-knowledge/             # deep knowledge organized for retrieval
├── raw/                      # source Markdown knowledge entries
├── compiled/                 # compiled / compressed knowledge artifacts
├── vault.db                  # local SQLite database generated by vault
└── templates/                # starter templates

Common CLI Commands

Command	Purpose
`vault init`	Initialize a project vault
`vault setup-agent`	Run the interactive agent installer and optional Obsidian sync template generator
`vault remember "Title" --content "..." --reason "..."`	Propose candidate memory for review
`vault candidates`	List pending candidate memories
`vault promote <candidate_id> --confirm`	Promote reviewed candidate memory
`vault compile`	Compile Markdown into SQLite
`vault import obsidian --vault /path/to/ObsidianVault --dry-run`	Preview importing existing Obsidian notes into `raw/obsidian/`
`vault search "query"`	Search project memory
`vault map read <id> --lines 10-30`	Read a bounded range for citation
`vault remove <id> --confirm`	Remove a reviewed knowledge entry by ID

For the broader command surface, see the CLI reference.

Agent setup wizard

Use docs/agent_install.md plus vault setup-agent or its alias vault install-agent when an agent should guide the installation instead of asking a human to run every command manually:

vault setup-agent

vault setup-agent \
  --non-interactive \
  --agent codex \
  --scope shared \
  --agent-project-dir ~/Vaults/my-project \
  --features core,mcp,obsidian_import \
  --obsidian-vault ~/Documents/ObsidianVault \
  --import-obsidian \
  --obsidian-sync all

The wizard asks for database scope, project directory, setup language, MCP, semantic search, Supabase sync, Headroom context compression, developer/benchmark dependencies, whether to install selected optional dependencies now, an existing Obsidian vault path, whether to run the first import, and whether to generate cron, LaunchAgent, or n8n sync templates. Semantic, Supabase, Headroom, and dev dependencies default to off. If semantic is selected and dependency installation is confirmed, the wizard can also download and configure a local ONNX embedding model. headroom is an advanced optional feature for context compression; it is not required for Vault memory governance and should stay off unless the user has long logs, large tool output, or token pressure.

Obsidian export

Use vault export obsidian when you want humans to browse the compiled vault in Obsidian without changing the source knowledge base:

vault export obsidian \
  --vault /path/to/ObsidianVault \
  --category technique \
  --dry-run

The export is intentionally one-way and read-only: it reads from vault.db, writes Markdown notes under 00-Vault-Knowledge/, includes YAML frontmatter plus Vault #<id> citations, and does not write back to raw/, compiled/, SQLite, or any remote sync target. Re-running the command overwrites the same stable note paths instead of creating duplicates.

Obsidian import and sync

If a user already has an Obsidian vault, agents can import those Markdown notes into Vault:

vault import obsidian \
  --vault /path/to/ObsidianVault \
  --dry-run

vault import obsidian \
  --vault /path/to/ObsidianVault \
  --compile

The import path copies user-authored notes into raw/obsidian/, preserves the original Obsidian path and content hash in frontmatter, and skips .obsidian/, .trash/, .git/, and 00-Vault-Knowledge/ by default. This keeps Vault's own exported notes from being re-imported as source material.

Use --dry-run first when connecting an existing vault. Re-running the import is idempotent: unchanged notes are skipped, changed notes update the same raw path, and --compile is the explicit step that writes the imported notes into vault.db. For automatic sync, schedule the same command with cron, LaunchAgent, n8n, or an agent installer; no always-on watcher is required for the first version.
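
The idempotency described above comes down to comparing content hashes before writing. The following dry-run planner is a sketch under assumptions: `plan_import` is a hypothetical function, notes are passed as an in-memory mapping rather than read from disk, and SHA-256 is assumed as the hash since the README does not name one.

```python
import hashlib

# Folders skipped by default, as listed in this README.
SKIP_PREFIXES = (".obsidian/", ".trash/", ".git/", "00-Vault-Knowledge/")


def plan_import(notes, known_hashes):
    """Classify notes without writing anything.

    notes: {relative_path: text}
    known_hashes: {relative_path: sha256 hex digest} from the previous import
    """
    plan = {"new": [], "changed": [], "unchanged": [], "skipped": []}
    for path in sorted(notes):
        if path.startswith(SKIP_PREFIXES):
            plan["skipped"].append(path)
            continue
        digest = hashlib.sha256(notes[path].encode("utf-8")).hexdigest()
        if path not in known_hashes:
            plan["new"].append(path)
        elif known_hashes[path] != digest:
            plan["changed"].append(path)
        else:
            plan["unchanged"].append(path)
    return plan


notes = {
    "ideas/a.md": "alpha",
    "ideas/b.md": "beta v2",
    ".obsidian/app.json": "{}",
}
known = {
    "ideas/a.md": hashlib.sha256(b"alpha").hexdigest(),
    "ideas/b.md": hashlib.sha256(b"beta v1").hexdigest(),
}
print(plan_import(notes, known))
```

Running the planner twice with the same inputs yields the same plan, so a scheduled job can repeat it safely.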

For citation-safe memory use, see the Document Map citation policy: search results are navigation hints, while vault map read returns bounded source text for final citations.

MCP integration

Install MCP extras and start the server:

pip install "vault-for-llm[mcp]"
vault-mcp --project-dir /path/to/your/project --tool-profile core

Example MCP server config:

{
  "mcpServers": {
    "vault": {
      "command": "vault-mcp",
      "args": ["--project-dir", "/path/to/your/project"]
    }
  }
}
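
If an installer script generates this config, it can pin the tool profile at the same time. A small sketch follows: `mcp_server_config` is a hypothetical helper, the flags `--project-dir` and `--tool-profile` are the ones shown in this README, and where the resulting JSON must be saved depends on the agent runtime and is not assumed here.

```python
import json


def mcp_server_config(project_dir, tool_profile="core"):
    """Build an mcpServers entry that starts vault-mcp with a fixed profile."""
    return {
        "mcpServers": {
            "vault": {
                "command": "vault-mcp",
                "args": ["--project-dir", project_dir,
                         "--tool-profile", tool_profile],
            }
        }
    }


print(json.dumps(mcp_server_config("/path/to/your/project"), indent=2))
```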

MCP can expose different tool profiles:

Profile	Tools	Use when
`core`	`vault_search`, `vault_read_range`, `vault_memory_propose`, `vault_stats`	Daily agent use with fewer tool-schema tokens
`review`	Core plus `vault_memory_candidates`, `vault_memory_promote`, `vault_dream_run`	A trusted operator or agent reviews candidate memory
`remote`	Core plus Supabase remote read tools	Agents read a synced cross-host memory view
`maintenance`	Review plus freshness/convergence checks	Scheduled or operator-led curation
`full`	All tools, including compatibility `vault_add`	Backward compatibility or explicit power-user setups

full remains the default for backward compatibility. For production agent sessions, prefer --tool-profile core or an explicit allowlist:

vault-mcp --project-dir /path/to/project \
  --tools vault_search,vault_read_range,vault_memory_propose,vault_stats

Tool profiles reduce the tools advertised through tools/list; they are not a security boundary. Run vault-mcp only for agents you trust with the selected project directory.

For agent loops, prefer vault_search → vault_read_range. vault_search returns compact MCP payloads by default, including source and range hints when available. Use vault_map_show from a broader profile only when the agent needs section navigation before reading. Final answers should cite vault_read_range output rather than search previews.
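
A bounded read is essentially a line slice with a hard cap and a citation attached. The sketch below illustrates that shape only: `read_range`, the `max_lines` cap, and the line suffix in the citation string are assumptions, while the `Vault #<id>` prefix follows the citation style mentioned in the Obsidian export section.

```python
def read_range(entry_id, text, start, end, max_lines=40):
    """Return lines start..end (1-indexed, inclusive), capped at max_lines."""
    lines = text.splitlines()
    if start < 1 or end < start or start > len(lines):
        raise ValueError("invalid line range")
    # Clamp to the document length and to the per-call budget.
    end = min(end, len(lines), start + max_lines - 1)
    return {
        "citation": f"Vault #{entry_id} L{start}-L{end}",
        "text": "\n".join(lines[start - 1:end]),
    }


document = "\n".join(f"line {i}" for i in range(1, 101))
result = read_range(42, document, 10, 30)
print(result["citation"])
```

Because the citation records the range that was actually returned after clamping, a final answer can quote it without overstating what the agent read.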

Optional Supabase sync

Core Vault-for-LLM usage is local-only. Supabase support is for teams or remote read paths that want a synced copy of local SQLite data.

The local SQLite database remains the source of truth. Supabase is an optional sync/read target. Remote table names use Vault-branded defaults and can be overridden with VAULT_SUPABASE_*_TABLE environment variables when integrating an existing private schema.

This is useful when multiple hosts need to share project memory. For example, Hermes Agent on a workstation, Codex on a laptop, OpenClaw on another machine, and n8n on a server can all use local Vaults while syncing approved memory to one Supabase project for cross-host recall.

Knowledge and skill sync use a minimal-disclosure default: metadata, summaries, hashes, Document Map rows, and claims sync without full content_raw. Use --include-content only when you intentionally want full local content copied to Supabase; fail-severity privacy findings are still withheld.
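
The minimal-disclosure default can be pictured as a payload builder that leaves out raw content unless asked. This is a sketch under assumptions: `sync_payload` and the `privacy_severity` field are hypothetical, and it reads "fail-severity privacy findings are still withheld" as meaning that rows with such findings never send `content_raw`, even with `--include-content`.

```python
def sync_payload(row, include_content=False):
    """Build the dict that would be sent to the remote for one local row."""
    # Local-only fields are stripped first; everything else is metadata.
    payload = {key: value for key, value in row.items()
               if key not in ("content_raw", "privacy_severity")}
    if include_content and row.get("privacy_severity") != "fail":
        payload["content_raw"] = row["content_raw"]
    return payload


row = {
    "id": 7,
    "title": "Postgres migration pitfall",
    "summary": "Lock the table before migrating.",
    "content_hash": "abc123",
    "content_raw": "Full Markdown body...",
    "privacy_severity": "ok",
}
print(sorted(sync_payload(row)))
print(sorted(sync_payload(row, include_content=True)))
```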

Start with simple sync. RLS, multi-agent allow-lists, and Coze read-only access are advanced setup topics; see docs/supabase_setup.md when you need them.

# optional integration dependency
pip install "vault-for-llm[supabase]"

# configure Supabase credentials in your environment, then run sync scripts as needed
python -m scripts.sync_to_supabase --db /path/to/project/vault.db --document-map --health

# or let setup-agent generate daily cron, LaunchAgent, or n8n templates
vault setup-agent \
  --non-interactive \
  --agent nancy \
  --scope shared \
  --agent-project-dir /path/to/project \
  --features core,mcp,supabase \
  --language en \
  --install-optional-deps \
  --supabase-setup simple \
  --supabase-sync cron \
  --json

Current maturity

Vault-for-LLM is CLI-first developer tooling:

Core local commands (init, add, compile, search) are the most stable path.
Search QA, FTS5/BM25 keyword search, Document Map citation reads, and semantic workflow commands are usable but still evolving.
Optional integrations such as Supabase sync, MCP, and local skill registry may change before a stable 1.0 release.
The default install is available from PyPI; source installs are for development.

If you want the most stable path, start with:

vault init
vault add
vault compile
vault search

Retrieval quality (Search QA benchmarks)

Evidence snapshot

Vault-for-LLM is measured as a retrieval and project-memory QA layer, not only as a note database. These numbers are evidence probes, not universal guarantees; larger or different corpora should be re-tested with the included benchmark commands.

Probe	Result	Caveat
Repo onboarding fixture	Vault top-k/source/read-range guidance `28/28`; Codex transcript baseline `7/28`; Hermes/Nancy transcript baseline `3/28`	28-task source-aware project benchmark; private transcripts are not committed
Candidate-first memory	`0` active-memory pollution before promotion	candidate proposals do not enter official memory automatically
LoCoMo hierarchical retrieval probe	`97.7%` Any evidence@50 and `90.5%` All evidence@50 on official-scored categories	retrieval evidence score only; not an official answer/judge leaderboard score

See Agent Onboarding Benchmark for the reproducible repo fixture and exported-session comparison workflow.
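For readers unfamiliar with the evidence@k wording, the sketch below shows one common way such scores are defined: "Any" counts a question as a hit when at least one gold evidence item appears in the top k retrieved items, and "All" requires every gold item to appear. These definitions and the field names (`retrieved`, `gold`) are assumptions for illustration; the benchmark's own scoring code is authoritative.

```python
from typing import Sequence


def evidence_at_k(retrieved: Sequence[str], gold: set[str], k: int = 50) -> tuple[bool, bool]:
    """Return (any_hit, all_hit) for one question; gold is assumed non-empty."""
    found = gold & set(retrieved[:k])
    return (len(found) > 0, found == gold)


def score(questions: list[dict], k: int = 50) -> dict[str, float]:
    """Average the per-question flags over all questions."""
    flags = [evidence_at_k(q["retrieved"], set(q["gold"]), k) for q in questions]
    total = len(flags) or 1  # avoid dividing by zero on an empty list
    return {
        "any_evidence": sum(any_hit for any_hit, _ in flags) / total,
        "all_evidence": sum(all_hit for _, all_hit in flags) / total,
    }
```

Under these definitions the "All" score can never exceed the "Any" score, which matches the ordering of the two figures in the table.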

Search QA fixture

Vault-for-LLM ships deterministic Search QA fixtures that measure retrieval quality before and after code changes. Results below use the English fixture (benchmarks/search_qa/basic.en.json) against a fresh database compiled from the same fixture data (keyword/FTS5 mode):

Metric	Value
total_cases	3
top-1 recall	2/3 ≈ 67%
top-k recall	2/3 ≈ 67%
no-result precision	1.0
Mean Reciprocal Rank	0.67

The benchmark covers:

en_document_map_read_range — "tool-gated reading map navigation read_range evidence" → expects "Tool-gated Reading"
en_citation_policy_boundary — "citation policy boundary final answer support" → expects "Citation Policy Boundary"
en_no_result_control — random string query → expects no results (false-positive check)

A Traditional Chinese counterpart (basic.zh-Hant.json) is also available, but it uses the same synthetic knowledge, so its metrics are identical.
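The metrics in the table can be computed along the following lines. This is a sketch, not the project's implementation: the case structure and result lists are made up for illustration, and dividing by all cases (including the no-result control) is an assumption chosen because it is consistent with the 2/3 figures reported for a three-case fixture.

```python
from typing import Optional, TypedDict


class Case(TypedDict):
    case_id: str
    expected: Optional[str]  # None marks a no-result control
    results: list[str]       # titles returned by search, best first


def search_qa_metrics(cases: list[Case], k: int = 5) -> dict[str, float]:
    top1 = topk = controls = clean_controls = 0
    reciprocal_sum = 0.0
    for case in cases:
        ranked = case["results"][:k]
        if case["expected"] is None:
            controls += 1
            clean_controls += int(not ranked)  # a clean control returns nothing
            continue
        if case["expected"] in ranked:
            rank = ranked.index(case["expected"]) + 1
            topk += 1
            top1 += int(rank == 1)
            reciprocal_sum += 1.0 / rank
    total = len(cases) or 1  # assumption: every case counts in the denominator
    return {
        "top1_recall": top1 / total,
        "topk_recall": topk / total,
        "mrr": reciprocal_sum / total,
        "no_result_precision": clean_controls / controls if controls else 1.0,
    }


# Hypothetical rankings for the three fixture cases named above.
EXAMPLE_CASES: list[Case] = [
    {"case_id": "en_document_map_read_range", "expected": "Tool-gated Reading",
     "results": ["Tool-gated Reading", "Citation Policy Boundary"]},
    {"case_id": "en_citation_policy_boundary", "expected": "Citation Policy Boundary",
     "results": ["Citation Policy Boundary"]},
    {"case_id": "en_no_result_control", "expected": None, "results": []},
]
```

With these made-up rankings, `search_qa_metrics(EXAMPLE_CASES)` gives 2/3 for both recall values and for MRR, and 1.0 for no-result precision.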

To run locally:

python -m pytest tests/test_search_quality_metrics.py -v

Semantic/hybrid mode requires an embedding model (--allow-hash for CI smoke tests). Results may vary; keyword search is the stable baseline.

Development

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m pytest -q

Some test paths require optional dependencies such as ONNX, MCP, or Supabase.

License

Apache-2.0

Release history

Version	Released
0.6.48	Jun 23, 2026
0.6.45	Jun 23, 2026
0.6.44	Jun 22, 2026
0.6.43	Jun 22, 2026
0.6.42	Jun 22, 2026
0.6.41	Jun 22, 2026
0.6.40	Jun 22, 2026
0.6.39	Jun 22, 2026
0.6.38	Jun 22, 2026
0.6.37	Jun 22, 2026
0.6.36	Jun 22, 2026
0.6.35	Jun 22, 2026
0.6.34	Jun 22, 2026
0.6.33 (this version)	Jun 22, 2026
0.6.32	Jun 22, 2026
0.6.31	Jun 22, 2026
0.6.30	Jun 22, 2026
0.6.29	Jun 22, 2026
0.6.28	Jun 22, 2026
0.6.27	Jun 22, 2026
0.6.26	Jun 21, 2026
0.6.25	Jun 21, 2026
0.6.24	Jun 21, 2026
0.6.23	Jun 21, 2026
0.6.22	Jun 20, 2026
0.6.21	Jun 17, 2026
0.4.2	May 17, 2026
0.4.1	May 17, 2026
0.4.0	May 16, 2026

Download files

Source Distribution

vault_for_llm-0.6.33.tar.gz (455.6 kB)

Uploaded Jun 22, 2026 Source

Built Distribution

vault_for_llm-0.6.33-py3-none-any.whl (271.2 kB)

Uploaded Jun 22, 2026 Python 3

File details

Details for the file vault_for_llm-0.6.33.tar.gz.

File metadata

Download URL: vault_for_llm-0.6.33.tar.gz
Upload date: Jun 22, 2026
Size: 455.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vault_for_llm-0.6.33.tar.gz
Algorithm	Hash digest
SHA256	`b5a8d33a1834a245effc79501e68d0b0ca87147359104d0e63699c3ccbdda905`
MD5	`a34b143ce54a53ab99d2b285a5ea4e45`
BLAKE2b-256	`1bc5b6606d489696a92dd6a67370af27c4311c389dbf12a9825167502e6cfe12`
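A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: the local file path is whatever location you downloaded the file to, and the helper names are illustrative.

```python
import hashlib

# SHA256 digest published above for vault_for_llm-0.6.33.tar.gz.
EXPECTED_SHA256 = "b5a8d33a1834a245effc79501e68d0b0ca87147359104d0e63699c3ccbdda905"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.strip().lower()
```

pip can enforce the same check at install time through its hash-checking mode (--require-hashes with pinned hashes in a requirements file).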

Provenance

The following attestation bundles were made for vault_for_llm-0.6.33.tar.gz:

Publisher: publish.yml on zycaskevin/Vault-for-LLM

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vault_for_llm-0.6.33.tar.gz
- Subject digest: b5a8d33a1834a245effc79501e68d0b0ca87147359104d0e63699c3ccbdda905
- Sigstore transparency entry: 1908846233
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: zycaskevin/Vault-for-LLM@1b841f599ad5640c0397f237a035be565fbe4093
- Branch / Tag: refs/tags/v0.6.33
- Owner: https://github.com/zycaskevin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1b841f599ad5640c0397f237a035be565fbe4093
- Trigger Event: release

File details

Details for the file vault_for_llm-0.6.33-py3-none-any.whl.

File metadata

Download URL: vault_for_llm-0.6.33-py3-none-any.whl
Upload date: Jun 22, 2026
Size: 271.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vault_for_llm-0.6.33-py3-none-any.whl
Algorithm	Hash digest
SHA256	`575f972b32b06ead26964eaec6bb55dd044dda1184d6416a0ede6754fa0f278f`
MD5	`8c7d6f9069bf887b05846e94ce67d34f`
BLAKE2b-256	`5dddb691396b2d6e8d15eee1113b78a2376779238a9de5d0820c914ef0e92bb3`

Provenance

The following attestation bundles were made for vault_for_llm-0.6.33-py3-none-any.whl:

Publisher: publish.yml on zycaskevin/Vault-for-LLM

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vault_for_llm-0.6.33-py3-none-any.whl
- Subject digest: 575f972b32b06ead26964eaec6bb55dd044dda1184d6416a0ede6754fa0f278f
- Sigstore transparency entry: 1908846320
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: zycaskevin/Vault-for-LLM@1b841f599ad5640c0397f237a035be565fbe4093
- Branch / Tag: refs/tags/v0.6.33
- Owner: https://github.com/zycaskevin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1b841f599ad5640c0397f237a035be565fbe4093
- Trigger Event: release
