Generic markdown vault MCP server with FTS5 + semantic search
markdown-vault-mcp
A generic markdown collection MCP server with FTS5 full-text search, semantic vector search, frontmatter-aware indexing, incremental reindexing, and non-markdown attachment support.
Point it at a directory of Markdown files (an Obsidian vault, a docs folder, a Zettelkasten) and it exposes search, read, write, and edit tools over the Model Context Protocol.
Features
- Full-text search — SQLite FTS5 with BM25 scoring, porter stemming
- Semantic search — cosine similarity over embedding vectors (Ollama, OpenAI, or Sentence Transformers)
- Hybrid search — Reciprocal Rank Fusion combining FTS5 and vector results
- Frontmatter-aware — indexes YAML frontmatter fields, supports required field enforcement
- Incremental reindexing — hash-based change detection, only re-processes modified files
- Write operations — create, edit, delete, rename documents with automatic index updates
- Attachment support — read, write, delete, and list non-markdown files (PDFs, images, etc.)
- Git integration — optional auto-commit and push on every write via GIT_ASKPASS
- OIDC authentication — optional token-based auth for HTTP deployments (Authelia, Keycloak, etc.)
- MCP tools — 13 tools including search, read, write, edit, delete, rename, and admin operations
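The hybrid search bullet above names Reciprocal Rank Fusion. As a rough illustration of the technique (not the project's actual code), the sketch below fuses two ranked lists of paths. The function name, the sample paths, and the constant k=60 are assumptions chosen for the example; 60 is merely a commonly used value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document paths into one ranking.

    Each document scores 1 / (k + rank) per list it appears in.
    k=60 is an assumed, conventional constant, not taken from the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by path for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["notes/a.md", "notes/b.md", "notes/c.md"]
semantic_hits = ["notes/b.md", "notes/d.md", "notes/a.md"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Documents that rank well in both lists rise to the top, while a document found by only one method still appears in the fused output.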
Installation
From PyPI
pip install markdown-vault-mcp
With optional dependencies:
pip install markdown-vault-mcp[mcp] # FastMCP server
pip install markdown-vault-mcp[embeddings-api] # Ollama/OpenAI embeddings via HTTP
pip install markdown-vault-mcp[all] # MCP + API embeddings (lightweight, no PyTorch)
pip install markdown-vault-mcp[all-local] # + sentence-transformers + PyTorch (large)
[all] vs [all-local]: The [all] extra is lightweight and does not include sentence-transformers or PyTorch. Use [all-local] if you want local CPU/GPU embeddings without Ollama. The Docker image uses [all].
From source
git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
pip install -e ".[all,dev]"
Docker
docker pull ghcr.io/pvliesdonk/markdown-vault-mcp:latest
The Docker image uses [all] (MCP + API embeddings). It does not include sentence-transformers or PyTorch — use Ollama or OpenAI for embeddings. For local sentence-transformers, build from source with [all-local].
Quick Start
As a library
from pathlib import Path
from markdown_vault_mcp import Collection
collection = Collection(source_dir=Path("/path/to/vault"))
results = collection.search("query text", limit=10)
As an MCP server
export MARKDOWN_VAULT_MCP_SOURCE_DIR=/path/to/vault
markdown-vault-mcp serve
With Docker Compose
1. Copy an example env file:
cp examples/obsidian-readonly.env .env
2. Edit .env to set MARKDOWN_VAULT_MCP_SOURCE_DIR to the absolute path of your vault on the host.
3. Start the service:
docker compose up -d
4. Check the logs:
docker compose logs -f markdown-vault-mcp
Example env files
| File | Description |
|---|---|
| examples/obsidian-readonly.env | Obsidian vault, read-only, Ollama embeddings |
| examples/obsidian-readwrite.env | Obsidian vault, read-write with git auto-commit |
| examples/obsidian-oidc.env | Obsidian vault, read-only, OIDC authentication (Authelia) |
| examples/ifcraftcorpus.env | Strict frontmatter enforcement, read-only corpus |
For reverse proxy (Traefik) and deployment setup, see docs/deployment.md.
Configuration
All configuration is via environment variables with the MARKDOWN_VAULT_MCP_ prefix (except embedding provider settings, which use their own conventions).
Core
| Variable | Default | Required | Description |
|---|---|---|---|
| MARKDOWN_VAULT_MCP_SOURCE_DIR | — | Yes | Path to the markdown vault directory |
| MARKDOWN_VAULT_MCP_READ_ONLY | true | No | Set to false to enable write operations |
| MARKDOWN_VAULT_MCP_INDEX_PATH | in-memory | No | Path to the SQLite FTS5 index file; set for persistence across restarts |
| MARKDOWN_VAULT_MCP_EMBEDDINGS_PATH | disabled | No | Path to the numpy embeddings file; required to enable semantic search |
| MARKDOWN_VAULT_MCP_STATE_PATH | {SOURCE_DIR}/.markdown_vault_mcp/state.json | No | Path to the change-tracking state file |
| MARKDOWN_VAULT_MCP_INDEXED_FIELDS | — | No | Comma-separated frontmatter fields to promote to the tag index for structured filtering |
| MARKDOWN_VAULT_MCP_REQUIRED_FIELDS | — | No | Comma-separated frontmatter fields required on every document; documents missing any are excluded from the index |
| MARKDOWN_VAULT_MCP_EXCLUDE | — | No | Comma-separated glob patterns to exclude from scanning (e.g. .obsidian/**,.trash/**) |
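To make the table concrete, here is a minimal sketch of how a few of these variables could be read into a settings object. The class name, field names, and parsing rules (for example, treating anything other than "false" as read-only) are assumptions for illustration and may differ from the server's real behaviour.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional

PREFIX = "MARKDOWN_VAULT_MCP_"


def _csv(value):
    """Split a comma-separated value, dropping blanks and surrounding spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


@dataclass
class VaultSettings:  # hypothetical container, not the project's own class
    source_dir: str
    read_only: bool = True
    index_path: Optional[str] = None  # None stands for the in-memory default
    exclude: List[str] = field(default_factory=list)


def load_settings(env=None):
    """Build settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env
    source_dir = env.get(PREFIX + "SOURCE_DIR")
    if not source_dir:
        raise ValueError(PREFIX + "SOURCE_DIR is required")
    return VaultSettings(
        source_dir=source_dir,
        # Assumption: only the literal "false" (any case) enables writes.
        read_only=env.get(PREFIX + "READ_ONLY", "true").strip().lower() != "false",
        index_path=env.get(PREFIX + "INDEX_PATH"),
        exclude=_csv(env.get(PREFIX + "EXCLUDE", "")),
    )
```

Passing a plain dict instead of the real environment, as in `load_settings({"MARKDOWN_VAULT_MCP_SOURCE_DIR": "/vault"})`, keeps the sketch easy to try out.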
Server identity
| Variable | Default | Description |
|---|---|---|
| MARKDOWN_VAULT_MCP_SERVER_NAME | markdown-vault-mcp | MCP server name shown to clients; useful for multi-instance setups |
| MARKDOWN_VAULT_MCP_INSTRUCTIONS | (auto) | System-level instructions injected into LLM context; defaults to a description that reflects read-only vs read-write state |
Search and embeddings
| Variable | Default | Description |
|---|---|---|
| EMBEDDING_PROVIDER | auto-detect | Embedding provider: ollama, openai, or sentence-transformers (not MARKDOWN_VAULT_MCP_-prefixed) |
| OLLAMA_HOST | http://localhost:11434 | Ollama server URL (not MARKDOWN_VAULT_MCP_-prefixed) |
| OPENAI_API_KEY | — | OpenAI API key for the OpenAI embedding provider (not MARKDOWN_VAULT_MCP_-prefixed) |
| MARKDOWN_VAULT_MCP_OLLAMA_MODEL | nomic-embed-text | Ollama embedding model name |
| MARKDOWN_VAULT_MCP_OLLAMA_CPU_ONLY | false | Force Ollama to use CPU only |
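Semantic search is described earlier as cosine similarity over embedding vectors. The sketch below shows that ranking step on toy two-dimensional vectors in pure Python. The function names and sample vectors are invented for illustration; real embeddings come from the configured provider and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs, limit=10):
    """Return document paths ordered from most to least similar."""
    scored = [(cosine_similarity(query_vec, vec), path) for path, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [path for _, path in scored[:limit]]


# Toy vectors standing in for real embeddings.
doc_vecs = {"a.md": [1.0, 0.0], "b.md": [0.0, 1.0], "c.md": [1.0, 1.0]}
ranked = rank_by_similarity([1.0, 0.1], doc_vecs)
```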
Git integration
Enables auto-commit and push on every write. Requires MARKDOWN_VAULT_MCP_READ_ONLY=false.
| Variable | Default | Description |
|---|---|---|
| MARKDOWN_VAULT_MCP_GIT_TOKEN | — | GitHub/GitLab PAT; when set, every write triggers a git commit and deferred push via GIT_ASKPASS |
| MARKDOWN_VAULT_MCP_GIT_PUSH_DELAY_S | 30 | Seconds of write-idle time before pushing; 0 = push only on shutdown |
| MARKDOWN_VAULT_MCP_GIT_COMMIT_NAME | markdown-vault-mcp | Git committer name for auto-commits; set this in Docker where git config user.name is empty |
| MARKDOWN_VAULT_MCP_GIT_COMMIT_EMAIL | noreply@markdown-vault-mcp | Git committer email for auto-commits |
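The push delay behaves like a debounce: each write resets an idle timer, and a push becomes due only after the vault has been quiet for the configured number of seconds. The class below sketches that logic under assumed names; it does not run git and is not the server's implementation. The clock is injectable so the behaviour can be checked without waiting.

```python
import time


class PushDebouncer:  # hypothetical helper for illustration only
    def __init__(self, delay_s=30.0, clock=time.monotonic):
        self.delay_s = delay_s
        self._clock = clock
        self._last_write = None  # None means nothing is waiting to be pushed

    def record_write(self):
        """Call after every commit; restarts the idle window."""
        self._last_write = self._clock()

    def push_due(self):
        """True once delay_s seconds have passed since the last write."""
        if self._last_write is None or self.delay_s <= 0:
            # A delay of 0 means pushes happen only on shutdown.
            return False
        return self._clock() - self._last_write >= self.delay_s

    def mark_pushed(self):
        """Call after a successful push; clears the pending state."""
        self._last_write = None
```

A background loop could poll `push_due()` and call `mark_pushed()` after pushing, while a shutdown hook would push unconditionally if a write is still pending.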
Attachments
Non-markdown file support. See Attachments for details.
| Variable | Default | Description |
|---|---|---|
| MARKDOWN_VAULT_MCP_ATTACHMENT_EXTENSIONS | (built-in list) | Comma-separated allowed extensions without dot (e.g. pdf,png,jpg); use * to allow all non-.md files |
| MARKDOWN_VAULT_MCP_MAX_ATTACHMENT_SIZE_MB | 10.0 | Maximum attachment size in MB for reads and writes; 0 disables the limit |
OIDC authentication
Optional token-based authentication for HTTP deployments. OIDC activates when all four required variables are set. See Authentication for setup details.
| Variable | Required | Description |
|---|---|---|
| MARKDOWN_VAULT_MCP_BASE_URL | Yes | Public base URL of the server (e.g. https://mcp.example.com) |
| MARKDOWN_VAULT_MCP_OIDC_CONFIG_URL | Yes | OIDC discovery endpoint (e.g. https://auth.example.com/.well-known/openid-configuration) |
| MARKDOWN_VAULT_MCP_OIDC_CLIENT_ID | Yes | OIDC client ID registered with your provider |
| MARKDOWN_VAULT_MCP_OIDC_CLIENT_SECRET | Yes | OIDC client secret |
| MARKDOWN_VAULT_MCP_OIDC_JWT_SIGNING_KEY | No | JWT signing key; required on Linux/Docker, where the default is ephemeral and invalidates tokens on restart. Generate with openssl rand -hex 32 |
| MARKDOWN_VAULT_MCP_OIDC_AUDIENCE | No | Expected JWT audience claim; leave unset if your provider does not set one |
| MARKDOWN_VAULT_MCP_OIDC_REQUIRED_SCOPES | No | Comma-separated required scopes; default openid |
CLI Reference
markdown-vault-mcp <command> [options]
serve
Start the MCP server.
markdown-vault-mcp serve [--transport {stdio|sse|http}] [--host HOST] [--port PORT]
| Flag | Default | Description |
|---|---|---|
| --transport | stdio | MCP transport: stdio (stdin/stdout, default), sse (Server-Sent Events), http (streamable-HTTP). Use http for Docker with a reverse proxy or when OIDC is enabled. |
| --host | 0.0.0.0 | Bind host for the http transport (ignored for stdio and sse) |
| --port | 8000 | Port for the http transport (ignored for stdio and sse) |
index
Build the full-text search index.
markdown-vault-mcp index [--source-dir PATH] [--index-path PATH] [--force]
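For readers unfamiliar with FTS5, the sketch below shows the general shape of a porter-stemmed index with BM25 ordering, using Python's built-in sqlite3 module. The table name, columns, and sample rows are assumptions for illustration and not the project's schema. It needs a SQLite build with FTS5 enabled, which most Python distributions provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The porter tokenizer stems words, so "commits" and "commit" match each other.
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(path UNINDEXED, body, tokenize='porter')"
)
conn.executemany(
    "INSERT INTO docs (path, body) VALUES (?, ?)",
    [
        ("notes/search.md", "Incremental reindexing keeps the index fresh"),
        ("notes/git.md", "Every write triggers a commit"),
    ],
)


def keyword_search(conn, query, limit=10):
    """Return matching paths, best BM25 match first."""
    rows = conn.execute(
        # bm25() returns lower values for better matches, so ascending order works.
        "SELECT path FROM docs WHERE docs MATCH ? ORDER BY bm25(docs) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [path for (path,) in rows]


hits = keyword_search(conn, "commits")
```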
search
Search the collection from the CLI.
markdown-vault-mcp search <query> [-n LIMIT] [-m {keyword|semantic|hybrid}] [--folder PATH] [--json]
reindex
Incrementally reindex the vault (only processes changed files).
markdown-vault-mcp reindex [--source-dir PATH] [--index-path PATH]
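Hash-based change detection can be pictured as comparing a stored mapping of path to content hash against the files currently on disk. The sketch below assumes SHA-256 and a plain dict as the state; the actual hash function and state file layout used by the project are not specified here.

```python
import hashlib
from pathlib import Path


def hash_file(path):
    """Content hash of one file (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def detect_changes(source_dir, previous):
    """Compare current .md files against a previous {relative_path: hash} mapping.

    Returns the new mapping plus sorted lists of added, modified, and removed paths.
    """
    root = Path(source_dir)
    current = {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*.md"))
    }
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    modified = sorted(
        path for path in current
        if path in previous and current[path] != previous[path]
    )
    return current, added, modified, removed
```

Only the paths in `added` and `modified` need re-processing, and those in `removed` can be dropped from the index, which is what keeps a reindex cheap when few files changed.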
MCP Tools
| Tool | Description |
|---|---|
| search | Hybrid full-text + semantic search with optional frontmatter filters |
| read | Read a document or attachment by relative path |
| write | Create or overwrite a document or attachment |
| edit | Replace a unique text span in a document (notes only) |
| delete | Delete a document or attachment and its index entries |
| rename | Rename/move a document or attachment, updating all index entries |
| list_documents | List indexed documents; pass include_attachments=true to also list non-markdown files |
| list_folders | List all folder paths in the vault |
| list_tags | List all unique frontmatter tag values |
| reindex | Force a full reindex of the vault |
| stats | Get collection statistics (document count, chunk count, etc.) |
| build_embeddings | Build or rebuild vector embeddings for semantic search |
| embeddings_status | Check embedding provider and index status |
Write tools (write, edit, delete, rename) are only available when MARKDOWN_VAULT_MCP_READ_ONLY=false.
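The edit tool is described as replacing a unique text span. One plausible reading, sketched below with a hypothetical helper, is that the replacement is refused when the span is missing or appears more than once, so an edit never lands in the wrong place. The error messages and exact rules are assumptions.

```python
def replace_unique_span(text, old, new):
    """Replace `old` with `new` only if `old` occurs exactly once in `text`."""
    if not old:
        raise ValueError("span to replace must not be empty")
    count = text.count(old)
    if count == 0:
        raise ValueError("span not found")
    if count > 1:
        raise ValueError(f"span is ambiguous: {count} matches")
    return text.replace(old, new, 1)


note = "# Intro\n\nStatus: draft\n"
updated = replace_unique_span(note, "Status: draft", "Status: final")
```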
Attachments
In addition to Markdown notes, the server can read, write, delete, rename, and list non-markdown files (PDFs, images, spreadsheets, etc.). All existing tools are overloaded — no new tool names.
How it works
Path dispatch is extension-based: a path ending in .md is treated as a note; any other path is treated as an attachment if the extension is in the allowlist. The kind field on returned objects distinguishes the two: "note" or "attachment".
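The dispatch rule can be sketched as a small classifier. The function name, the abbreviated allowlist, and the case-insensitive comparison are assumptions made for the example; returning None stands in for whatever rejection the server actually performs.

```python
from pathlib import PurePosixPath

# Abbreviated, illustrative allowlist; the real default list is longer.
EXAMPLE_EXTENSIONS = frozenset({"pdf", "png", "jpg"})


def classify_path(path, allowed=EXAMPLE_EXTENSIONS):
    """Return "note", "attachment", or None for a vault-relative path."""
    # Assumption: extensions are compared case-insensitively.
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    if suffix == "md":
        return "note"
    if "*" in allowed or suffix in allowed:
        return "attachment"
    return None  # extension not in the allowlist
```

For example, `classify_path("notes/intro.md")` yields "note" and `classify_path("assets/diagram.pdf")` yields "attachment", matching the kind values described above.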
Reading attachments
read returns base64-encoded content for binary attachments:
{
"path": "assets/diagram.pdf",
"mime_type": "application/pdf",
"size_bytes": 12345,
"content_base64": "<base64 string>",
"modified_at": 1741564800.0
}
Writing attachments
write accepts a content_base64 parameter for binary content:
{ "path": "assets/diagram.pdf", "content_base64": "<base64 string>" }
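On the client side, binary content has to be base64-encoded before a write and decoded after a read. The helpers below are a minimal sketch using the standard library; the function names are hypothetical, while the path and content_base64 keys follow the JSON shapes shown above.

```python
import base64


def build_write_payload(path, data):
    """Build write arguments for binary content given as bytes."""
    return {"path": path, "content_base64": base64.b64encode(data).decode("ascii")}


def decode_read_result(result):
    """Recover the raw bytes from a read result for an attachment."""
    return base64.b64decode(result["content_base64"])


payload = build_write_payload("assets/diagram.pdf", b"%PDF-1.7 example bytes")
round_tripped = decode_read_result(payload)
```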
Listing attachments
list_documents with include_attachments=true returns both notes and attachments:
[
{ "path": "notes/intro.md", "kind": "note", "title": "Intro", "folder": "notes", "frontmatter": {}, "modified_at": 1741564800.0 },
{ "path": "assets/diagram.pdf", "kind": "attachment", "folder": "assets", "mime_type": "application/pdf", "size_bytes": 12345, "modified_at": 1741564800.0 }
]
Default allowed extensions
pdf, docx, xlsx, pptx, odt, ods, odp, png, jpg, jpeg, gif, webp, svg, bmp, tiff, zip, tar, gz, mp3, mp4, wav, ogg, txt, csv, tsv, json, yaml, toml, xml, html, css, js, ts
Override with MARKDOWN_VAULT_MCP_ATTACHMENT_EXTENSIONS. Use * to allow all non-.md files.
Hidden directories: Attachments inside hidden directories (.git/, .obsidian/, .markdown_vault_mcp/, etc.) are never listed, regardless of extension settings. MARKDOWN_VAULT_MCP_EXCLUDE patterns are also applied to attachments.
Authentication
OIDC authentication is optional and activates automatically when all four required variables (BASE_URL, OIDC_CONFIG_URL, OIDC_CLIENT_ID, OIDC_CLIENT_SECRET) are set.
OIDC requires --transport http (or sse). It has no effect with --transport stdio.
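Because activation depends on all four variables being present, a partially configured deployment silently runs without authentication. A small pre-flight check like the sketch below can catch that before starting the server. The helper names are hypothetical, and treating an empty value as unset is an assumption.

```python
import os

PREFIX = "MARKDOWN_VAULT_MCP_"
REQUIRED_OIDC = ("BASE_URL", "OIDC_CONFIG_URL", "OIDC_CLIENT_ID", "OIDC_CLIENT_SECRET")


def missing_oidc_vars(env=None):
    """List the required OIDC variables that are unset or empty."""
    env = os.environ if env is None else env
    return [PREFIX + name for name in REQUIRED_OIDC if not env.get(PREFIX + name)]


def oidc_enabled(env=None):
    """True only when all four required variables are set."""
    return not missing_oidc_vars(env)
```

If `missing_oidc_vars()` returns between one and three names, the configuration is probably incomplete rather than intentionally unauthenticated.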
Setup with Authelia
Note: Authelia does not support Dynamic Client Registration (RFC 7591). Clients must be registered manually in configuration.yml.
1. Register the client in Authelia:
identity_providers:
  oidc:
    clients:
      - client_id: markdown-vault-mcp
        client_secret: '$pbkdf2-sha512$...'  # authelia crypto hash generate
        redirect_uris:
          - https://mcp.example.com/auth/callback
        grant_types: [authorization_code]
        response_types: [code]
        pkce_challenge_method: S256
        scopes: [openid, profile, email]
2. Set the environment variables (see also examples/obsidian-oidc.env):
MARKDOWN_VAULT_MCP_BASE_URL=https://mcp.example.com
MARKDOWN_VAULT_MCP_OIDC_CONFIG_URL=https://auth.example.com/.well-known/openid-configuration
MARKDOWN_VAULT_MCP_OIDC_CLIENT_ID=markdown-vault-mcp
MARKDOWN_VAULT_MCP_OIDC_CLIENT_SECRET=your-client-secret
MARKDOWN_VAULT_MCP_OIDC_JWT_SIGNING_KEY=$(openssl rand -hex 32)
3. Start with HTTP transport:
markdown-vault-mcp serve --transport http --port 8000
JWT signing key
The FastMCP default signing key is ephemeral (regenerated on startup), which forces clients to re-authenticate after every restart. Set MARKDOWN_VAULT_MCP_OIDC_JWT_SIGNING_KEY to a stable random secret to avoid this:
# Generate once, store in your .env file
openssl rand -hex 32
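If openssl is not available, Python's standard library can produce an equivalent value: 32 random bytes rendered as 64 hexadecimal characters. This is offered as an alternative sketch; the variable name is only illustrative.

```python
import secrets

# 32 cryptographically secure random bytes, hex-encoded (64 characters),
# the same shape of output as `openssl rand -hex 32`.
signing_key = secrets.token_hex(32)
print(f"MARKDOWN_VAULT_MCP_OIDC_JWT_SIGNING_KEY={signing_key}")
```

As with the openssl command, generate the key once and store it, since a fresh key on every start would defeat the purpose.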
Development
git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
uv pip install -e ".[all,dev]"
# Run tests
uv run python -m pytest tests/ -x -q
# Lint and format
ruff check src/ tests/
ruff format src/ tests/
# Type check
mypy src/
License
MIT