Generic markdown vault MCP server with FTS5 + semantic search
markdown-vault-mcp
A generic markdown collection MCP server with FTS5 full-text search, semantic vector search, frontmatter-aware indexing, and incremental reindexing.
Point it at a directory of Markdown files (an Obsidian vault, a docs folder, a Zettelkasten) and it exposes search, read, write, and edit tools over the Model Context Protocol.
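As a rough illustration of the full-text side, the sketch below builds a tiny FTS5 index with Python's standard `sqlite3` module and ranks matches with BM25. The table name, column names, sample documents, and the `fts_search` helper are all hypothetical and are not taken from this project's code; it also assumes the local SQLite build ships with the FTS5 extension.

```python
import sqlite3

# In-memory database; assumes the local SQLite build includes FTS5.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: the path is stored but not indexed, and the body
# is tokenized with the porter stemmer.
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(path UNINDEXED, body, tokenize='porter')"
)
conn.executemany(
    "INSERT INTO docs (path, body) VALUES (?, ?)",
    [
        ("notes/indexing.md", "Incremental reindexing only touches modified files."),
        ("notes/cooking.md", "A recipe for sourdough bread."),
    ],
)


def fts_search(query: str, limit: int = 10) -> list[str]:
    """Return matching document paths, best BM25 match first."""
    rows = conn.execute(
        # bm25() returns lower values for better matches, so ascending order
        # puts the most relevant document first.
        "SELECT path FROM docs WHERE docs MATCH ? ORDER BY bm25(docs) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [path for (path,) in rows]


# Porter stemming lets the singular query "file" match the stored word "files".
matches = fts_search("file")
```

Stemming is why a query does not have to repeat the exact word form used in a note.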
Features
- Full-text search — SQLite FTS5 with BM25 scoring, porter stemming
- Semantic search — cosine similarity over embedding vectors (Ollama, OpenAI, or Sentence Transformers)
- Hybrid search — Reciprocal Rank Fusion combining FTS5 and vector results
- Frontmatter-aware — indexes YAML frontmatter fields, supports required field enforcement
- Incremental reindexing — hash-based change detection, only re-processes modified files
- Write operations — create, edit, delete, rename documents with automatic index updates
- Git integration — optional auto-commit and push on every write via GIT_ASKPASS
- MCP tools — 13 tools including search, read, write, edit, delete, rename, and admin operations
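The hybrid search feature above names Reciprocal Rank Fusion. A minimal sketch of that technique follows, assuming each backend returns a list of document paths ordered best first. The function name, the sample paths, and the constant `k = 60` (a commonly used default for RRF) are assumptions for illustration; the project's own implementation and constant may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids, each ordered best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns 1 / (k + rank) from every list it appears in.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the full-text and vector backends.
fts_hits = ["a.md", "b.md", "c.md"]
vector_hits = ["b.md", "d.md", "a.md"]
fused = reciprocal_rank_fusion([fts_hits, vector_hits])
```

Because only ranks are used, the fusion does not need BM25 scores and cosine similarities to be on a comparable scale.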
Installation
From PyPI
pip install markdown-vault-mcp
With optional dependencies:
pip install "markdown-vault-mcp[mcp]" # FastMCP server
pip install "markdown-vault-mcp[embeddings-api]" # Ollama/OpenAI embeddings via HTTP
pip install "markdown-vault-mcp[all]" # MCP + API embeddings (lightweight)
pip install "markdown-vault-mcp[all-local]" # + sentence-transformers (large, GPU)
From source
git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
pip install -e ".[all,dev]"
Docker
The Docker image uses the [all] extra (MCP + API embeddings) and does not include sentence-transformers or PyTorch, keeping it lightweight. Use Ollama or OpenAI for embeddings.
docker pull ghcr.io/pvliesdonk/markdown-vault-mcp:latest
Quick Start
As a library
from pathlib import Path
from markdown_vault_mcp import Collection
collection = Collection(source_dir=Path("/path/to/vault"))
results = collection.search("query text", limit=10)
As an MCP server
Set the required environment variable and start the server:
export MARKDOWN_VAULT_MCP_SOURCE_DIR=/path/to/vault
markdown-vault-mcp serve
With Docker Compose
1. Copy an example env file:
   cp examples/obsidian-readonly.env .env
2. Edit .env to set MARKDOWN_VAULT_MCP_SOURCE_DIR to the absolute path of your vault on the host.
3. Start the service:
   docker compose up -d
4. Check the logs:
   docker compose logs -f markdown-vault-mcp
Example env files
| File | Description |
|---|---|
| examples/obsidian-readonly.env | Obsidian vault, read-only, Ollama embeddings |
| examples/obsidian-readwrite.env | Obsidian vault, read-write with git auto-commit |
| examples/ifcraftcorpus.env | Strict frontmatter enforcement, read-only corpus |
For reverse proxy (Traefik) and authentication (mcp-auth-proxy) setup, see docs/deployment.md.
Configuration
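All configuration is via environment variables. Most use the MARKDOWN_VAULT_MCP_ prefix; the embedding provider variables (EMBEDDING_PROVIDER, OLLAMA_HOST, OPENAI_API_KEY) are not prefixed.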
| Variable | Default | Required | Description |
|---|---|---|---|
| MARKDOWN_VAULT_MCP_SERVER_NAME | markdown-vault-mcp | No | MCP server name shown to clients (useful for multi-instance setups) |
| MARKDOWN_VAULT_MCP_INSTRUCTIONS | generic description | No | System-level instructions injected into LLM context |
| MARKDOWN_VAULT_MCP_SOURCE_DIR | — | Yes | Path to the markdown vault directory |
| MARKDOWN_VAULT_MCP_READ_ONLY | true | No | Set to false to enable write operations |
| MARKDOWN_VAULT_MCP_INDEX_PATH | in-memory | No | Path to the SQLite FTS5 index file (set for persistence across restarts) |
| MARKDOWN_VAULT_MCP_EMBEDDINGS_PATH | disabled | No | Path to the numpy embeddings file (required to enable semantic search) |
| MARKDOWN_VAULT_MCP_STATE_PATH | {SOURCE_DIR}/.markdown_vault_mcp/state.json | No | Path to the change-tracking state file |
| MARKDOWN_VAULT_MCP_INDEXED_FIELDS | — | No | Comma-separated frontmatter fields to index in FTS5 |
| MARKDOWN_VAULT_MCP_REQUIRED_FIELDS | — | No | Comma-separated frontmatter fields required on every document |
| MARKDOWN_VAULT_MCP_EXCLUDE | — | No | Comma-separated glob patterns to exclude (e.g. .obsidian/**,.trash/**) |
| MARKDOWN_VAULT_MCP_GIT_TOKEN | — | No | GitHub PAT for auto-commit and push on writes (via GIT_ASKPASS) |
| MARKDOWN_VAULT_MCP_GIT_PUSH_DELAY_S | 30 | No | Seconds of idle before pushing (0 = push only on shutdown) |
| MARKDOWN_VAULT_MCP_OLLAMA_MODEL | nomic-embed-text | No | Ollama embedding model name |
| MARKDOWN_VAULT_MCP_OLLAMA_CPU_ONLY | false | No | Force Ollama to use CPU only |
| EMBEDDING_PROVIDER | auto-detect | No | Embedding provider: ollama, openai, or sentence-transformers (not prefixed) |
| OLLAMA_HOST | http://localhost:11434 | No | Ollama server URL (not prefixed) |
| OPENAI_API_KEY | — | No | OpenAI API key for OpenAI embedding provider (not prefixed) |
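To make the table's conventions concrete, here is a small sketch of how a handful of these variables could be read, with the documented defaults applied. The `read_settings` helper and its return shape are hypothetical and are not the server's actual loader; in particular, which spellings count as a boolean false is an assumption.

```python
PREFIX = "MARKDOWN_VAULT_MCP_"


def read_settings(env: dict[str, str]) -> dict[str, object]:
    """Read a few documented variables from a mapping such as dict(os.environ)."""

    def csv(name: str) -> list[str]:
        # Comma-separated values; surrounding whitespace and empty items dropped.
        raw = env.get(PREFIX + name, "")
        return [item.strip() for item in raw.split(",") if item.strip()]

    source_dir = env.get(PREFIX + "SOURCE_DIR")
    if not source_dir:
        # SOURCE_DIR is the only variable the table marks as required.
        raise ValueError(PREFIX + "SOURCE_DIR must be set")

    return {
        "source_dir": source_dir,
        # Documented default is true; treating only "false" as false is an assumption.
        "read_only": env.get(PREFIX + "READ_ONLY", "true").strip().lower() != "false",
        "indexed_fields": csv("INDEXED_FIELDS"),
        "required_fields": csv("REQUIRED_FIELDS"),
        "exclude": csv("EXCLUDE"),
        # Documented default is 30 seconds.
        "git_push_delay_s": int(env.get(PREFIX + "GIT_PUSH_DELAY_S", "30")),
    }


settings = read_settings(
    {
        "MARKDOWN_VAULT_MCP_SOURCE_DIR": "/path/to/vault",
        "MARKDOWN_VAULT_MCP_EXCLUDE": ".obsidian/**, .trash/**",
    }
)
```

With only the source directory and an exclude list set, every other value falls back to the default listed in the table.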
MCP Tools
| Tool | Description |
|---|---|
| search | Hybrid full-text + semantic search with optional frontmatter filters |
| read | Read a document's content by relative path |
| write | Create or overwrite a document (with optional frontmatter) |
| edit | Replace a unique text span in a document |
| delete | Delete a document and its index entries |
| rename | Rename/move a document, updating all index entries |
| list_documents | List all indexed document paths (with optional folder and glob pattern filter) |
| list_folders | List all folder paths in the vault |
| list_tags | List all unique frontmatter tag values |
| reindex | Force a full reindex of the vault |
| stats | Get collection statistics (document count, chunk count, etc.) |
| build_embeddings | Build or rebuild vector embeddings for semantic search |
| embeddings_status | Check embedding provider and index status |
Write tools (write, edit, delete, rename) are only available when MARKDOWN_VAULT_MCP_READ_ONLY=false.
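The edit tool is described as replacing a unique text span. One plausible reading of that contract is sketched below: the edit is refused unless the span occurs exactly once, so an ambiguous edit cannot silently change the wrong place. The `replace_unique` function is hypothetical and is not the tool's actual implementation or signature.

```python
def replace_unique(text: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if `old` occurs exactly once in `text`."""
    if not old:
        raise ValueError("the span to replace must not be empty")
    # str.count reports non-overlapping occurrences of the span.
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one occurrence of the span, found {count}")
    return text.replace(old, new, 1)


document = "# Title\n\nold text\n"
updated = replace_unique(document, "old text", "new text")
```

A caller that hits the error can retry with a longer span that includes more surrounding context.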
Development
git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
uv pip install -e ".[all,dev]"
# Run tests
uv run python -m pytest tests/ -x -q
# Lint and format
ruff check src/ tests/
ruff format src/ tests/
# Type check
mypy src/
License
MIT