DuckDB-backed MCP memory server for Obsidian vaults — structured search, read, and write access for AI coding agents.
DuckBrain
DuckDB-backed MCP memory server for Obsidian vaults. Gives AI coding agents structured read and write access to your personal wiki — with full-text search, frontmatter-aware indexing, and automatic index/log updates. Built on the principle that your vault filesystem should be the single source of truth, not a database hidden behind an API.
What it solves
Existing agent memory tools (MemSearch, Open Brain, Mem0, Supermemory) treat memory as unstructured text blobs. If you maintain a Karpathy-style LLM wiki in Obsidian with typed pages (entities, concepts, sources, synthesis), YAML frontmatter, tags, and wikilinks — none of those tools understand your vault's structure.
DuckBrain fills that gap. It reads your vault as-is and writes new pages following your vault's schema, so your wiki stays a single source of truth on the filesystem.
How it works (Architecture)
┌──────────────────┐    MCP stdio     ┌─────────────────────────────────┐
│ AI Agent         │ ◄──────────────► │ DuckBrain MCP Server            │
│                  │                  │                                 │
│ Claude Code      │                  │ vault_info   ──┐                │
│ OpenCode         │                  │ vault_search ──┤ DuckDB FTS     │
│ Cursor           │                  │ vault_read   ──┤ Filesystem     │
│ Hermes           │                  │ vault_write  ──┘ Filesystem     │
└──────────────────┘                  └────────┬────────┬───────────────┘
                                               │        │
    query     ┌────────────────────────────────┘        └── read/write ──┐
 (full index) ▼                                                          ▼ (single file)
┌──────────────────────┐                          ┌───────────────────────────┐
│ DuckDB (in-memory)   │                          │ Your Obsidian Vault       │
│                      │                          │                           │
│ pages (in-memory     │   rebuilt from scratch   │ wiki/entities/            │
│ rebuilt every search)│      on every query      │ wiki/concepts/            │
│ ┌───────────────┐    │                          │ wiki/sources/             │
│ │ filepath      │    │                          │ wiki/synthesis/           │
│ │ title         │    │                          │ daily/                    │
│ │ kind          │    │                          │ wiki/index.md             │
│ │ tags          │    │                          │ wiki/log.md               │
│ │ body          │    │                          │                           │
│ │ created       │    │                          │ plain markdown on disk    │
│ │ updated       │    │                          │                           │
│ └───────────────┘    │                          │                           │
│                      │                          │                           │
│ BM25 search query:   │                          │                           │
│ SELECT ...           │                          │                           │
│ FROM pages p         │                          │                           │
│ WHERE fts_match_bm25 │                          │                           │
│   (p.filepath,       │                          │                           │
│    'segfault')       │                          │                           │
│ AND kind='concept'   │                          │                           │
│ ORDER BY score DESC  │                          │                           │
└──────────────────────┘                          └───────────────────────────┘
- Reads your vault files directly — no index to sync, no watchers, no duplicate storage
- Searches via DuckDB full-text search (BM25 ranking), rebuilt fresh from disk on every query
- Writes new pages with correct YAML frontmatter, auto-updating your index and log
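The "rebuilt fresh from disk" step can be pictured as a plain directory scan. The sketch below is not DuckBrain's actual code: `parse_page`, `load_pages`, and `KIND_BY_DIR` are hypothetical names, and the frontmatter parser is a simplification that handles only flat `key: value` pairs and inline `[a, b]` lists rather than full YAML. It produces one record per page with the columns listed for the pages table above.

```python
from pathlib import Path

# Assumed mapping from wiki subdirectory name to page kind.
KIND_BY_DIR = {
    "entities": "entity",
    "concepts": "concept",
    "sources": "source",
    "synthesis": "synthesis",
}


def parse_page(text: str) -> tuple[dict, str]:
    """Split a markdown file into (frontmatter dict, body).

    Simplified: flat `key: value` pairs and inline `[a, b]` lists only.
    """
    meta: dict = {}
    if not text.startswith("---\n"):
        return meta, text
    end = text.find("\n---", 3)  # closing delimiter of the frontmatter
    if end == -1:
        return meta, text
    for line in text[4:end].splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            items = value[1:-1].split(",")
            meta[key.strip()] = [i.strip().strip("'\"") for i in items if i.strip()]
        else:
            meta[key.strip()] = value.strip("'\"")
    return meta, text[end + 4:].lstrip("\n")


def load_pages(vault: Path) -> list[dict]:
    """Read every typed wiki page under vault/wiki into an in-memory record."""
    pages = []
    for path in sorted((vault / "wiki").rglob("*.md")):
        kind = KIND_BY_DIR.get(path.parent.name)
        if kind is None:
            continue  # skips wiki/index.md, wiki/log.md, and unknown folders
        meta, body = parse_page(path.read_text(encoding="utf-8"))
        pages.append({
            "filepath": path.relative_to(vault).as_posix(),
            "title": meta.get("title", path.stem),
            "kind": meta.get("item-type", kind),
            "tags": meta.get("tags", []),
            "body": body,
            "created": meta.get("created", ""),
            "updated": meta.get("updated", ""),
        })
    return pages
```

Because nothing is cached between calls, a page edited in Obsidian is visible to the next search without any sync step, at the cost of re-reading every file each time.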
Requirements
- Python 3.10+
- uv (package manager)
- An Obsidian vault structured with a wiki/ directory containing:
  - wiki/entities/: people, orgs, products, tools
  - wiki/concepts/: ideas, frameworks, theories
  - wiki/sources/: one summary per ingested source
  - wiki/synthesis/: cross-cutting analysis
  - wiki/index.md: page catalog with ## Entities, ## Concepts, ## Sources, ## Synthesis sections
  - wiki/log.md: append-only chronological record
- Pages should use YAML frontmatter: title, item-type, tags, created, updated
This follows the schema defined for LLM wikis. If your vault uses a different structure, DuckBrain works with it — but index/log updates expect the section headers above.
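To see whether a vault matches the layout above before pointing an agent at it, a small check along these lines may help. It is a sketch only: `check_vault` and the three constants are hypothetical and not part of DuckBrain.

```python
from pathlib import Path

EXPECTED_DIRS = ["wiki/entities", "wiki/concepts", "wiki/sources", "wiki/synthesis"]
EXPECTED_FILES = ["wiki/index.md", "wiki/log.md"]
EXPECTED_SECTIONS = ["## Entities", "## Concepts", "## Sources", "## Synthesis"]


def check_vault(vault: Path) -> list[str]:
    """Return human-readable problems; an empty list means the layout matches."""
    problems = []
    for rel in EXPECTED_DIRS:
        if not (vault / rel).is_dir():
            problems.append(f"missing directory: {rel}")
    for rel in EXPECTED_FILES:
        if not (vault / rel).is_file():
            problems.append(f"missing file: {rel}")
    index = vault / "wiki" / "index.md"
    if index.is_file():
        # Compare whole lines so "## Sources" does not match "## Sources (old)".
        lines = {line.strip() for line in index.read_text(encoding="utf-8").splitlines()}
        for header in EXPECTED_SECTIONS:
            if header not in lines:
                problems.append(f"wiki/index.md lacks section: {header}")
    return problems
```

Calling `check_vault(Path("/path/to/your/obsidian/vault"))` and printing the result gives a quick list of what the index and log updates would not find.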
Quick Start
pip install duckbrain
That's it. Now connect your AI agent (see below) — you don't run DuckBrain yourself, the agent spawns it as needed.
(Optional: verify the install by running duckbrain — it'll fail with "VAULT_PATH not set", which confirms it's working.)
Installing from source (for contributors)
git clone https://github.com/timhiebenthal/duckbrain.git
cd duckbrain
uv sync # installs project + dev dependencies in a virtual environment
This requires uv (the Python package manager used for development). End users should use pip install duckbrain above.
(Optional: to verify the install, run VAULT_PATH="/path/to/your/vault" uv run duckbrain. It will appear to hang — that's correct, it's waiting on stdio. Press Ctrl+C to stop.)
Connecting to Agents
MCP stdio transport means the agent spawns DuckBrain as a child process when it starts. You don't need a separate terminal or a running server. Just add this to your MCP config:
{
  "duckbrain": {
    "command": "uv",
    "args": ["run", "duckbrain"],
    "env": {
      "VAULT_PATH": "/path/to/your/obsidian/vault"
    }
  }
}
Where to put it:
| Agent | Config file | Top-level key |
|---|---|---|
| Claude Code | ~/.claude/claude_desktop_config.json or .mcp.json | mcpServers |
| OpenCode | opencode.json | mcp |
| Cursor | .cursor/mcp.json | mcpServers |
| Hermes Agent | mcp.json | mcpServers |
Example for Claude Code:
{
  "mcpServers": {
    "duckbrain": {
      "command": "uv",
      "args": ["run", "duckbrain"],
      "env": {
        "VAULT_PATH": "/path/to/your/obsidian/vault"
      }
    }
  }
}
Tip: Instead of hardcoding the path in every config, set VAULT_PATH once in your shell profile (~/.bashrc, ~/.zshrc, or ~/.config/fish/config.fish) and reference it in the config with your agent's env-var syntax:
- OpenCode: "VAULT_PATH": "{env:VAULT_PATH}"
- Claude Code: "VAULT_PATH": "${env:VAULT_PATH}"
Make sure uv is on your PATH.
Auto-Writing Session Learnings
There are two ways to make your agent write learnings to the vault: instructions (works everywhere) or hooks (automatic, agent-native).
Approach 1: Instructions (all agents)
Add this to the appropriate instructions file. The agent reads it on startup and follows it during the session. Tested with OpenCode.
Claude Code — add to CLAUDE.md:
## Session Learnings
After debugging, diving into rabbit holes, or completing significant work,
save what you learned so you don't repeat mistakes:
- Use vault_write(kind="daily", title="...", content="...", tags=["..."])
to append to today's daily note.
- For reusable knowledge, use vault_write(kind="concept", title="...",
content="...", tags=["..."]) to create a wiki page.
OpenCode — add to your config's instructions field (opencode.json):
"instructions": ["~/.config/opencode/LEARNINGS.md"]
Then create ~/.config/opencode/LEARNINGS.md (or wherever you prefer — any path the config can reach):
## Session Learnings
When you encounter problems, debug issues, or discover non-obvious solutions,
save the learning to the vault so it's available in future sessions:
- Append to today's daily note:
vault_write(kind="daily", title="short summary", content="what you learned", tags=["debugging", "learned"])
- For reusable concepts/patterns worth revisiting:
vault_write(kind="concept", title="Concept Name", content="explanation", tags=["relevant", "tags"])
Do this proactively — don't wait to be asked. A learning saved is a bug not repeated.
Cursor — add to .cursorrules:
## Session Learnings
After debugging or completing work, save learnings via DuckBrain:
- vault_write(kind="daily", title="<summary>", content="<details>", tags=[])
- Use kind="concept" for reusable knowledge.
Approach 2: Hooks (automatic, no prompt engineering needed)
Hooks run shell commands at specific lifecycle points — no instructions needed, they fire deterministically. ⚠️ Not tested with DuckBrain yet.
Claude Code — supports a full hooks system including SessionEnd (fires when a session terminates). Add to .claude/settings.json:
{
  "hooks": {
    "SessionEnd": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "duckbrain-save-session --transcript-from-stdin"
          }
        ]
      }
    ]
  }
}
The SessionEnd hook receives a JSON payload on stdin that includes transcript_path, the location of the session transcript. A wrapper script could read that transcript, pipe it through an LLM to extract learnings, then call vault_write. See agent-memory-mcp for a production example of this pattern.
Cursor — supports hooks including sessionEnd, postToolUse, and stop via .cursor/hooks.json. However, sessionEnd is not available in cloud agents (local IDE only), and MCP execution hooks (beforeMCPExecution/afterMCPExecution) are not yet wired for cloud agents. Usable for local development, not for cloud-based Cursor sessions.
.cursor/hooks.json (local IDE only):
{
  "hooks": {
    "stop": [
      {
        "type": "command",
        "command": "duckbrain-save-session --reason stop"
      }
    ]
  }
}
How It Works
During a session, the agent encounters a problem, debugs it, and resolves it:
> vault_search("duckbrain daily write")
> vault_read(filepath="wiki/...")
Agent debugs, fixes, learns something...
> vault_write(
kind="daily",
title="vault_write daily kind doesn't support filepath-based reads",
content="When vault_search returns filepaths, the agent may try to Read files
directly. vault_read should accept filepath as well as title to close this gap.",
tags=["duckbrain", "debugging", "learned"]
)
The learning is now in daily/2026-05-28.md. Tomorrow when you ask "how do I read vault pages by path?", the agent searches the vault, finds your note, and recalls the solution.
Tools
vault_info
Get a summary of your vault's structure.
> vault_info()
→ {
entities: 38,
concepts: 38,
sources: 33,
synthesis: 9,
available_tags: ["agent-memory", "ai", "duckdb", "mcp", ...],
last_modified: "2026-05-28"
}
No parameters. Useful for agents to discover what's in the vault before searching.
vault_search
Full-text search over all wiki pages.
> vault_search("agent memory", kind="concept")
→ [
{ title: "Agent Memory Systems", kind: "concept",
filepath: "wiki/concepts/agent-memory-systems.md",
snippet: "A 6-level taxonomy of Claude Code memory approaches..." },
...
]
Parameters:
- query (required): search text, BM25-ranked
- kind (optional): filter to entity, concept, source, synthesis, or daily
- tags (optional): filter by tag substring matches
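For intuition about how BM25 ranking and the two filters interact, here is a pure-Python sketch. DuckBrain delegates scoring to DuckDB's full-text search, so this is a swapped-in textbook Okapi BM25 (with the always-positive IDF variant), not the project's implementation, and its scores may differ from DuckDB's. `bm25_search` and `tokenize` are hypothetical names, and treating `tags` as "any tag contains any given substring" is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(pages, query, kind=None, tags=None, k1=1.2, b=0.75):
    """Rank page dicts (title, body, kind, tags) against query, best first."""
    docs = [tokenize(p["title"] + " " + p["body"]) for p in pages]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for d in docs for term in set(d))
    terms = tokenize(query)
    scored = []
    for page, doc in zip(pages, docs):
        if kind is not None and page["kind"] != kind:
            continue
        if tags and not any(
            wanted.lower() in tag.lower() for wanted in tags for tag in page["tags"]
        ):
            continue
        counts = Counter(doc)
        score = 0.0
        for term in terms:
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log((n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        if score > 0:
            scored.append((score, page))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [page for _, page in scored]
```

Pages that share no term with the query are dropped entirely, which is also why wording mismatches are a known weakness of keyword ranking.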
vault_read
Read a page by title or filepath. Returns full markdown content with metadata.
> vault_read(title="Agent Memory Systems")
→ {
title: "Agent Memory Systems", kind: "concept",
filepath: "wiki/concepts/agent-memory-systems.md",
content: "# Agent Memory Systems\n\nA 6-level taxonomy...",
tags: ["agent-memory", "taxonomy", "ai"],
created: "2026-05-28", updated: "2026-05-28"
}
Parameters:
- title (optional): page title to look up (case-insensitive)
- filepath (optional): relative path from vault_search results (e.g. wiki/concepts/foo.md)
Use after vault_search to get full page content. Pass filepath from search results directly.
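The two lookup modes can be sketched as follows. This is an illustration over in-memory page records, not DuckBrain's code; `find_page` is a hypothetical name, and giving `filepath` precedence when both arguments are passed is an assumption.

```python
from typing import Optional


def find_page(pages: list[dict], title: Optional[str] = None,
              filepath: Optional[str] = None) -> Optional[dict]:
    """Look up one page by exact relative filepath or case-insensitive title."""
    if title is None and filepath is None:
        raise ValueError("pass title or filepath")
    if filepath is not None:
        # Search results already carry the exact relative path.
        return next((p for p in pages if p["filepath"] == filepath), None)
    wanted = title.casefold()
    return next((p for p in pages if p["title"].casefold() == wanted), None)
```

Passing the filepath from a search hit avoids ambiguity when two pages happen to share a title.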
vault_write
Create a new wiki page or append to today's daily note, with automatic index and log updates.
> vault_write(
kind="concept",
title="DuckDB FTS Memory",
content="# DuckDB FTS Memory\n\nHow DuckDB serves as a memory layer...",
tags=["agent-memory", "duckdb"]
)
→ { success: true, filepath: "wiki/concepts/duckdb-fts-memory.md" }
For daily notes (session learnings, debugging logs):
> vault_write(
kind="daily",
title="Debugging vault_read filepath",
content="When search returns filepaths, agents try to Read files directly.",
tags=["duckbrain", "debugging"]
)
→ { success: true, filepath: "daily/2026-05-28.md" }
For wiki pages (entity|concept|source|synthesis), this automatically:
- Writes the markdown file to the correct wiki subdirectory
- Generates YAML frontmatter with title, item-type, tags, dates
- Appends an entry to wiki/index.md in the right section
- Appends a dated entry to wiki/log.md
For daily notes, this automatically:
- Appends to daily/YYYY-MM-DD.md (creates the file if today's doesn't exist yet)
- Writes no YAML frontmatter, just a ## heading + content
- Does NOT update index.md (daily notes aren't wiki pages)
- Appends a dated entry to wiki/log.md
Parameters:
- kind (required): entity, concept, source, synthesis, or daily
- title (required): page title (or section heading for daily entries)
- content (required): markdown body (without frontmatter)
- tags (required): list of tag strings
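The write path described above can be sketched in a few functions. Everything here is illustrative rather than DuckBrain's real code: `write_page`, `add_to_index`, `slugify`, and `SECTIONS` are hypothetical names, the index entry format (`- [[Title]]`) and the log line format are assumptions, and the frontmatter writer does not escape titles that contain YAML-special characters.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

# Assumed mapping from page kind to (wiki subdirectory, index section header).
SECTIONS = {
    "entity": ("entities", "## Entities"),
    "concept": ("concepts", "## Concepts"),
    "source": ("sources", "## Sources"),
    "synthesis": ("synthesis", "## Synthesis"),
}


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def add_to_index(index: Path, header: str, entry: str) -> None:
    """Insert entry as the last item of the section that starts with header."""
    lines = index.read_text(encoding="utf-8").splitlines() if index.exists() else []
    if header not in lines:
        lines += ["", header]
    start = lines.index(header) + 1
    end = next((i for i in range(start, len(lines)) if lines[i].startswith("## ")),
               len(lines))
    while end > start and not lines[end - 1].strip():
        end -= 1  # keep the new entry above the section's trailing blank lines
    lines.insert(end, entry)
    index.write_text("\n".join(lines) + "\n", encoding="utf-8")


def write_page(vault: Path, kind: str, title: str, content: str,
               tags: list[str], today: Optional[date] = None) -> dict:
    """Create a wiki page or append a daily entry, then log the write."""
    stamp = (today or date.today()).isoformat()
    if kind == "daily":
        rel = f"daily/{stamp}.md"
        path = vault / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as fh:  # creates the file if needed
            fh.write(f"## {title}\n\n{content}\n\n")
    else:
        subdir, header = SECTIONS[kind]
        rel = f"wiki/{subdir}/{slugify(title)}.md"
        path = vault / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        frontmatter = (
            "---\n"
            f"title: {title}\n"
            f"item-type: {kind}\n"
            f"tags: [{', '.join(tags)}]\n"
            f"created: {stamp}\n"
            f"updated: {stamp}\n"
            "---\n\n"
        )
        path.write_text(frontmatter + content + "\n", encoding="utf-8")
        add_to_index(vault / "wiki" / "index.md", header, f"- [[{title}]]")
    log = vault / "wiki" / "log.md"
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"- {stamp}: {kind} | {title} | {rel}\n")
    return {"success": True, "filepath": rel}
```

Note that this sketch overwrites an existing page with the same slug, which mirrors the "no page deduplication check" limitation listed further down.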
Vault Path
Set via the VAULT_PATH environment variable (or the env field in your MCP config — no need for both).
For local development, copy .env.example to .env and set your path:
VAULT_PATH=/path/to/your/obsidian/vault
If you use WSL2 with your vault on Windows, set it to the WSL mount path (e.g., /mnt/c/Users/you/Documents/obsidian/my-vault).
Performance
- FTS index rebuilt fresh from disk on every query — ~90 pages in under a second
- Write operations complete in <500ms
- Everything is in-memory — no persistent DuckDB database file
- Zero network calls, zero external services
Limitations (v1)
- No update or delete operations (only create)
- No vector embeddings or semantic search
- No page deduplication check before writing
- ~1s per search at current scale; at 500+ pages, incremental indexing would be needed
Under Consideration
Ideas we're exploring but not committing to yet — as we use the tool and understand what matters, some of these may get built. Open an issue to discuss.
- Temporal decay (recency bias) — boost search results from recently created or updated pages. Older knowledge fades unless explicitly referenced.
- Vector embeddings / semantic search — cover the ~20% recall gap that BM25 can't reach (concepts with different wording). Could integrate MemSearch or local embeddings.
- Update and delete operations — allow agents to edit or remove existing pages, not just create.
- Incremental indexing — INSERT single pages into the FTS index instead of full rebuild, keeping search fast at 500+ pages.
- Page deduplication — detect when a page with the same title already exists before writing.
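As one possible shape for the temporal decay idea in the list above, a relevance score could be multiplied by an exponential factor based on page age. This is a sketch of an unbuilt feature: `decayed_score` and the 90-day half-life are hypothetical choices, not a committed design.

```python
from datetime import date


def decayed_score(score: float, updated: date, today: date,
                  half_life_days: float = 90.0) -> float:
    """Halve the score for every half_life_days since the page was updated."""
    age_days = max((today - updated).days, 0)  # future dates count as fresh
    return score * 0.5 ** (age_days / half_life_days)
```

With this form a page updated today keeps its full score, while one untouched for two half-lives contributes a quarter of it, so older notes fade gradually rather than vanishing at a cutoff.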
Inspirations
This project stands on the shoulders of several ideas and tools:
- Andrej Karpathy's LLM wiki pattern — the idea that a personal markdown wiki, co-maintained by humans and AI agents, compounds into a persistent knowledge base. The vault schema (entities, concepts, sources, synthesis, daily log) is directly inspired by this.
- DuckDB — the embedded analytical database that makes full-text search over flat files viable without a server, index sync, or persistent storage. The decision to use in-memory FTS instead of a vector database was a deliberate trade-off for simplicity.
- Obsidian — the local-first, markdown-native note-taking tool that treats your files as the truth. DuckBrain exists because Obsidian vaults deserve tooling that respects the filesystem.
- MemSearch and Open Brain (OB1) — early experiments in cross-tool agent memory that demonstrated the need for structured vault write-back while choosing different architectures. Their strengths and gaps directly informed DuckBrain's design.
- Agent Memory Systems (6-level taxonomy) — Simon Scrapes' comprehensive comparison of Claude Code memory approaches provided the framework for understanding where DuckBrain fits in the ecosystem (Level 6: cross-tool MCP with dedicated server).
- trellis-datamodel — the same author's data modeling tool whose CI/CD patterns were borrowed for this project's repository readiness.
- mondayDB 3 — Solving HTAP for a Trillion-Table System — monday.com's engineering blog on their DuckDB-powered CQRS read serving layer at production scale. Proved that DuckDB in-process with per-tenant file isolation is a viable architecture — the same pattern DuckBrain applies at personal-wiki scale.
The core decision — build, don't integrate — came from a structured comparison of 7 existing tools. All failed on one requirement: vault schema-aware write-back. Rather than fork or extend, DuckBrain started from first principles: what's the simplest thing that gives agents structured read/write access to an Obsidian vault? The answer was DuckDB + MCP + ~500 lines of Python.
License
MIT