DuckDB-backed MCP memory server for Obsidian vaults — structured search, read, and write access for AI coding agents.

DuckBrain

DuckDB-backed MCP memory server for Obsidian vaults. Gives AI coding agents structured read and write access to your personal wiki — with full-text search, frontmatter-aware indexing, and automatic index/log updates. Built on the principle that your vault filesystem should be the single source of truth, not a database hidden behind an API.

What it solves

Existing agent memory tools (MemSearch, Open Brain, Mem0, Supermemory) treat memory as unstructured text blobs. If you maintain a Karpathy-style LLM wiki in Obsidian with typed pages (entities, concepts, sources, synthesis), YAML frontmatter, tags, and wikilinks — none of those tools understand your vault's structure.

DuckBrain fills that gap. It reads your vault as-is and writes new pages following your vault's schema, so your wiki stays a single source of truth on the filesystem.

How it works (Architecture)

┌──────────────────┐     MCP stdio     ┌─────────────────────────────────┐
│    AI Agent      │ ◄──────────────►  │      DuckBrain MCP Server       │
│                  │                   │                                 │
│  Claude Code     │                   │  vault_info  ──┐                │
│  OpenCode        │                   │  vault_search ─┤  DuckDB FTS    │
│  Cursor          │                   │  vault_read  ──┤  Filesystem    │
│  Hermes          │                   │  vault_write ──┘  Filesystem    │
└──────────────────┘                   └────────┬────────┬───────────────┘
                                                │        │
                    query ┌─────────────────────┘        └── read/write ──┐
             (full index) ▼                                               ▼  (single file)
         ┌──────────────────────┐                              ┌───────────────────────────┐
         │  DuckDB (in-memory)  │                              │    Your Obsidian Vault    │
         │                      │                              │                           │
         │  pages (in-memory    │    rebuilt from scratch      │  wiki/entities/           │
         │  rebuilt per search) │    on every query            │  wiki/concepts/           │
         │  ┌───────────────┐   │                              │  wiki/sources/            │
         │  │ filepath      │   │                              │  wiki/synthesis/          │
         │  │ title         │   │                              │  daily/                   │
         │  │ kind          │   │                              │  wiki/index.md            │
         │  │ tags          │   │                              │  wiki/log.md              │
         │  │ body          │   │                              │                           │
         │  │ created       │   │                              │  plain markdown on disk   │
         │  │ updated       │   │                              │                           │
         │  └───────────────┘   │                              │                           │
         │                      │                              │                           │
         │  BM25 search query:  │                              │                           │
         │  SELECT ...          │                              │                           │
         │  FROM pages p        │                              │                           │
         │  WHERE fts_match_bm25│                              │                           │
         │    (p.filepath,      │                              │                           │
         │     'segfault')      │                              │                           │
         │  AND kind='concept'  │                              │                           │
         │  ORDER BY score DESC │                              │                           │
         └──────────────────────┘                              └───────────────────────────┘
  • Reads your vault files directly — no index to sync, no watchers, no duplicate storage
  • Searches via DuckDB full-text search (BM25 ranking), rebuilt fresh from disk on every query
  • Writes new pages with correct YAML frontmatter, auto-updating your index and log
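
To make the ranking step concrete, here is a minimal pure-Python sketch of BM25 scoring in which the pages are re-tokenized on every call, mirroring the rebuild-per-query design described above. It is not DuckBrain's code: DuckBrain delegates this work to DuckDB's full-text search. The function names (tokenize, bm25_search), the k1 and b defaults, and the sample pages are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(pages: Dict[str, str], query: str,
                k1: float = 1.2, b: float = 0.75) -> List[Tuple[str, float]]:
    """Rank pages (filepath -> body) against a query.

    Nothing is cached: the token lists and document frequencies are
    recomputed from the given pages on every call.
    """
    docs = {path: tokenize(body) for path, body in pages.items()}
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs

    # Document frequency: in how many pages does each term occur?
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    query_terms = tokenize(query)
    ranked = []
    for path, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        if score > 0:
            ranked.append((path, score))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical sample pages, keyed by vault-relative filepath.
sample_pages = {
    "wiki/concepts/segfaults.md": "A segfault happens when a process touches invalid memory. Segfault debugging tips.",
    "wiki/concepts/duckdb.md": "DuckDB is an embedded analytical database.",
    "daily/2026-05-28.md": "Saw one segfault today while testing DuckDB bindings and many other unrelated things happened too.",
}
hits = bm25_search(sample_pages, "segfault")
for path, score in hits:
    print(path, round(score, 3))
```

The short page that mentions the term twice ranks above the longer daily note that mentions it once, which is the term-frequency and length normalization BM25 is known for.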

Requirements

  • Python 3.10+
  • uv (package manager)
  • An Obsidian vault structured with a wiki/ directory containing:
    • wiki/entities/ — people, orgs, products, tools
    • wiki/concepts/ — ideas, frameworks, theories
    • wiki/sources/ — one summary per ingested source
    • wiki/synthesis/ — cross-cutting analysis
    • wiki/index.md — page catalog with ## Entities, ## Concepts, ## Sources, ## Synthesis sections
    • wiki/log.md — append-only chronological record
  • Pages should use YAML frontmatter: title, item-type, tags, created, updated

This follows the schema defined for LLM wikis. If your vault uses a different structure, DuckBrain still works with it, but index/log updates expect the section headers listed above.
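
As an illustration of the frontmatter fields listed above, the following sketch renders a page with those five keys and parses it back. It uses only the standard library and handles just flat key: value lines and inline [a, b] lists; a real vault may need a full YAML parser. The helper names (render_page, parse_frontmatter) and the inline tag-list format are assumptions, not DuckBrain's actual implementation.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple


def render_page(title: str, item_type: str, tags: List[str], body: str,
                today: Optional[str] = None) -> str:
    """Build a markdown page with title, item-type, tags, created, updated."""
    today = today or date.today().isoformat()
    return (
        "---\n"
        f"title: {title}\n"
        f"item-type: {item_type}\n"
        f"tags: [{', '.join(tags)}]\n"
        f"created: {today}\n"
        f"updated: {today}\n"
        "---\n\n"
        f"{body}\n"
    )


def parse_frontmatter(text: str) -> Tuple[Dict[str, object], str]:
    """Split a page into (metadata, body); returns ({}, text) if there is no frontmatter."""
    if not text.startswith("---\n"):
        return {}, text
    header, separator, body = text[4:].partition("\n---\n")
    if not separator:
        return {}, text
    meta: Dict[str, object] = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if not colon:
            continue  # not a key: value line
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [agent-memory, ai]
            meta[key.strip()] = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        else:
            meta[key.strip()] = value
    return meta, body.lstrip("\n")


page_text = render_page("Agent Memory Systems", "concept",
                        ["agent-memory", "ai"], "# Agent Memory Systems",
                        today="2026-05-28")
meta, body = parse_frontmatter(page_text)
print(meta["item-type"], meta["tags"])
```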

Quick Start

pip install duckbrain

That's it. Now connect your AI agent (see below). You don't run DuckBrain yourself; the agent spawns it as needed.

(Optional: verify the install by running duckbrain — it'll fail with "VAULT_PATH not set", which confirms it's working.)

Installing from source (for contributors)

git clone https://github.com/timhiebenthal/duckbrain.git
cd duckbrain
uv sync         # installs project + dev dependencies in a virtual environment

This requires uv (the Python package manager used for development). End users should use pip install duckbrain above.

(Optional: to verify the install, run VAULT_PATH="/path/to/your/vault" uv run duckbrain. It will appear to hang — that's correct, it's waiting on stdio. Press Ctrl+C to stop.)

Connecting to Agents

MCP stdio transport means the agent spawns DuckBrain as a child process when it starts. You don't need a separate terminal or a running server. Just add this to your MCP config:

{
  "duckbrain": {
    "command": "uv",
    "args": ["run", "duckbrain"],
    "env": {
      "VAULT_PATH": "/path/to/your/obsidian/vault"
    }
  }
}

Where to put it:

| Agent | Config file | Top-level key |
|---|---|---|
| Claude Code | ~/.claude/claude_desktop_config.json or .mcp.json | mcpServers |
| OpenCode | opencode.json | mcp |
| Cursor | .cursor/mcp.json | mcpServers |
| Hermes Agent | mcp.json | mcpServers |

Example for Claude Code:

{
  "mcpServers": {
    "duckbrain": {
      "command": "uv",
      "args": ["run", "duckbrain"],
      "env": {
        "VAULT_PATH": "/path/to/your/obsidian/vault"
      }
    }
  }
}
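
If you would rather script the change than edit JSON by hand, the sketch below merges the duckbrain entry into an existing config dictionary under whichever top-level key your agent uses. The helper name add_duckbrain_server and the "other-tool" entry are hypothetical; loading and saving the actual config file is left out on purpose, since file locations differ per agent.

```python
import json
from typing import Any, Dict


def add_duckbrain_server(config: Dict[str, Any], top_level_key: str,
                         vault_path: str) -> Dict[str, Any]:
    """Return a copy of config with a duckbrain entry under top_level_key.

    Existing servers under that key are preserved; the input is not mutated.
    """
    updated = dict(config)
    servers = dict(updated.get(top_level_key, {}))
    servers["duckbrain"] = {
        "command": "uv",
        "args": ["run", "duckbrain"],
        "env": {"VAULT_PATH": vault_path},
    }
    updated[top_level_key] = servers
    return updated


# Hypothetical existing config that already registers another server.
existing = {"mcpServers": {"other-tool": {"command": "other"}}}
merged = add_duckbrain_server(existing, "mcpServers", "/path/to/your/obsidian/vault")
print(json.dumps(merged, indent=2))
```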

Tip: Instead of hardcoding the path in every config, set VAULT_PATH once in your shell profile (~/.bashrc, ~/.zshrc, or ~/.config/fish/config.fish) and reference it in the config with your agent's env-var syntax:

  • OpenCode: "VAULT_PATH": "{env:VAULT_PATH}"
  • Claude Code: "VAULT_PATH": "${VAULT_PATH}"

Make sure uv is on your PATH.

Auto-Writing Session Learnings

There are two ways to make your agent write learnings to the vault: instructions (works everywhere) or hooks (automatic, agent-native).

Approach 1: Instructions (all agents)

Add this to the appropriate instructions file. The agent reads it on startup and follows it during the session. Tested with OpenCode.

Claude Code — add to CLAUDE.md:

## Session Learnings

After debugging, diving into rabbit holes, or completing significant work,
save what you learned so you don't repeat mistakes:

- Use vault_write(kind="daily", title="...", content="...", tags=["..."])
  to append to today's daily note.
- For reusable knowledge, use vault_write(kind="concept", title="...",
  content="...", tags=["..."]) to create a wiki page.

OpenCode — copy the templates from this repo's opencode/ directory:

cp opencode/LEARNINGS.md ~/.config/opencode/LEARNINGS.md
cp opencode/commands/journal.md ~/.config/opencode/commands/journal.md

Then wire the instruction file into your opencode.json:

"instructions": ["/home/your-user/.config/opencode/LEARNINGS.md"]

The opencode/ directory includes:

  • LEARNINGS.md: pre-response learning guard, trigger table, session rituals, daily note template
  • commands/journal.md: /journal slash command to dump session progress + learnings
  • opencode.example.json: full config template with DuckBrain MCP wiring

See opencode/README.md for detailed setup instructions.

Cursor — add to .cursorrules:

## Session Learnings

After debugging or completing work, save learnings via DuckBrain:
- vault_write(kind="daily", title="<summary>", content="<details>", tags=[])
- Use kind="concept" for reusable knowledge.

Approach 2: Hooks (automatic, no prompt engineering needed)

Hooks run shell commands at specific lifecycle points — no instructions needed, they fire deterministically. ⚠️ Not tested with DuckBrain yet.

Claude Code — supports a full hooks system including SessionEnd (fires when a session terminates). Add to .claude/settings.json:

{
  "hooks": {
    "SessionEnd": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "duckbrain-save-session --transcript-from-stdin"
          }
        ]
      }
    ]
  }
}

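The SessionEnd hook receives a JSON payload on stdin that includes the path to the session transcript (transcript_path), not the transcript text itself. A wrapper script could read that file, pipe it through an LLM to extract learnings, then call vault_write. See agent-memory-mcp for a production example of this pattern.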

Cursor — supports hooks including sessionEnd, postToolUse, and stop via .cursor/hooks.json. However, sessionEnd is not available in cloud agents (local IDE only), and MCP execution hooks (beforeMCPExecution/afterMCPExecution) are not yet wired for cloud agents. Usable for local development, not for cloud-based Cursor sessions.

.cursor/hooks.json (local IDE only):

{
  "hooks": {
    "stop": [
      {
        "type": "command",
        "command": "duckbrain-save-session --reason stop"
      }
    ]
  }
}

How It Works

During a session, the agent encounters a problem, debugs it, and resolves it:

> vault_search("duckbrain daily write")
> vault_read(filepath="wiki/...")

Agent debugs, fixes, learns something...

> vault_write(
    kind="daily",
    title="vault_write daily kind doesn't support filepath-based reads",
    content="When vault_search returns filepaths, the agent may try to Read files
    directly. vault_read should accept filepath as well as title to close this gap.",
    tags=["duckbrain", "debugging", "learned"]
  )

The learning is now in daily/2026-05-28.md. Tomorrow when you ask "how do I read vault pages by path?", the agent searches the vault, finds your note, and recalls the solution.

Tools

vault_info

Get a summary of your vault's structure.

> vault_info()
→ {
    entities: 38,
    concepts: 38,
    sources: 33,
    synthesis: 9,
    available_tags: ["agent-memory", "ai", "duckdb", "mcp", ...],
    last_modified: "2026-05-28"
  }

No parameters. Useful for agents to discover what's in the vault before searching.

vault_search

Full-text search over all wiki pages.

> vault_search("agent memory", kind="concept")
→ [
    { title: "Agent Memory Systems", kind: "concept",
      filepath: "wiki/concepts/agent-memory-systems.md",
      snippet: "A 6-level taxonomy of Claude Code memory approaches..." },
    ...
  ]

Parameters:

  • query (required) — search text, BM25-ranked
  • kind (optional) — filter to entity, concept, source, synthesis, or daily
  • tags (optional) — filter by tag substring matches
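
The kind and tags filters can be pictured as a post-filter over ranked results, as in the sketch below. The function name filter_results and the sample data are hypothetical, and one detail is an assumption: when several tags are passed, this sketch requires every one of them to match as a case-insensitive substring of some page tag. The server may combine tags differently.

```python
from typing import Dict, List, Optional


def filter_results(results: List[Dict], kind: Optional[str] = None,
                   tags: Optional[List[str]] = None) -> List[Dict]:
    """Keep results whose kind matches exactly and whose tags contain every requested substring."""
    kept = []
    for page in results:
        if kind is not None and page["kind"] != kind:
            continue
        if tags and not all(
            any(wanted.lower() in have.lower() for have in page["tags"])
            for wanted in tags
        ):
            continue
        kept.append(page)
    return kept


# Hypothetical search results, already ordered by BM25 score.
search_results = [
    {"title": "Agent Memory Systems", "kind": "concept", "tags": ["agent-memory", "taxonomy", "ai"]},
    {"title": "DuckDB", "kind": "entity", "tags": ["duckdb", "database"]},
    {"title": "DuckDB FTS Memory", "kind": "concept", "tags": ["agent-memory", "duckdb"]},
]
concept_duck = filter_results(search_results, kind="concept", tags=["duck"])
print([page["title"] for page in concept_duck])
```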

vault_read

Read a page by title or filepath. Returns full markdown content with metadata.

> vault_read(title="Agent Memory Systems")
→ {
    title: "Agent Memory Systems", kind: "concept",
    filepath: "wiki/concepts/agent-memory-systems.md",
    content: "# Agent Memory Systems\n\nA 6-level taxonomy...",
    tags: ["agent-memory", "taxonomy", "ai"],
    created: "2026-05-28", updated: "2026-05-28"
  }

Parameters:

  • title (optional) — page title to look up (case-insensitive)
  • filepath (optional) — relative path from vault_search results (e.g. wiki/concepts/foo.md)

Use after vault_search to get full page content. Pass filepath from search results directly.
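
A minimal sketch of the two lookup modes follows. The function name find_page and the in-memory page list are hypothetical. The precedence (filepath wins when both are given) and the error raised when neither argument is passed are assumptions about reasonable behavior, not documented behavior of the server.

```python
from typing import Dict, List, Optional


def find_page(pages: List[Dict], title: Optional[str] = None,
              filepath: Optional[str] = None) -> Optional[Dict]:
    """Look a page up by exact filepath, or by case-insensitive title."""
    if filepath is not None:
        return next((p for p in pages if p["filepath"] == filepath), None)
    if title is not None:
        wanted = title.casefold()
        return next((p for p in pages if p["title"].casefold() == wanted), None)
    raise ValueError("provide title or filepath")


# Hypothetical in-memory view of the vault.
vault_pages = [
    {"title": "Agent Memory Systems", "filepath": "wiki/concepts/agent-memory-systems.md"},
    {"title": "DuckDB FTS Memory", "filepath": "wiki/concepts/duckdb-fts-memory.md"},
]
by_title = find_page(vault_pages, title="agent memory systems")
by_path = find_page(vault_pages, filepath="wiki/concepts/duckdb-fts-memory.md")
print(by_title["filepath"], by_path["title"])
```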

vault_write

Create a new wiki page or append to today's daily note, with automatic index and log updates.

> vault_write(
    kind="concept",
    title="DuckDB FTS Memory",
    content="# DuckDB FTS Memory\n\nHow DuckDB serves as a memory layer...",
    tags=["agent-memory", "duckdb"]
  )
→ { success: true, filepath: "wiki/concepts/duckdb-fts-memory.md" }

For daily notes (session learnings, debugging logs):

> vault_write(
    kind="daily",
    title="Debugging vault_read filepath",
    content="When search returns filepaths, agents try to Read files directly.",
    tags=["duckbrain", "debugging"]
  )
→ { success: true, filepath: "daily/2026-05-28.md" }

For wiki pages (entity|concept|source|synthesis), this automatically:

  1. Writes the markdown file to the correct wiki subdirectory
  2. Generates YAML frontmatter with title, item-type, tags, dates
  3. Appends an entry to wiki/index.md in the right section
  4. Appends a dated entry to wiki/log.md

For daily notes, this automatically:

  1. Appends to daily/YYYY-MM-DD.md (creates the file if today's doesn't exist yet)
  2. Writes no YAML frontmatter, only a ## heading followed by the content
  3. Does NOT update index.md (daily notes aren't wiki pages)
  4. Appends a dated entry to wiki/log.md
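
The four wiki-page steps can be sketched roughly as follows. This is an illustration under stated assumptions, not DuckBrain's source: the helper names (slugify, write_wiki_page), the index entry format (- [[Title]]), and the log line format are all hypothetical. Only the slug-style filename is taken from the example output above.

```python
import re
import tempfile
from datetime import date
from pathlib import Path
from typing import List, Optional

# kind -> (wiki subdirectory, section header expected in wiki/index.md)
SECTIONS = {
    "entity": ("entities", "## Entities"),
    "concept": ("concepts", "## Concepts"),
    "source": ("sources", "## Sources"),
    "synthesis": ("synthesis", "## Synthesis"),
}


def slugify(title: str) -> str:
    """'DuckDB FTS Memory' -> 'duckdb-fts-memory'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def write_wiki_page(vault: Path, kind: str, title: str, content: str,
                    tags: List[str], today: Optional[str] = None) -> str:
    today = today or date.today().isoformat()
    subdir, heading = SECTIONS[kind]
    rel = f"wiki/{subdir}/{slugify(title)}.md"

    # Steps 1 and 2: write the file with generated frontmatter.
    page = vault / rel
    page.parent.mkdir(parents=True, exist_ok=True)
    frontmatter = (
        f"---\ntitle: {title}\nitem-type: {kind}\n"
        f"tags: [{', '.join(tags)}]\ncreated: {today}\nupdated: {today}\n---\n\n"
    )
    page.write_text(frontmatter + content + "\n", encoding="utf-8")

    # Step 3: add an entry at the end of the matching index section.
    index = vault / "wiki" / "index.md"
    lines = index.read_text(encoding="utf-8").splitlines()
    start = lines.index(heading)  # raises ValueError if the header is missing
    end = next((i for i in range(start + 1, len(lines))
                if lines[i].startswith("## ")), len(lines))
    while end > start + 1 and not lines[end - 1].strip():
        end -= 1  # step back over blank lines that close the section
    lines.insert(end, f"- [[{title}]]")
    index.write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Step 4: append a dated line to the log.
    with (vault / "wiki" / "log.md").open("a", encoding="utf-8") as log:
        log.write(f"- {today}: created {rel}\n")
    return rel


# Demo against a throwaway vault so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_vault = Path(tmp)
    (demo_vault / "wiki").mkdir()
    (demo_vault / "wiki" / "index.md").write_text(
        "## Entities\n\n## Concepts\n\n## Sources\n\n## Synthesis\n", encoding="utf-8")
    (demo_vault / "wiki" / "log.md").write_text("", encoding="utf-8")
    rel_path = write_wiki_page(demo_vault, "concept", "DuckDB FTS Memory",
                               "# DuckDB FTS Memory", ["agent-memory", "duckdb"],
                               today="2026-05-28")
    index_text = (demo_vault / "wiki" / "index.md").read_text(encoding="utf-8")
    log_text = (demo_vault / "wiki" / "log.md").read_text(encoding="utf-8")
    page_text = (demo_vault / rel_path).read_text(encoding="utf-8")
print(rel_path)
```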

Parameters:

  • kind (required) — entity, concept, source, synthesis, or daily
  • title (required) — page title (or section heading for daily entries)
  • content (required) — markdown body (without frontmatter)
  • tags (required) — list of tag strings

Vault Path

Set via the VAULT_PATH environment variable (or the env field in your MCP config — no need for both).

For local development, copy .env.example to .env and set your path:

VAULT_PATH=/path/to/your/obsidian/vault

If you use WSL2 with your vault on Windows, set it to the WSL mount path (e.g., /mnt/c/Users/you/Documents/obsidian/my-vault).
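
For reference, resolving the variable at startup might look like the sketch below. The function name resolve_vault_path and the directory check are assumptions; only the "VAULT_PATH not set" message comes from the install check described earlier.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_vault_path(environ: Optional[Mapping[str, str]] = None) -> Path:
    """Read VAULT_PATH from the environment and check that it is a directory."""
    environ = os.environ if environ is None else environ
    raw = environ.get("VAULT_PATH")
    if not raw:
        raise RuntimeError("VAULT_PATH not set")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"VAULT_PATH is not a directory: {path}")
    return path


# An empty environment reproduces the error from the install check.
try:
    resolve_vault_path({})
except RuntimeError as exc:
    print(exc)  # VAULT_PATH not set
```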

Performance

  • FTS index rebuilt fresh from disk on every query — ~90 pages in under a second
  • Write operations complete in <500ms
  • Everything is in-memory — no persistent DuckDB database file
  • Zero network calls, zero external services

Limitations (v1)

  • No update or delete operations (only create)
  • No vector embeddings or semantic search
  • No page deduplication check before writing
  • ~1s per search at current scale; at 500+ pages, incremental indexing would be needed

Under Consideration

Ideas we're exploring but not committing to yet — as we use the tool and understand what matters, some of these may get built. Open an issue to discuss.

  • Temporal decay (recency bias) — boost search results from recently created or updated pages. Older knowledge fades unless explicitly referenced.
  • Vector embeddings / semantic search — cover the ~20% recall gap that BM25 can't reach (concepts with different wording). Could integrate MemSearch or local embeddings.
  • Update and delete operations — allow agents to edit or remove existing pages, not just create.
  • Incremental indexing — INSERT single pages into the FTS index instead of full rebuild, keeping search fast at 500+ pages.
  • Page deduplication — detect when a page with the same title already exists before writing.
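
As one possible shape for the temporal decay idea, a search score could be multiplied by an exponential factor based on page age. Nothing here exists in DuckBrain today: the function name decayed_score, the 90-day half-life, and the choice of exponential decay are all assumptions for discussion.

```python
from datetime import date, timedelta


def decayed_score(bm25_score: float, updated: date, today: date,
                  half_life_days: float = 90.0) -> float:
    """Halve the score for every half_life_days since the page was last updated."""
    age_days = max((today - updated).days, 0)  # future dates count as age 0
    return bm25_score * 0.5 ** (age_days / half_life_days)


today = date(2026, 5, 28)
fresh = decayed_score(2.0, today, today)
one_half_life = decayed_score(2.0, today - timedelta(days=90), today)
print(fresh, one_half_life)
```

With this form, a page untouched for one half-life needs twice the raw BM25 score to outrank a page updated today.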

Inspirations

This project stands on the shoulders of several ideas and tools:

  • Andrej Karpathy's LLM wiki pattern — the idea that a personal markdown wiki, co-maintained by humans and AI agents, compounds into a persistent knowledge base. The vault schema (entities, concepts, sources, synthesis, daily log) is directly inspired by this.
  • DuckDB — the embedded analytical database that makes full-text search over flat files viable without a server, index sync, or persistent storage. The decision to use in-memory FTS instead of a vector database was a deliberate trade-off for simplicity.
  • Obsidian — the local-first, markdown-native note-taking tool that treats your files as the truth. DuckBrain exists because Obsidian vaults deserve tooling that respects the filesystem.
  • MemSearch and Open Brain (OB1) — early experiments in cross-tool agent memory that demonstrated the need for structured vault write-back while choosing different architectures. Their strengths and gaps directly informed DuckBrain's design.
  • Agent Memory Systems (6-level taxonomy) — Simon Scrapes' comprehensive comparison of Claude Code memory approaches provided the framework for understanding where DuckBrain fits in the ecosystem (Level 6: cross-tool MCP with dedicated server).
  • trellis-datamodel — the same author's data modeling tool whose CI/CD patterns were borrowed for this project's repository readiness.
  • mondayDB 3 — Solving HTAP for a Trillion-Table System — monday.com's engineering blog on their DuckDB-powered CQRS read serving layer at production scale. Proved that DuckDB in-process with per-tenant file isolation is a viable architecture — the same pattern DuckBrain applies at personal-wiki scale.

The core decision — build, don't integrate — came from a structured comparison of 7 existing tools. All failed on one requirement: vault schema-aware write-back. Rather than fork or extend, DuckBrain started from first principles: what's the simplest thing that gives agents structured read/write access to an Obsidian vault? The answer was DuckDB + MCP + ~500 lines of Python.

License

MIT
