
kvault

Personal knowledge base that runs inside Claude Code.

pip install knowledgevault[mcp]

kvault gives your coding agent persistent, structured memory. It runs as an MCP server inside Claude Code (or any MCP-compatible tool), using the subscription you already pay for. No extra API keys. No extra cost.

Your agent creates entities (people, projects, notes), deduplicates them with fuzzy matching, and keeps hierarchical summaries in sync — all through 20 MCP tools.

Who is this for?

Developers using Claude Code, OpenAI Codex, Cursor, VS Code + Copilot, or any MCP-compatible tool who want their agent to remember things between sessions — contacts, projects, meeting notes, research — in a structured, searchable format.

What makes it different?

| | kvault | Anthropic memory server | Notion AI / Mem.ai | obsidian-claude-pkm |
| --- | --- | --- | --- | --- |
| Structure | Hierarchical entities with dedup | Flat JSON | Rich docs, flat search | Obsidian vault |
| Agent-native | 20 MCP tools, built for agents | 4 tools, basic | Chat sidebar | Template, not runtime |
| Cost | $0 (uses existing subscription) | $0 | $12-20/mo extra | $0 |
| Deduplication | Fuzzy name + alias + email domain | None | None | Manual |
| Summaries | Auto-propagating hierarchy | None | AI-generated | Manual |

Quickstart (60 seconds)

# 1. Install
pip install knowledgevault[mcp]

# 2. Create a knowledge base
kvault init my_kb --name "Your Name"

# 3. Verify it's clean
kvault check --kb-root my_kb

Add the MCP server to your AI tool's config:

Claude Code (.mcp.json in the project root):

{
  "mcpServers": {
    "kvault": { "command": "kvault-mcp" }
  }
}

OpenAI Codex (.codex/config.toml):

[mcp_servers.kvault]
command = "kvault-mcp"

Cursor (.cursor/mcp.json):

{
  "mcpServers": {
    "kvault": { "command": "kvault-mcp" }
  }
}

VS Code + GitHub Copilot (.vscode/mcp.json):

{
  "servers": {
    "kvault": { "command": "kvault-mcp", "type": "stdio" }
  }
}

Windsurf (~/.codeium/windsurf/mcp_config.json):

{
  "mcpServers": {
    "kvault": { "command": "kvault-mcp" }
  }
}

Then tell your agent: "Initialize the knowledge base at ./my_kb" — it will call kvault_init and you're up and running.

Try it: import your ChatGPT history

The best way to see kvault in action is to point it at data you already have. ChatGPT lets you export your entire conversation history — years of questions, people mentioned, projects discussed, decisions made — and Claude Code + kvault can turn it into a structured, searchable knowledge base in minutes.

1. Export your ChatGPT data

Go to ChatGPT → Settings → Data controls → Export data. You'll get an email with a zip file containing conversations.json.

2. Unzip it into your KB

unzip chatgpt-export.zip -d my_kb/sources/chatgpt

3. Tell Claude Code to process it

Read through my ChatGPT export in sources/chatgpt/conversations.json.
Extract the people, projects, and ideas I've discussed most frequently.
Create entities for each one in the knowledge base.

Claude Code will use the kvault tools to research each entity (deduplicating as it goes), create structured entries with frontmatter, propagate summaries, and rebuild the index. You'll end up with a browsable, searchable knowledge base built from years of conversations you've already had.
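
Before handing the export to the agent, you can preview what it is likely to find. A minimal sketch in plain Python, assuming the standard ChatGPT export layout where conversations.json is a JSON array of conversation objects that each carry a "title" field:

import json
from collections import Counter
from pathlib import Path

# Assumption: conversations.json is a JSON array of conversation objects,
# each with a "title" field (the usual ChatGPT export layout).
export = Path("my_kb/sources/chatgpt/conversations.json")
conversations = json.loads(export.read_text())

# Count recurring title words as a rough signal of which people,
# projects, and topics come up most often.
words = Counter(
    word.lower()
    for convo in conversations
    for word in (convo.get("title") or "").split()
    if len(word) > 3
)
for word, count in words.most_common(20):
    print(f"{count:4d}  {word}")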

Other great data sources to try:

| Source | How to get it | What you'll extract |
| --- | --- | --- |
| ChatGPT history | Settings → Export data | People, projects, decisions, research threads |
| Google Contacts | Google Takeout (Contacts) | Names, emails, phone numbers, notes |
| iMessage | ~/Library/Messages/chat.db (macOS) | Relationships, interaction frequency, context |
| Gmail | Google Takeout (Mail) | Professional contacts, threads, follow-ups |
| Meeting notes | Any folder of markdown/text files | People, action items, decisions |
| Notion export | Notion → Settings → Export | Projects, notes, wikis |

The pattern is always the same: drop the data into sources/, tell your agent to process it, and let kvault handle deduplication and structure.

What happens next

Every time your agent processes new information, it follows a 6-step workflow (a Python sketch of steps 1-3 and 6 follows the list):

  1. Research — Search index for existing entities (fuzzy name, alias, email domain matching)
  2. Decide — Create, update, or skip based on match confidence
  3. Write — Create/update entity with YAML frontmatter (_summary.md)
  4. Propagate — Update all ancestor _summary.md files so summaries stay in sync
  5. Log — Add entry to journal/YYYY-MM/log.md
  6. Rebuild — Rebuild the search index
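
Steps 1-3 and 6 can be reproduced directly with the Python API documented below; steps 4 and 5 are what the summary and workflow MCP tools handle during agent runs. A rough sketch reusing EntityIndex, EntityResearcher, and SimpleStorage (the "create" action label is an assumption, not a documented constant):

from pathlib import Path
from kvault import EntityIndex, EntityResearcher, SimpleStorage

kb_root = Path("my_kb")
index = EntityIndex(kb_root / ".kvault" / "index.db")
researcher = EntityResearcher(index)
storage = SimpleStorage(kb_root)

# 1. Research: look for existing entities that might match.
matches = researcher.research("Sarah Chen", email="sarah@anthropic.com")

# 2. Decide: act on the suggested action and its confidence.
#    (Assumption: suggest_action returns "create" when nothing matches.)
action, target, confidence = researcher.suggest_action("Sarah Chen")

# 3. Write: create the entity with its required frontmatter.
if action == "create":
    storage.create_entity("people/contacts/sarah_chen", {
        "created": "2026-02-06",
        "updated": "2026-02-06",
        "source": "manual",
        "aliases": ["Sarah Chen", "sarah@anthropic.com"],
    }, summary="# Sarah Chen\n\nResearch scientist at Anthropic.")

# 6. Rebuild: refresh the search index so the new entity is findable.
index.rebuild(kb_root)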

What an entity looks like

Each entity is a directory with a single _summary.md file containing YAML frontmatter:

---
created: 2026-02-06
updated: 2026-02-06
source: manual
aliases: [Sarah Chen, sarah@anthropic.com]
email: sarah@anthropic.com
relationship_type: colleague
---
# Sarah Chen

Research scientist at Anthropic working on causal discovery.

## Background
Met at NeurIPS 2025. Collaborator on interpretability project.

## Interactions
- 2026-02-06: Coffee meeting — discussed causal representation learning

## Follow-ups
- [ ] Share CJE paper draft

Required frontmatter: source and aliases. The MCP tools set the created and updated fields automatically.
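
Entities are plain markdown, so they are easy to read outside kvault as well. A minimal sketch using PyYAML (an external library, not a kvault API) to split a _summary.md into frontmatter and body:

from pathlib import Path
import yaml  # PyYAML; an external dependency, not part of kvault

def read_entity_file(path: Path) -> tuple[dict, str]:
    """Split a _summary.md into (frontmatter dict, markdown body)."""
    text = path.read_text()
    # The frontmatter sits between the first two "---" delimiters.
    _, frontmatter, body = text.split("---", 2)
    return yaml.safe_load(frontmatter), body.strip()

meta, body = read_entity_file(Path("my_kb/people/contacts/sarah_chen/_summary.md"))
print(meta["aliases"])       # ['Sarah Chen', 'sarah@anthropic.com']
print(body.splitlines()[0])  # '# Sarah Chen'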

What a knowledge base looks like

my_kb/
├── _summary.md                          # Root: executive overview
├── people/
│   ├── _summary.md                      # "12 contacts across 3 categories"
│   ├── family/
│   │   ├── _summary.md
│   │   └── mom/
│   │       └── _summary.md
│   ├── friends/
│   │   ├── _summary.md
│   │   └── alex_rivera/
│   │       └── _summary.md
│   └── contacts/
│       ├── _summary.md
│       ├── sarah_chen/
│       │   └── _summary.md
│       └── james_park/
│           └── _summary.md
├── projects/
│   ├── _summary.md
│   └── cje_paper/
│       └── _summary.md
├── journal/
│   └── 2026-02/
│       └── log.md
└── .kvault/
    ├── index.db                         # Entity search index
    └── logs.db                          # Observability

Every directory with a _summary.md is a node. Summaries at each level capture the semantic landscape of their children.
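
Because every node is just a directory holding a _summary.md, the whole hierarchy can be enumerated with pathlib alone, no kvault API required. A small sketch:

from pathlib import Path

kb_root = Path("my_kb")

# Every directory containing a _summary.md is a node in the hierarchy.
for summary in sorted(kb_root.rglob("_summary.md")):
    node = summary.parent.relative_to(kb_root)
    depth = len(node.parts)          # 0 for the KB root itself
    label = node.parts[-1] if node.parts else "(root)"
    print("  " * depth + label)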

MCP tools (20)

| Category | Tools |
| --- | --- |
| Init | kvault_init, kvault_status |
| Index | kvault_search, kvault_find_by_alias, kvault_find_by_email_domain, kvault_rebuild_index |
| Entity | kvault_read_entity, kvault_write_entity, kvault_list_entities, kvault_delete_entity, kvault_move_entity |
| Summary | kvault_read_summary, kvault_write_summary, kvault_get_parent_summaries, kvault_propagate_all |
| Research | kvault_research |
| Workflow | kvault_log_phase, kvault_write_journal, kvault_validate_transition |
| Validation | kvault_validate_kb |

Python API

kvault also exposes a Python API for programmatic use:

from pathlib import Path
from kvault import EntityIndex, SimpleStorage, EntityResearcher, ObservabilityLogger

# Initialize
kg_root = Path("my_kb")
index = EntityIndex(kg_root / ".kvault" / "index.db")
storage = SimpleStorage(kg_root)
researcher = EntityResearcher(index)

# Research existing entities
matches = researcher.research("Sarah Chen", email="sarah@anthropic.com")
action, target, confidence = researcher.suggest_action("Sarah Chen")

# Write entity
storage.create_entity("people/contacts/sarah_chen", {
    "created": "2026-02-06",
    "updated": "2026-02-06",
    "source": "manual",
    "aliases": ["Sarah Chen", "sarah@anthropic.com"],
}, summary="# Sarah Chen\n\nResearch scientist at Anthropic.")

# Update index
index.rebuild(kg_root)

Integrity hook

Catch stale summaries before each prompt by adding a UserPromptSubmit hook to .claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "type": "command",
        "command": "kvault check --kb-root /absolute/path/to/my_kb"
      }
    ]
  }
}

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Lint, format, type-check
ruff check . && black . && mypy .

License

MIT
