# kbx — Local Knowledge Base with Hybrid Search
Give your AI agents persistent memory. Index your markdown notes, meeting transcripts, and documentation into a hybrid search engine. Search with keywords or natural language. Everything runs locally — your data never leaves your machine.
kbx combines SQLite FTS5 full-text search with LanceDB vector search using Qwen3 embeddings — all on-device, with Apple Silicon acceleration via MLX.
You can read more about kbx's progress in the CHANGELOG.
## Quick Start

```shell
# Install
pip install kbx                  # core CLI + FTS5 search
pip install "kbx[search]"        # + vector search (Qwen3 embeddings)
pip install "kbx[search,mlx]"    # + Apple Silicon acceleration

# Set up a knowledge base
kbx init                         # create kbx.toml in the current directory

# Index your markdown files
kbx index run                    # index everything under memory/
kbx index run --no-embed         # text-only index (fast, no model needed)

# Search
kbx search "quarterly planning"          # hybrid search (FTS5 + vector)
kbx search "quarterly planning" --fast   # keyword-only (~instant, no model needed)
kbx search "MFA rollout" --json          # structured output for scripts

# Browse
kbx view "memory/notes/decisions.md"     # read a document
kbx view "#a1b2c3"                       # by content-hash prefix
kbx list --type notes --from 2026-01-01
```
## Using with AI Agents

kbx is built for agentic workflows. The `--json` output format, structured error responses, and built-in agent playbook make it a natural fit for AI assistants.
```shell
# Orient: get a compressed overview of all entities (~2K tokens)
kbx context

# Search with structured output
kbx search "authentication" --fast --json --limit 5

# Look up a person
kbx person find "Alice" --json

# Timeline of everything mentioning a project
kbx person timeline "Cloud Migration" --from 2026-01-01 --json

# Take notes that persist across sessions
kbx memory add "Decision: use Postgres" --tags decision,infra --pin
kbx memory add "Promoted to Staff" --entity "Bob"

# Pin important docs to the context window
kbx pin "memory/notes/priorities.md"
```
When you run `kbx --help`, it prints an agent playbook alongside the standard CLI help — a complete reference for AI agents to self-orient and use the knowledge base effectively.
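For scripting, `--json` output pipes cleanly into `jq`. The field names below (`results`, `path`, `score`) are illustrative assumptions about the output shape, not the documented schema — the sample is simulated inline so the filter runs standalone:

```shell
# Hypothetical kbx search --json shape; field names are assumptions.
sample='{"results":[{"path":"memory/notes/decisions.md","score":0.91},{"path":"memory/meetings/2026/01/15/planning.md","score":0.64}]}'

# Keep only strong matches (score >= 0.8) and print score + path, tab-separated.
echo "$sample" | jq -r '.results[] | select(.score >= 0.8) | "\(.score)\t\(.path)"'
```

In a real pipeline the `sample=` line would be replaced by `kbx search "…" --json`.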
## MCP Server
kbx exposes an MCP server for tighter integration with Claude Desktop, Claude Code, Cursor, and other MCP-compatible tools.
Tools exposed:

- `kb_search` — Hybrid or FTS-only search with date/tag filters
- `kb_person_find` — Entity lookup by name, alias, or partial match
- `kb_person_timeline` — Chronological document list for an entity
- `kb_view` — Retrieve a document by path, glob, or `#hash`
- `kb_context` — Compressed entity index for session orientation
- `kb_memory_add` — Create notes or record facts about entities
- `kb_pin` / `kb_unpin` — Pin documents to the context window
- `kb_usage` — Index status and usage instructions
**Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "kbx": {
      "command": "/Users/YOU/.local/bin/kbx",
      "args": ["mcp"]
    }
  }
}
```

> **Note:** Claude Desktop does not inherit your shell `PATH`. Use the full path to `kbx` — find it with `which kbx` (typically `~/.local/bin/kbx` when installed via `uv tool install`).
**Claude Code** (`.claude/settings.local.json`):

```json
{
  "mcpServers": {
    "kbx": {
      "command": "kbx",
      "args": ["mcp"],
      "type": "stdio"
    }
  }
}
```
See MCP plugin docs for full tool parameter reference.
## Python API

Use kbx as a library in your own applications:

```python
from kb import KnowledgeBase

with KnowledgeBase(thread_safe=True) as kb:
    # Search
    results = kb.search("cloud migration")

    # Entities
    people = kb.list_entities(entity_type="person")
    alice = kb.get_entity("Alice")
    timeline = kb.get_entity_timeline("Alice")

    # Context
    ctx = kb.context()

    # Index
    kb.index()
```
The KnowledgeBase class manages the full lifecycle — DB connections, embedder, auto-reindexing of stale files. All methods return Pydantic models.
See architecture docs for the full API surface.
## Architecture

**Write-through principle:** Markdown files are the source of truth. All data writes go to flat files first; the database is a derived index rebuilt from those files. The DB is disposable — delete it and re-index.
```
Markdown files (source of truth)
                         │
                         ▼
┌─────────────────────────────────────────────────────┐
│                   Source Adapters                   │
│   meetings.py — walk memory/meetings/YYYY/MM/DD/    │
│   memory.py   — walk memory/people/, projects/, ... │
└────────────────────────┬────────────────────────────┘
                         │ ParsedDocument
                         ▼
┌─────────────────────────────────────────────────────┐
│                       Indexer                       │
│         chunk → embed → store → link entities       │
└──────────┬──────────────────────────┬───────────────┘
           │                          │
           ▼                          ▼
┌──────────────────┐     ┌─────────────────────────────┐
│      SQLite      │     │           LanceDB           │
│  docs, chunks,   │     │    Qwen3-Embedding-0.6B     │
│  FTS5, entities, │     │    1024-dim vectors         │
│  facts, mentions │     │    float32, instruction-aware│
└──────────────────┘     └─────────────────────────────┘
           │                          │
           └────────────┬─────────────┘
                        ▼
┌─────────────────────────────────────────────────────┐
│                    Hybrid Search                    │
│ FTS5 (BM25) + Vector → RRF Fusion → Recency Weight  │
└─────────────────────────────────────────────────────┘
```
## Search
kbx supports two search modes:
| Mode | Flag | Speed | Method |
|---|---|---|---|
| Fast | `--fast` | ~instant | FTS5 keyword search only |
| Hybrid | (default) | ~2s | FTS5 + vector search + RRF fusion |
Hybrid search uses Reciprocal Rank Fusion (RRF) to combine keyword and semantic results, with a 90-day half-life recency weight. A strong-signal fast path skips vector search entirely when FTS5 produces a high-confidence match.
Score interpretation: 0.8+ strong | 0.5–0.8 worth reading | <0.5 noise
See search docs for the full pipeline, score normalisation, and fusion strategy.
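The fusion step described above can be sketched as follows. Only the 90-day half-life is stated in this README; the RRF constant `k=60` (the common default) and the tie-breaking details are assumptions, not kbx's actual implementation:

```python
def rrf_fuse(fts_ranking, vec_ranking, k=60):
    """Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document,
    so items ranked highly by both keyword and vector search rise to the top."""
    scores = {}
    for ranking in (fts_ranking, vec_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores

def recency_weight(age_days, half_life_days=90):
    """Exponential decay with a 90-day half-life: a 90-day-old doc is weighted 0.5x."""
    return 0.5 ** (age_days / half_life_days)

fused = rrf_fuse(["a", "b", "c"], ["a", "c", "d"])
best = max(fused, key=fused.get)
print(best)                           # a — top of both lists wins
print(round(recency_weight(90), 2))   # 0.5
```

The decay multiplier is applied to the fused score, so older documents need a stronger match to outrank recent ones.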
## Entity System

kbx automatically links people, projects, teams, and glossary terms to your documents:

```shell
kbx person find "Alice" --json      # profile + linked documents
kbx person timeline "Alice"         # chronological mentions
kbx person create "Bob" --role "SRE Lead" --team "Platform"
kbx project find "Cloud Migration"  # project profile + linked docs
kbx entity stale --days 30          # entities not mentioned recently
```
Entities are seeded from `memory/people/*.md` and `memory/projects/*.md` files, then linked to documents via five-tier matching: YAML tags → title participants → title substrings → source IDs → content name matching.
See entity docs for the full linking pipeline.
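The five-tier cascade can be illustrated as a first-match-wins loop. The document fields and matching predicates below are hypothetical — they show the shape of the cascade, not kbx's actual linking code:

```python
def link_entity(entity_name, doc, aliases=()):
    """Return the first (strongest) tier that links entity_name to doc.
    `doc` is a hypothetical dict with tags/participants/title/source_ids/body."""
    names = {entity_name.lower(), *(a.lower() for a in aliases)}
    tiers = [
        ("yaml_tags", any(t.lower() in names for t in doc.get("tags", []))),
        ("title_participants", any(p.lower() in names for p in doc.get("participants", []))),
        ("title_substring", any(n in doc.get("title", "").lower() for n in names)),
        ("source_ids", any(s.lower() in names for s in doc.get("source_ids", []))),
        ("content_name", any(n in doc.get("body", "").lower() for n in names)),
    ]
    for tier_name, matched in tiers:
        if matched:
            return tier_name
    return None

doc = {"tags": [], "participants": ["Alice"], "title": "Planning", "body": "..."}
print(link_entity("Alice", doc))  # title_participants
```

Ordering matters: an explicit YAML tag is a stronger signal than a name merely appearing in the body, so the loop stops at the first hit.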
## Sync & Ingest

Pull meeting transcripts from external sources:

```shell
# Granola API sync
kbx sync granola --since 2026-01-01

# Notion AI Meeting Notes sync
kbx sync notion --since 2026-01-01

# Granola zip export ingest
kbx ingest export.zip

# View and edit synced meeting notes
kbx granola view <calendar-uid>
kbx granola edit <calendar-uid> --append "Action: follow up with Alice"
```
Sync is incremental — only new or updated meetings are fetched. Attendees are automatically matched to existing entities. See Granola plugin docs for configuration.
## Configuration

kbx looks for configuration in this order:

1. `$KBX_CONFIG` environment variable
2. `./kbx.toml` in the current directory (walking up from CWD)
3. `~/.config/kbx/config.toml`

Run `kbx init` to generate a starter config.
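The lookup order above can be sketched with `pathlib`. This is an illustrative reimplementation of the resolution rules as described, not kbx's actual code:

```python
import os
from pathlib import Path

def find_config(cwd=None):
    """Resolve config: env var first, then kbx.toml walking up from CWD,
    then the user-level config. Returns None if nothing is found."""
    env = os.environ.get("KBX_CONFIG")
    if env:
        return Path(env)
    cwd = Path(cwd or Path.cwd()).resolve()
    for directory in (cwd, *cwd.parents):
        candidate = directory / "kbx.toml"
        if candidate.is_file():
            return candidate
    fallback = Path.home() / ".config" / "kbx" / "config.toml"
    return fallback if fallback.is_file() else None
```

The walk-up behaviour means any subdirectory of a knowledge base resolves to the same project-level `kbx.toml`.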
## Optional Extras

| Extra | What it adds |
|---|---|
| `search` | LanceDB + sentence-transformers + NumPy for vector search |
| `mlx` | MLX backend for faster embeddings on Apple Silicon |
| `mcp` | MCP server for AI tool integration |
| `all` | Everything above plus test and dev dependencies |

Install with: `pip install "kbx[search,mlx,mcp]"`

Requires Python 3.10+.
## Data Storage

The index is stored in the data directory (configurable via `kbx.toml` or `$KB_DATA_DIR`):

```
kbx-data/
├── metadata.db   # SQLite — documents, chunks, FTS5, entities, facts
└── vectors/      # LanceDB — Qwen3 embedding vectors (1024-dim)
```

The database is a derived index. Delete it and run `kbx index run` to rebuild from your markdown files.
## Development

```shell
git clone https://github.com/tenfourty/kbx.git
cd kbx
uv sync --all-extras
uv run pre-commit install
uv run pytest -x -q --cov   # 1361 tests, 90%+ coverage
uv run mypy src/            # strict mode
```

Quick CI check locally:

```shell
make ci    # mirror exact GitHub CI pipeline
make fix   # auto-fix lint + format issues
```
See CONTRIBUTING.md for guidelines and testing docs for the test strategy.
## Documentation
| Doc | What it covers |
|---|---|
| Architecture | System design, data flow, module dependencies, Python API |
| Search | FTS5 + vector + RRF fusion pipeline, score normalisation |
| Entities | Entity seeding, five-tier linking, disambiguation |
| Indexing | Walk → chunk → embed → store pipeline |
| Chunking | Markdown-aware chunking strategy |
| CLI Reference | All commands and options |
| Output Formatting | JSON, table, CSV, JSONL, jq, field selection |
| Context Layer | Compressed entity index for AI agents |
| Testing | Test strategy, fixtures, markers |
| MCP Plugin | MCP server tools and resources |
| MLX Plugin | Apple Silicon embedding acceleration |
| Granola Plugin | Meeting transcript sync (view, edit, push) |
| Notion Plugin | Notion AI Meeting Notes sync |
| Integration | Ingest, migrations, search quality |
## License
Apache-2.0