Local CLI search engine for personal knowledge bases with hybrid BM25 + vector search
mykb
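A local knowledge-base search engine CLI with hybrid retrieval that combines BM25 full-text search and vector semantic search.
Features
- Hybrid search: combined BM25 full-text and vector semantic search
- Local embedding: generates embeddings locally with embeddinggemma-300m, no API required
- Multiple data sources: currently supports Obsidian vaults; extensible to Twitter, Telegram, and others
- Incremental indexing: based on content hashes, so only changed documents are re-indexed
- Pluggable backends: supports Meilisearch and SeekDB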
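Installation
# Basic install (Meilisearch backend)
pip install mykb
# With SeekDB support (quoted so shells such as zsh do not expand the brackets)
pip install "mykb[seekdb]"
Dependencies
- Meilisearch backend: requires a running Meilisearch service
- SeekDB backend: embedded mode needs no external service (macOS 15+ / Linux)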
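Before indexing against the Meilisearch backend, it can help to confirm the service is reachable. The sketch below polls Meilisearch's `GET /health` endpoint, which reports `{"status": "available"}` when the server is up. The helper name and the default URL are assumptions for this example; the helper is not part of mykb.

```python
import json
import urllib.request


def meilisearch_is_available(base_url: str = "http://localhost:7700",
                             timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health reports status "available"."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts (URLError is a
        # subclass); ValueError covers a body that is not valid JSON.
        return False
    return isinstance(payload, dict) and payload.get("status") == "available"


if __name__ == "__main__":
    print("Meilisearch reachable:", meilisearch_is_available())
```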
Quick start
# 1. Configure Meilisearch
mykb config set meilisearch.url http://localhost:7700
# 2. Add a collection (Obsidian vault)
mykb collection add my-notes --path ~/Documents/Obsidian --source obsidian
# 3. Index (with embeddings)
mykb index my-notes --embed
# 4. Search
mykb search "machine learning" # BM25 full-text
mykb vsearch "AI technology trends" # vector semantic
mykb query "introduction to deep learning" --ratio 0.5 # hybrid search
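The `--ratio` option suggests a weighting between the two rankers. The sketch below shows one common way such a blend can work: min-max normalize each score set, then take a weighted sum. It assumes `ratio` is the weight of the vector side (so 0.0 is pure BM25 and 1.0 is pure vector); mykb's actual semantics and scoring are not documented here, and the function names are hypothetical.

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max scale scores into [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (score - lo) / (hi - lo) for doc, score in scores.items()}


def hybrid_rank(bm25: dict[str, float], vector: dict[str, float],
                ratio: float = 0.5) -> list[tuple[str, float]]:
    """Blend two score maps; `ratio` is the assumed weight of the vector side."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    b, v = normalize(bm25), normalize(vector)
    combined = {
        doc: (1.0 - ratio) * b.get(doc, 0.0) + ratio * v.get(doc, 0.0)
        for doc in b.keys() | v.keys()
    }
    # Highest blended score first; ties broken by document id for stability.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))
```

A document that appears in only one result set contributes 0.0 from the other side, so it is kept but ranked lower unless the ratio favors the side that found it.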
Configuration
Config file: ~/.mykb/config.toml
[backend]
type = "meilisearch" # or "seekdb"
[backend.meilisearch]
url = "http://localhost:7700"
api_key = ""
[backend.seekdb]
path = "~/.mykb/seekdb.db" # embedded mode
fulltext_analyzer = "ik" # ik | space | ngram
[embedding]
model = "google/embeddinggemma-300m"
chunk_size = 800
chunk_overlap = 0.15
[collections.my-notes]
source = "obsidian"
path = "/path/to/vault"
mask = "**/*.md"
exclude = [".obsidian/**", ".trash/**"]
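To make the `mask` and `exclude` settings concrete, the sketch below selects files the way those glob patterns read: everything under the vault that matches `mask`, minus anything whose vault-relative path matches an `exclude` pattern. The `select_files` helper is hypothetical, and the assumption that patterns are matched against paths relative to the collection root is ours; mykb may resolve them differently.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def select_files(root: Path,
                 mask: str = "**/*.md",
                 exclude: tuple[str, ...] = (".obsidian/**", ".trash/**")) -> list[Path]:
    """Return files under `root` matching `mask`, minus any matching `exclude`."""
    selected = []
    for path in sorted(root.glob(mask)):
        # Match exclude patterns against the path relative to the vault root,
        # using forward slashes so patterns behave the same on every platform.
        relative = path.relative_to(root).as_posix()
        if path.is_file() and not any(fnmatchcase(relative, p) for p in exclude):
            selected.append(path)
    return selected
```

Note that `fnmatchcase` lets `*` cross directory separators, so `.obsidian/**` also excludes files nested several levels below `.obsidian`.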
Commands
| Command | Description |
|---|---|
| `mykb collection add/ls/rm` | Manage collections |
| `mykb index [--embed] [--full]` | Index documents |
| `mykb embed [--full]` | Add missing embeddings |
| `mykb search <query>` | BM25 full-text search |
| `mykb vsearch <query>` | Vector semantic search |
| `mykb query <query> [--ratio]` | Hybrid search |
| `mykb status` | Show status |
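Backend comparison
| Feature | Meilisearch | SeekDB |
|---|---|---|
| Deployment | Standalone service | Embedded / service |
| BM25 speed | ⚡ Fast (~2 ms) | Slow (~30 ms) |
| Vector speed | Fast (~3 ms) | ⚡ Faster (~1 ms) |
| Hybrid speed | ⚡ Fast (~4 ms) | Slow (~33 ms) |
| Resource usage | Low | High |
Recommendations:
- Use Meilisearch for general-purpose workloads
- Consider SeekDB for pure vector search workloads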
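Latency figures like those in the table depend on hardware and corpus size, so it may be worth measuring on your own data. The sketch below is a minimal timing harness, not the project's benchmark script: `search_fn` is a placeholder for whatever callable issues one query against a backend, and the function name and returned keys are assumptions for this example.

```python
import statistics
import time
from typing import Callable, Sequence


def measure_latency_ms(search_fn: Callable[[str], object],
                       queries: Sequence[str],
                       repeats: int = 5) -> dict[str, float]:
    """Time `search_fn` over every query `repeats` times; report milliseconds."""
    if not queries or repeats < 1:
        raise ValueError("need at least one query and one repeat")
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            search_fn(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return {"median_ms": statistics.median(samples), "max_ms": max(samples)}
```

Reporting the median rather than the mean keeps one slow cold-cache request from dominating the result.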
Development
# Install development dependencies
uv sync --all-extras
# Run tests
uv run pytest
# Benchmark
uv run python scripts/benchmark_seekdb.py --backend meilisearch -n 1000
License
MIT