100% offline RAG storage and MCP server for querying local document knowledge bases

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

These details have not been verified by PyPI

Project description

RAG Knowledge Base

100% offline RAG storage and MCP server for querying local document knowledge bases. Designed for high-quality retrieval over large corpora (5–10 GB+) with GPU acceleration, hybrid search, and cross-encoder reranking.

Features

Core

Fully offline — no internet required after initial setup
30+ file formats — Markdown, PDF, DOCX, PPTX, XLSX, CSV, JSON, YAML, XML, HTML, RST, RTF, ODT/ODS/ODP, EPUB, images (OCR), source code, logs, and plain text
Multiple RAG databases — create, switch, and manage independent knowledge bases
Shareable — export a RAG as a single .rag file; import on another machine
MCP server — query from VS Code Copilot, Claude Desktop, or any MCP client
Web UI — NiceGUI dashboard for search, status, and management
File watching — automatic re-indexing on file changes
Cross-platform — Windows, macOS, and Linux

Search Quality

Hybrid search — combines vector similarity (cosine) with BM25 keyword matching via configurable score fusion
Cross-encoder reranking — second-stage reranking with cross-encoder/ms-marco-MiniLM-L-6-v2 for precision
Structure-aware chunking — preserves Markdown heading hierarchy, PDF page boundaries, and code function/class boundaries
Contextual prefixes — each chunk carries its document title and section path for better embedding quality
Min-score filtering — automatically filters low-relevance noise from results
MMR diversification — Maximal Marginal Relevance reduces redundancy in results (configurable lambda)

Indexing Performance

GPU acceleration — auto-detects CUDA and Apple MPS; falls back to CPU
Parallel file parsing — multi-threaded document parsing with configurable worker count
Batched embedding — encodes chunks in large batches (default 512) for optimal throughput
Incremental indexing — xxHash-based file manifest tracks changes at O(1) cost; unchanged files are skipped instantly
Tuned HNSW — configurable ef_construction and M parameters for ChromaDB's vector index

Reliability

Cancellation — cooperative cancellation of long-running indexing; partial work is safely cleaned up and re-indexed on the next run
Crash recovery — lock file detection and manifest invalidation; the daemon auto-runs an incremental reindex on startup if a previous crash is detected
Index verification — rag-kb verify checks manifest ↔ vector store consistency and reports orphaned/invalidated files

Quick Start

Install from PyPI

pip install rag-knowledge-base

This installs everything needed to index, search, and serve via MCP — including parsers for PDF, DOCX, PPTX, XLSX, and RTF.

Optional extras

Extra	Install command	What it adds
`web`	`pip install rag-knowledge-base[web]`	NiceGUI web dashboard
`ocr`	`pip install rag-knowledge-base[ocr]`	Image OCR (Surya, RapidOCR)
`api`	`pip install rag-knowledge-base[api]`	OpenAI / Voyage embedding APIs
`monitoring`	`pip install rag-knowledge-base[monitoring]`	System metrics (psutil)
`all`	`pip install rag-knowledge-base[all]`	All of the above

Install from source

pip install -e .

GPU support: Install PyTorch with CUDA before installing this package:
pip install torch --index-url https://download.pytorch.org/whl/cu126

Create and Index a RAG

# Create a knowledge base pointing at your document folders
rag-kb create my-docs \
  --folders /path/to/docs /path/to/more/docs \
  --description "My documentation"

# Set as active
rag-kb use my-docs

# Index (auto-detects GPU, uses 14 parallel workers by default)
rag-kb index

# Re-run index any time — only changed files are processed
rag-kb index

Query via MCP (VS Code Copilot)

# Start MCP server (stdio mode for VS Code)
rag-kb serve

Add to your .vscode/mcp.json:

{
  "servers": {
    "rag-knowledge-base": {
      "command": "rag-kb",
      "args": ["serve"]
    }
  }
}

Query via MCP (HTTP)

# Start MCP server on HTTP
rag-kb serve --http --port 8080

Web Dashboard

# Launch NiceGUI dashboard
rag-kb ui

Share a RAG

# Export
rag-kb export my-docs --output my-docs.rag

# Import (on another machine)
rag-kb import my-docs.rag --name received-docs

Supported File Formats

Category	Extensions
Documents	`.md`, `.markdown`, `.txt`, `.text`, `.pdf`, `.docx`, `.pptx`, `.rst`, `.rest`, `.rtf`, `.epub`
Spreadsheets	`.xlsx`, `.xls`, `.csv`, `.tsv`
Data	`.json`, `.jsonl`, `.yaml`, `.yml`, `.xml`, `.xsl`, `.xslt`, `.xsd`, `.svg`, `.rss`, `.atom`
Web	`.html`, `.htm`
OpenDocument	`.odt`, `.ods`, `.odp`
Images (OCR)	`.png`, `.jpg`, `.jpeg`, `.gif`, `.bmp`, `.tiff`, `.tif`, `.webp`, `.ico`, `.heic`, `.heif`
Code	`.py`, `.js`, `.mjs`, `.cjs`, `.jsx`, `.ts`, `.tsx`, `.java`, `.c`, `.h`, `.cpp`, `.cxx`, `.cc`, `.hpp`, `.hxx`, `.cs`, `.go`, `.rs`, `.rb`, `.php`, `.swift`, `.kt`, `.kts`, `.scala`, `.m`, `.mm`, `.r`, `.R`, `.lua`, `.pl`, `.pm`, `.sh`, `.bash`, `.zsh`, `.fish`, `.ps1`, `.psm1`, `.bat`, `.cmd`, `.sql`, `.dart`, `.ex`, `.exs`, `.erl`, `.hrl`, `.hs`, `.ml`, `.mli`, `.fs`, `.fsx`, `.clj`, `.cljs`, `.groovy`, `.gradle`, `.vim`, `.el`, `.zig`, `.v`, `.nim`, `.tf`, `.hcl`, `.proto`, `.graphql`, `.gql`, `.vue`, `.svelte`, `.toml`, `.ini`, `.cfg`, `.conf`, `.env`, `.properties`, `.makefile`, `.cmake`, `.dockerfile`
Logs	`.log`

CLI Reference

Command	Description
`rag-kb create NAME --folders ...`	Create a new RAG database
`rag-kb list`	List all RAG databases
`rag-kb use NAME`	Set the active RAG
`rag-kb delete NAME [-y]`	Delete a RAG database
`rag-kb detach NAME`	Detach a RAG from source files (read-only mode)
`rag-kb attach NAME`	Re-attach a detached RAG to its source files
`rag-kb index [--rag NAME] [--workers N]`	Index documents (parallel, incremental)
`rag-kb cancel-index`	Cancel a running indexing operation
`rag-kb verify [--rag NAME]`	Verify index consistency (manifest vs vector store)
`rag-kb search QUERY [-n N]`	Search the active RAG
`rag-kb status [--rag NAME]`	Show index status and statistics
`rag-kb files [--rag NAME]`	List indexed files
`rag-kb export NAME --output FILE`	Export RAG to a `.rag` file
`rag-kb import FILE [--name NAME]`	Import RAG from a `.rag` file
`rag-kb serve [--http] [--port N]`	Start MCP server
`rag-kb ui [--port N]`	Launch web dashboard
`rag-kb config`	Show current configuration
`rag-kb download-models [--output DIR]`	Pre-download ML models for offline/bundled use
`rag-kb models list\|info\|download\|delete`	Manage embedding/reranker models
`rag-kb daemon status\|stop\|restart\|logs`	Manage the background daemon
`rag-kb stats [CATEGORY]`	Show detailed metrics and monitoring statistics
`rag-kb monitor [--interval N]`	Live monitoring dashboard (like htop for RAG)

Use -v for verbose output: rag-kb -v index

MCP Tools

When connected via MCP, the following tools are available:

Tool	Description
`search_knowledge_base`	Hybrid search with optional cross-encoder reranking
`get_document_content`	Retrieve all chunks from a specific document
`list_indexed_files`	List all indexed files with metadata
`get_index_status`	Current indexing state and statistics
`reindex`	Trigger re-indexing of the active RAG
`cancel_indexing`	Cancel a running indexing operation
`verify_index_consistency`	Check manifest ↔ vector store consistency
`list_rags`	List all available RAG databases
`switch_rag`	Switch the active RAG database
`create_rag`	Create a new RAG knowledge base
`delete_rag`	Permanently delete a RAG database
`export_rag`	Export a RAG to a shareable file
`import_rag`	Import a RAG from a shared file
`detach_rag`	Detach/re-attach a RAG from source files
`list_models`	List available embedding/reranker models
`get_model_info`	Get detailed info about a model
`get_monitoring_metrics`	Get comprehensive monitoring dashboard
`get_indexing_history`	Get recent indexing run history
`get_search_stats`	Get recent search query performance stats
`get_embedding_stats`	Get recent embedding batch performance stats
`get_vector_store_details`	Get detailed vector store / ChromaDB health info

Configuration

Config file location (platform-dependent):

Windows: %LOCALAPPDATA%\rag-kb\config.yaml
macOS: ~/Library/Application Support/rag-kb/config.yaml
Linux: ~/.local/share/rag-kb/config.yaml

# Embedding
embedding_model: paraphrase-multilingual-MiniLM-L12-v2
chunk_size: 1024
chunk_overlap: 128

# Search quality
reranking_enabled: true
reranker_model: cross-encoder/ms-marco-MiniLM-L-6-v2
hybrid_search_enabled: true
hybrid_search_alpha: 0.7        # 0.0 = pure BM25, 1.0 = pure vector
min_score_threshold: 0.15
mmr_enabled: true
mmr_lambda: 0.7                 # 0.0 = max diversity, 1.0 = max relevance

# Indexing performance
indexing_workers: 14             # parallel file-parsing threads
embedding_batch_size: 512        # texts per encode() call

# ChromaDB HNSW tuning
hnsw_ef_construction: 256
hnsw_m: 48

# File types to index
supported_extensions:
  - .md
  - .txt
  - .pdf
  - .docx
  - .pptx
  - .xlsx
  - .csv
  - .json
  - .yaml
  - .xml
  - .html
  - .htm
  - .rst
  - .rtf
  - .odt
  - .epub
  - .png
  - .jpg
  # ... see config for full list

# Server
host: 127.0.0.1
port: 8080

Daemon Management

The daemon is a background process that manages all data access. It starts automatically on first CLI use and shuts down after an idle timeout.

# Check daemon status
rag-kb daemon status

# View daemon logs
rag-kb daemon logs

# Follow logs in real-time (like tail -f)
rag-kb daemon logs -f

# Show last 100 lines
rag-kb daemon logs -n 100

# Restart the daemon
rag-kb daemon restart

# Stop the daemon
rag-kb daemon stop

Crash Recovery

If indexing is interrupted (crash, power loss, kill), the daemon automatically detects and recovers on the next startup:

Lock file detection — a .indexing_lock file in the RAG database marks active indexing
Manifest invalidation — partially processed files are pre-marked so is_changed() returns True
Auto-recovery — the daemon runs an incremental reindex on startup if a previous crash is detected

You can also manually check index consistency:

# Verify manifest ↔ vector store consistency
rag-kb verify

# Repair by re-indexing
rag-kb index

Architecture

Documents  ──►  Parsers (30+ formats)
                    │
                    ▼
             Structure-aware Chunker
             (headings / pages / functions)
                    │
                    ▼
             Embedding Model (CUDA / MPS / CPU)
             + Contextual Prefixes
                    │
                    ▼
             ChromaDB (HNSW cosine index)
                    │
          ┌─────────┼─────────┐
          ▼         ▼         ▼
       MCP Server  Web UI   CLI
          │
    ┌─────┼─────┐
    ▼     ▼     ▼
  Vector  BM25  Hybrid Fusion
    │           │
    ▼           ▼
  Cross-encoder Reranking
    │
    ▼
  Results

License

MIT License — see LICENSE for details.

All dependencies use MIT, Apache 2.0, or BSD licenses.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

yohasacura

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

1.0.1

Mar 1, 2026

1.0.0

Feb 26, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rag_knowledge_base-1.0.1.tar.gz (151.0 kB view details)

Uploaded Mar 1, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

rag_knowledge_base-1.0.1-py3-none-any.whl (173.5 kB view details)

Uploaded Mar 1, 2026 Python 3

File details

Details for the file rag_knowledge_base-1.0.1.tar.gz.

File metadata

Download URL: rag_knowledge_base-1.0.1.tar.gz
Upload date: Mar 1, 2026
Size: 151.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rag_knowledge_base-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`d5d0e582b2bffca4b263859a4dcafb8f573a03c04c54dd50a98d234ed8cce1c9`
MD5	`51461007e97f18e1165089a3b4b1bcf1`
BLAKE2b-256	`0bfa8b69bfba121d4c8172c7dad33db0786a3d3441a3205fd4a2334d3cb9cf18`

See more details on using hashes here.

Provenance

The following attestation bundles were made for rag_knowledge_base-1.0.1.tar.gz:

Publisher: release.yml on yohasacura/rag-knowledge-base

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rag_knowledge_base-1.0.1.tar.gz
- Subject digest: d5d0e582b2bffca4b263859a4dcafb8f573a03c04c54dd50a98d234ed8cce1c9
- Sigstore transparency entry: 1006819932
- Sigstore integration time: Mar 1, 2026
Source repository:
- Permalink: yohasacura/rag-knowledge-base@010214c9672096ad34e94266607042a93fa817e3
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/yohasacura
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@010214c9672096ad34e94266607042a93fa817e3
- Trigger Event: push

File details

Details for the file rag_knowledge_base-1.0.1-py3-none-any.whl.

File metadata

Download URL: rag_knowledge_base-1.0.1-py3-none-any.whl
Upload date: Mar 1, 2026
Size: 173.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rag_knowledge_base-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`55db3fee248a90dab85563ee415c25c360567e1c77aa5b3ad457892f5a36eb65`
MD5	`6b2ec6ee18caffc3dfc5c9d07e187f30`
BLAKE2b-256	`eab7b13fe671c20048c9b29f6d495fd76a58012ea5958ecdf0ffc79aa08963e3`

See more details on using hashes here.

Provenance

The following attestation bundles were made for rag_knowledge_base-1.0.1-py3-none-any.whl:

Publisher: release.yml on yohasacura/rag-knowledge-base

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rag_knowledge_base-1.0.1-py3-none-any.whl
- Subject digest: 55db3fee248a90dab85563ee415c25c360567e1c77aa5b3ad457892f5a36eb65
- Sigstore transparency entry: 1006819953
- Sigstore integration time: Mar 1, 2026
Source repository:
- Permalink: yohasacura/rag-knowledge-base@010214c9672096ad34e94266607042a93fa817e3
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/yohasacura
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@010214c9672096ad34e94266607042a93fa817e3
- Trigger Event: push

rag-knowledge-base 1.0.1

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

RAG Knowledge Base

Features

Core

Search Quality

Indexing Performance

Reliability

Quick Start

Install from PyPI

Optional extras

Install from source

Create and Index a RAG

Query via MCP (VS Code Copilot)

Query via MCP (HTTP)

Web Dashboard

Share a RAG

Supported File Formats

CLI Reference

MCP Tools

Configuration

Daemon Management

Crash Recovery

Architecture

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance