An MCP server designed to bridge the gap between specialized knowledge domains and AI assistants.
knowledge-mcp: Specialized Knowledge Bases for AI Agents
1. Overview and Concept
knowledge-mcp is an MCP (Model Context Protocol) server designed to bridge the gap between specialized knowledge domains and AI assistants. It allows users to create, manage, and query dedicated knowledge bases, making this information accessible to AI agents through an MCP server interface.
The core idea is to empower AI assistants that are MCP clients (like Claude Desktop or IDEs like Windsurf) to proactively consult these specialized knowledge bases during their reasoning process (Chain of Thought), rather than relying solely on general semantic search against user prompts or broad web searches. This enables more accurate, context-aware responses when dealing with specific domains.
Key components:
- CLI Tool: Provides a user-friendly command-line interface for managing knowledge bases (creating, deleting, adding/removing documents, configuring, searching).
- Knowledge Base Engine: Leverages LightRAG to handle document processing, embedding, knowledge graph creation, and complex querying.
- MCP Server: Exposes the search functionality of the knowledge bases over MCP, using FastMCP, so that compatible AI agents can query them directly.
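To make the agent side of this concrete, here is a minimal sketch of the JSON-RPC message an MCP client sends when it calls a tool. The `tools/call` method and the `name`/`arguments` fields follow the MCP specification; the tool name `search` and the argument names `kb` and `query` are assumptions made for illustration and are not confirmed names from this project.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool and argument names; the server's tool listing has the real ones.
request = build_tool_call(1, "search", {"kb": "my_docs", "query": "What is the main topic?"})
print(request)
```

In practice an MCP client library builds and sends this message for you; the sketch only shows what travels over the wire.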
2. About LightRAG
This project utilizes LightRAG (HKUDS/LightRAG) as its core engine for knowledge base creation and querying. LightRAG is a powerful framework designed to enhance Large Language Models (LLMs) by integrating Retrieval-Augmented Generation (RAG) with knowledge graph techniques.
Key features of LightRAG relevant to this project:
- Document Processing Pipeline: Ingests documents (PDF, Text, Markdown, DOCX), chunks them, extracts entities and relationships using an LLM, and builds both a knowledge graph and vector embeddings.
- Multiple Query Modes: Supports various retrieval strategies (e.g., vector similarity, entity-centric, relationship-focused, hybrid) to find the most relevant context for a given query.
- Flexible Storage: Can use different backends for storing key-value data, vectors, graph information, and document status (this project uses the default file-based storage).
- LLM/Embedding Integration: Supports various providers like OpenAI (used in this project), Ollama, Hugging Face, etc.
By using LightRAG, knowledge-mcp benefits from advanced RAG capabilities that go beyond simple vector search.
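The difference between query modes can be illustrated with a toy example. The sketch below is not LightRAG's API or its actual algorithm; it is a self-contained illustration, using made-up data and a hypothetical `hybrid_rank` helper, of how a vector-similarity score and an entity-overlap score might be blended so that a chunk mentioning the queried entity can outrank one that is merely close in embedding space.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def entity_overlap(query_entities: set[str], chunk_entities: set[str]) -> float:
    """Fraction of the query's entities that the chunk mentions."""
    if not query_entities:
        return 0.0
    return len(query_entities & chunk_entities) / len(query_entities)


def hybrid_rank(query_vec, query_entities, chunks, alpha=0.5):
    """Rank chunk ids by alpha * vector score + (1 - alpha) * entity score."""
    scored = []
    for chunk in chunks:
        score = (alpha * cosine(query_vec, chunk["vec"])
                 + (1 - alpha) * entity_overlap(query_entities, chunk["entities"]))
        scored.append((score, chunk["id"]))
    return [chunk_id for _, chunk_id in sorted(scored, reverse=True)]


# Made-up chunks: "a" is closest in vector space, "b" mentions the queried entity.
chunks = [
    {"id": "a", "vec": [1.0, 0.0], "entities": {"uv"}},
    {"id": "b", "vec": [0.6, 0.8], "entities": {"lightrag", "openai"}},
]
vector_only = hybrid_rank([1.0, 0.0], {"lightrag"}, chunks, alpha=1.0)
hybrid = hybrid_rank([1.0, 0.0], {"lightrag"}, chunks, alpha=0.5)
print(vector_only, hybrid)
```

With `alpha=1.0` only the embedding distance matters and chunk `a` ranks first; with `alpha=0.5` the entity match lifts chunk `b` to the top.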
3. Installation
Ensure you have Python 3.12 and uv installed.
- Clone the repository:

      git clone https://github.com/olafgeibig/knowledge-mcp.git
      cd knowledge-mcp

- Create a virtual environment and install dependencies using uv:

      python -m venv .venv
      source .venv/bin/activate  # Or .\.venv\Scripts\activate on Windows
      uv pip install -e ".[dev]"

  Installing with -e makes the package editable, and the [dev] extra installs the dev dependencies.
- Set up configuration:
  - Copy config.example.yaml to config.yaml.
  - Copy .env.example to .env.
  - Edit config.yaml and .env to add your API keys (e.g., OPENAI_API_KEY) and adjust paths or settings as needed. The knowledge_base.base_dir in config.yaml specifies where your knowledge base directories will be created.
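The two copy steps in the configuration setup can also be scripted. The following is a minimal sketch, not part of the project: `init_config_files` is a hypothetical helper that copies each example file to its target name and never overwrites a file you have already edited.

```python
import shutil
from pathlib import Path


def init_config_files(project_dir: Path) -> list[str]:
    """Copy each example file to its target name unless the target already exists."""
    pairs = [("config.example.yaml", "config.yaml"), (".env.example", ".env")]
    created = []
    for example_name, target_name in pairs:
        example = project_dir / example_name
        target = project_dir / target_name
        if example.exists() and not target.exists():
            shutil.copyfile(example, target)
            created.append(target_name)
    return created


if __name__ == "__main__":
    # Run from the repository root; prints the files that were created.
    print(init_config_files(Path(".")))
```

You still need to edit the resulting config.yaml and .env by hand to add your API keys.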
4. Usage (CLI)
The primary way to interact with knowledge-mcp is through its CLI, accessed via the knowledge-mcp command (if installed globally) or via uvx knowledge-mcp within the activated venv.
All commands require the --config option pointing to your main configuration file.
knowledge-mcp --config config.yaml <command> [arguments...]
Available Commands:
| Command | Description | Arguments | Status |
|---|---|---|---|
| create | Creates a new knowledge base directory and initializes its structure. | <kb-name>: Name of the knowledge base to create. | Implemented |
| delete | Deletes an existing knowledge base directory and all its contents. | <kb-name>: Name of the knowledge base to delete. | Implemented |
| list | Lists all available knowledge bases found in the base_dir. | N/A | Implemented |
| add | Adds a document: processes, chunks, embeds, stores in the specified KB. | <kb-name>: Target KB. <path>: Path to the document file. | Implemented |
| remove | Removes a document and its associated embeddings from the KB. | <kb-name>: Target KB. <doc_name>: Name/ID of the document to remove. | Implemented |
| config | Manages the KB-specific config.yaml (query parameters). | <kb_name>: Target KB. `[show\|edit]`: Subcommand (show default). | |
| search | Searches the specified knowledge base using LightRAG. | <kb-name>: Target KB. <query>: Your search query text. | Implemented |
| mcp | Runs the MCP server to expose the search functionality to AI agents. | N/A | Pending |
| shell | Starts an interactive shell session with all commands available. | N/A | Implemented |
| exit | (Within shell) Exits the interactive shell. | N/A | Implemented |
| help | (Within shell) Shows available commands and their usage. | [command] (Optional command name) | Implemented |
Example:
# Create a knowledge base named 'my_docs'
knowledge-mcp --config config.yaml create my_docs
# Add a document to it
knowledge-mcp --config config.yaml add my_docs ./path/to/mydocument.pdf
# Search the knowledge base
knowledge-mcp --config config.yaml search my_docs "What is the main topic?"
# Start the interactive shell
knowledge-mcp --config config.yaml shell
(kbmcp) list
(kbmcp) search my_docs "Another query"
(kbmcp) exit
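If you want to drive the CLI from a script rather than a terminal, the invocation pattern shown above can be wrapped in a few lines of Python. This is a minimal sketch under stated assumptions: `build_command` and `run_command` are hypothetical helpers, the knowledge-mcp executable is assumed to be on your PATH, and nothing is assumed about the format of the CLI's output beyond it being text.

```python
import shlex
import subprocess


def build_command(config_path: str, command: str, *args: str) -> list[str]:
    """Build the argument list for `knowledge-mcp --config <file> <command> [args...]`."""
    return ["knowledge-mcp", "--config", config_path, command, *args]


def run_command(config_path: str, command: str, *args: str) -> str:
    """Run the CLI and return its standard output; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_command(config_path, command, *args),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the shell-quoted command line without executing it.
argv = build_command("config.yaml", "search", "my_docs", "What is the main topic?")
print(shlex.join(argv))
```

Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or quotes.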
5. Configuration
Configuration is managed via YAML files:
- Main Configuration (config.yaml): Defines global settings like the knowledge base directory (knowledge_base.base_dir), LightRAG parameters (LLM provider/model, embedding provider/model, API keys via ${ENV_VAR} substitution), and logging settings.

      # Example structure (see config.example.yaml for full details)
      knowledge_base:
        base_dir: ./kbs # Default directory for KBs
      lightrag:
        llm:
          provider: "openai"
          model_name: "gpt-4.1-nano"
          api_key: "${OPENAI_API_KEY}"
          # ... other LLM settings
        embedding:
          provider: "openai"
          model_name: "text-embedding-3-small"
          api_key: "${OPENAI_API_KEY}"
          # ... other embedding settings
        embedding_cache:
          enabled: true
          similarity_threshold: 0.90
      logging:
        level: "INFO"
        # ... logging settings
      env_file: .env # Optional path to .env file

- Knowledge Base Specific Configuration (<base_dir>/<kb_name>/config.yaml): Contains parameters specific to querying that knowledge base, such as the LightRAG query mode, top_k results, context token limits, etc. This file is automatically created with defaults when a KB is created and can be viewed/edited using the config CLI command.
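The ${ENV_VAR} substitution used in the main configuration can be pictured as a recursive walk over the loaded settings. The sketch below is one plausible way to do it and not the project's actual implementation: `substitute_env` is a hypothetical helper, and leaving unset variables untouched is an assumption about the desired behavior.

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value, env=None):
    """Recursively replace ${NAME} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Assumption: a placeholder stays as-is when the variable is not set.
        return _ENV_PATTERN.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: substitute_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute_env(item, env) for item in value]
    return value


# A fragment of the example configuration, already parsed into a dict.
config = {"lightrag": {"llm": {"provider": "openai", "api_key": "${OPENAI_API_KEY}"}}}
resolved = substitute_env(config, env={"OPENAI_API_KEY": "sk-example"})
print(resolved["lightrag"]["llm"]["api_key"])
```

Keeping secrets in .env and referencing them by name means config.yaml can be shared or committed without exposing keys.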
6. Development
- Tech Stack: Python 3.12, uv (dependency management), hatchling (build system), pytest (testing).
- Setup: Follow the installation steps, ensuring you install with uv pip install -e ".[dev]".
- Code Style: Adheres to PEP 8.
- Testing: Run tests using uvx test or pytest.
- Dependencies: Managed in pyproject.toml. Use uv pip install <package> to add and uv pip uninstall <package> to remove dependencies, updating pyproject.toml accordingly.
- Scripts: Common tasks might be defined under [project.scripts] in pyproject.toml.
- MCP Inspector: Use npx @modelcontextprotocol/inspector uv run cli --config ./kbs/config.yaml serve to start the MCP inspector.