MCP-RAG: Low-Latency RAG Service
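A low-latency RAG (Retrieval-Augmented Generation) service architecture built on MCP (Model Context Protocol).
Features
- Very low latency (<100 ms) for local knowledge retrieval
- Two modes: Raw mode (retrieval only) and Summary mode (retrieval plus summarization)
- LLM summarization: supports LLM providers such as Doubao and Ollama for summarizing retrieved content
- Modular architecture: the MCP Server acts as a unified knowledge interface layer
- Async optimization: asynchronous calls and a model warm-up mechanism
- Extensible design: interfaces reserved for reranker and cache modules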
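The difference between the two modes can be sketched as follows. This is a minimal illustration, not the project's actual code: the `answer` function, the `Retriever` and `Summarizer` types, the stand-in implementations, and the mode string `"summary"` are all assumptions made for the example (only `"raw"` appears in this README).

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical types: a retriever returns text chunks, a summarizer condenses them.
Retriever = Callable[[str, int], Awaitable[List[str]]]
Summarizer = Callable[[str, List[str]], Awaitable[str]]


async def answer(query: str, mode: str, limit: int,
                 retrieve: Retriever, summarize: Summarizer) -> str:
    """Dispatch between Raw mode and Summary mode (illustrative only)."""
    chunks = await retrieve(query, limit)
    if mode == "raw":
        # Raw mode: return the retrieved chunks without calling an LLM.
        return "\n".join(chunks)
    if mode == "summary":
        # Summary mode: retrieval followed by an LLM summarization step.
        return await summarize(query, chunks)
    raise ValueError(f"unknown mode: {mode!r}")


# Stand-in implementations so the sketch runs without a vector store or an LLM.
async def fake_retrieve(query: str, limit: int) -> List[str]:
    corpus = ["chunk A", "chunk B", "chunk C"]
    return corpus[:limit]


async def fake_summarize(query: str, chunks: List[str]) -> str:
    return f"summary of {len(chunks)} chunks for: {query}"


raw_result = asyncio.run(answer("demo", "raw", 2, fake_retrieve, fake_summarize))
summary_result = asyncio.run(answer("demo", "summary", 2, fake_retrieve, fake_summarize))
```

Raw mode skips the LLM call entirely, which is presumably why it is the mode suited to the low-latency path.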
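Tech stack
- Backend framework: FastAPI
- Vector database: ChromaDB (deployed locally)
- Embedding model: Doubao embedding API (default); local models optional (m3e-small / e5-small via sentence-transformers)
- LLM: Doubao API, Ollama (deployed locally)
- Protocol: MCP (Model Context Protocol)
- Package management: uv (a modern Python package manager)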
Quick start
1. Requirements
- Python >= 3.13
- The uv package manager
2. Install dependencies
# Basic install (cloud APIs only)
uv sync
# If you need the local embedding models (m3e-small, e5-small)
uv sync --extra local-embeddings
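3. Start the service
uv run mcp-rag serve
The first launch reports an error (a known issue that has not been fixed yet).
Once the configuration file has been filled in, the service starts normally.
Web configuration pages
uv run mcp-rag web
- Configuration page: http://localhost:8000/config-page
- Document management page: http://localhost:8000/documents-page
- HTTP API (Swagger UI): http://localhost:8000/docs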
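4. Configuration management
MCP-RAG now uses a JSON file for persistent configuration management.
The data\config.json file stores the configuration, which can be edited and saved through the web interface.
Default configuration example: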
{
  "host": "0.0.0.0",
  "port": 8000,
  "http_port": 8000,
  "debug": false,
  "vector_db_type": "chroma",
  "chroma_persist_directory": "./data/chroma",
  "qdrant_url": "http://localhost:6333",
  "embedding_provider": "doubao",
  "embedding_model": "doubao-embedding-text-240715",
  "embedding_device": "cpu",
  "embedding_cache_dir": null,
  "embedding_api_key": "KEY-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "embedding_base_url": "https://ark.cn-beijing.volces.com/api/v3",
  "llm_provider": "doubao",
  "llm_model": "doubao-seed-1-6-flash-250828",
  "llm_base_url": "https://ark.cn-beijing.volces.com/api/v3",
  "llm_api_key": "KEY-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "enable_llm_summary": false,
  "enable_thinking": false,
  "max_retrieval_results": 5,
  "similarity_threshold": 0.7,
  "enable_reranker": false,
  "enable_cache": false
}
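To illustrate how a JSON-backed configuration like this one can be read, here is a minimal sketch that overlays the file's values on a set of defaults. The `load_config` helper and the range check on `similarity_threshold` are assumptions for the example; MCP-RAG's own loader may work differently.

```python
import json
import tempfile
from pathlib import Path

# A subset of the default keys from the example configuration; the loader is hypothetical.
DEFAULTS = {
    "host": "0.0.0.0",
    "port": 8000,
    "vector_db_type": "chroma",
    "max_retrieval_results": 5,
    "similarity_threshold": 0.7,
    "enable_llm_summary": False,
}


def load_config(path: Path) -> dict:
    """Return DEFAULTS overlaid with whatever the JSON file provides."""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    # Assumed sanity check: treat the threshold as a score in [0, 1].
    if not 0.0 <= config["similarity_threshold"] <= 1.0:
        raise ValueError("similarity_threshold must be between 0 and 1")
    return config


# Demonstration against a temporary file instead of the real data/config.json.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(json.dumps({"port": 9000}), encoding="utf-8")
    loaded = load_config(config_path)               # file overrides the port
    missing = load_config(Path(tmp) / "absent.json")  # falls back to defaults
```

Falling back to defaults when the file is absent is one way to let a service start before any configuration has been saved.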
MCP server configuration
The Xiaozhi Go server (小智go服务端) can interact with MCP-RAG over the MCP protocol. An example configuration:
{
  "mcpServers": {
    "rag": {
      "command": "uv",
      "args": [
        "run",
        "mcp-rag",
        "serve"
      ],
      "env": {
        "PYTHONUNBUFFERED": "1",
        "MODEL_TYPE": "OPENAI",
        "OPENAI_API_KEY": "KEY-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "OPENAI_API_BASE": "https://ark.cn-beijing.volces.com/api/v3",
        "OPENAI_MODEL": "doubao-1-5-pro-32k-250115",
        "OPENAI_TEMPERATURE": "0",
        "EMBEDDING_PROVIDER": "OPENAI",
        "OPENAI_EMBEDDING_MODEL": "doubao-embedding-text-240715",
        "COLLECTION_NAME": "default_collection"
      }
    }
  }
}
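If you prefer not to type the API key into the configuration by hand, the same structure can be generated from an environment variable. The sketch below is one possible approach; the `build_mcp_config` helper and the fallback placeholder are assumptions, and only part of the `env` section is reproduced.

```python
import json
import os


def build_mcp_config(api_key: str) -> dict:
    """Build the mcpServers entry for MCP-RAG (hypothetical helper)."""
    return {
        "mcpServers": {
            "rag": {
                "command": "uv",
                "args": ["run", "mcp-rag", "serve"],
                "env": {
                    "PYTHONUNBUFFERED": "1",
                    "MODEL_TYPE": "OPENAI",
                    "OPENAI_API_KEY": api_key,
                    "OPENAI_API_BASE": "https://ark.cn-beijing.volces.com/api/v3",
                    "COLLECTION_NAME": "default_collection",
                },
            }
        }
    }


# Read the key from the environment; the fallback is a placeholder, not a real key.
api_key = os.environ.get("OPENAI_API_KEY", "KEY-placeholder")
config_text = json.dumps(build_mcp_config(api_key), indent=2)
```

Printing `config_text` yields JSON that can be pasted into the client's MCP configuration.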
5. Using the MCP tool
{
  "name": "rag_ask",
  "arguments": {
    "query": "your query text",
    "mode": "raw",
    "limit": 5
  }
}
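MCP clients normally send a tool invocation like the one above as a JSON-RPC 2.0 `tools/call` request. The following sketch shows how that payload is wrapped; the `build_tool_call` helper and the sample query are illustrative assumptions, and a real client library would usually build this message for you.

```python
import json
from itertools import count

# Monotonic request IDs, as JSON-RPC expects each request to carry its own id.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "rag_ask",
    {"query": "What is MCP?", "mode": "raw", "limit": 5},
)
# ensure_ascii=False keeps non-ASCII queries (for example Chinese) readable on the wire.
wire_message = json.dumps(request, ensure_ascii=False)
```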
License
MIT License
Contributing
Issues and pull requests are welcome!