ZMP Knowledge Store MCP Server
A high-performance MCP (Model Context Protocol) server for managing knowledge store content with multi-modal RAG capabilities. It is powered by a Qdrant vector store with hybrid search and adds chat history logging, configurable collections, and advanced image analysis.
Latest Version: 0.3.9.2 - Enhanced with PyTorch security fixes, manual RRF implementation, and optimized Docker builds.
Recent Updates (v0.3.9.2)
Security & Compatibility Fixes
- PyTorch Security: Upgraded to PyTorch 2.6.0+cpu to resolve CVE-2025-32434 vulnerability
- Torchvision Compatibility: Fixed torchvision 0.21.0+cpu compatibility with CPU-only PyTorch
- NumPy/SciPy Stability: Resolved compatibility issues with Python 3.12 and ML dependencies
Performance Improvements
- Manual RRF Implementation: Replaced the `ranx` dependency with a custom Reciprocal Rank Fusion (RRF) implementation
- Docker Optimizations: CPU-only PyTorch builds for faster deployment and smaller images
- Network Resilience: Enhanced SSL handling and timeout configurations for reliable builds
Bug Fixes
- Startup Warnings: Removed the `get_collection_stats` warning for cleaner startup logs
- Dependency Management: Improved Poetry + pip hybrid installation strategy
- Error Handling: Enhanced error handling and logging throughout the application
Python File Structure
```
zmp_knowledge_store/
├── __init__.py            # Package initialization & metadata
├── config.py              # Configuration management with S3 integration
├── knowledge_store.py     # Main knowledge store logic (ingestion, search, multi-modal RAG)
├── qdrant_adapter.py      # Qdrant vector DB integration with hybrid search & manual RRF
├── keyword_extractor.py   # Advanced keyword extraction
├── utils.py               # Utility functions
└── server_main.py         # FastMCP server main entry point
tests/
├── test_client.py         # MCP client integration tests
├── test_hybrid_search.py  # Hybrid search validation
├── test_collection_*.py   # Configurable collection tests
├── test_*_chat_history.py # Chat history logging and retrieval tests
├── enhance_pictures.py    # Image enhancement and vision model tests
├── analyze_*.py           # Image analysis and validation scripts
└── ...                    # Other tests and utilities (80+ files)
examples/
├── zcp/                   # ZCP (CloudZ Container Platform) documentation
├── amdp/                  # AMDP (Application Modernization & Development Platform) docs
├── apim/                  # APIM (API Management) documentation
└── assets/                # Shared assets and images
```
Key Features
1. Multi-Modal Knowledge Store (knowledge_store.py)
- Handles ingestion, chunking, and search for multi-modal documentation (text + images)
- Powered by Qdrant vector store with hybrid search (dense + sparse vectors)
- SmolDocling integration for document parsing and layout understanding
- Advanced image analysis with vision language model integration
- Automatic image description generation and enhancement
- Configurable collections for multi-tenant knowledge organization
- Robust metadata enrichment and error handling
2. Hybrid Search with Manual RRF (qdrant_adapter.py)
- Custom RRF Implementation: Manual Reciprocal Rank Fusion algorithm for optimal search results
- Dense + Sparse Vectors: Combines semantic similarity with keyword-based search
- No External Dependencies: Self-contained RRF implementation without the `ranx` dependency
- Configurable Parameters: Adjustable RRF constant (k=60) for different use cases
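The fusion step can be sketched in a few lines. This is an illustrative re-implementation of RRF as described above, not the server's actual code; the function and variable names are assumptions:

```python
def rrf_fuse(dense_ranking, sparse_ranking, k=60):
    """Fuse two ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the rankings it appears
    in; k=60 matches the constant mentioned above.
    """
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `rrf_fuse(["a", "b", "c"], ["b", "c", "a"])` ranks `"b"` first because it places highly in both lists.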
3. Chat History Management
- Chat history logging with deduplication
- Hybrid search over chat history with clustering and semantic similarity
- User and session-based filtering
- Analytics and debugging support
4. Qdrant Vector Store Integration
- `qdrant_adapter.py`: Advanced Qdrant integration with hybrid search (dense + sparse vectors)
- Dynamic collection creation and management with configurable parameters
- Template-based collection creation with automatic vector configuration inheritance
- Comprehensive metadata filtering and upsert operations
5. Advanced Text Processing
- `keyword_extractor.py`: Multi-method keyword extraction (KeyBERT, spaCy, NLTK)
- Solution-specific optimization for ZCP, AMDP, and APIM platforms
- Adaptive keyword counts and domain vocabulary boosting
6. Configuration & Utilities
- `config.py`: Environment-based configuration with S3 integration
- `utils.py`: Chunking, document creation, and helper functions
- `server_main.py`: FastMCP server with tool endpoints
7. Enhanced Image Analysis
- Vision language model integration for automatic image description generation
- MDX image validation and enhanced asset mapping
- Automatic image enhancement during document ingestion
- SmolDocling integration for picture detection and processing
- S3 integration for image asset management
8. Comprehensive Testing
- 80+ test files and analysis scripts covering all functionality
- Configurable collection parameter validation
- Hybrid search and image analysis testing
- Multi-platform compatibility testing
MCP Tools Implemented
| Tool Name | Request Schema | Response Schema | Description |
|---|---|---|---|
| `ingest_documents` | `{documents: list, solution?: str, collection?: str}` | `{success: bool, results: list, total_page_count?: int}` | Ingest documents with metadata, keyword extraction, and automatic image enhancement. Supports configurable collections. |
| `search_knowledge` | `{query: str, n_results?: int, collection?: str}` | `{query: str, results: list}` | Hybrid search (dense + sparse) over the knowledge store. Supports configurable collections with auto-creation. |
| `log_chat_history` | `{query: str, response: str, user_id?: str, session_id?: str}` | `{success: bool, id?: str, error?: str}` | Log query/response pairs with deduplication by (query, user_id). |
| `search_chat_history` | `{query: str, user_id?: str, n_results?: int}` | `{query: str, user_id?: str, results: list}` | Hybrid search over chat history with optional user filtering. |
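One way to implement deduplication by (query, user_id) is to derive a deterministic point ID from the pair, so logging the same pair twice overwrites rather than duplicates. This is an illustrative sketch, not necessarily how the server derives its IDs:

```python
import hashlib

def chat_history_id(query: str, user_id: str = "") -> str:
    """Deterministic ID for a (query, user_id) pair (illustrative)."""
    # NUL separator avoids collisions between ("ab", "c") and ("a", "bc")
    return hashlib.sha256(f"{user_id}\x00{query}".encode("utf-8")).hexdigest()
```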
Usage Examples
```python
# Ingest documents (see examples/ for sample MDX and images)
result = await client.call_tool("ingest_documents", {
    "documents": [...],
    "solution": "zcp",
    "collection": "my-custom-collection"  # Optional: auto-creates if not exists
})

# Search knowledge
result = await client.call_tool("search_knowledge", {
    "query": "Group Management",
    "n_results": 3,
    "collection": "my-custom-collection"  # Optional: defaults to solution-docs
})

# Log chat history
result = await client.call_tool("log_chat_history", {
    "query": "What is a group?",
    "response": "A group is ...",
    "user_id": "user1"
})

# Search chat history
result = await client.call_tool("search_chat_history", {
    "query": "Delete a group",
    "n_results": 5
})
```
Example Documentation Sets
- `examples/zcp/`: CloudZ Container Platform documentation with tutorials and guides
- `examples/amdp/`: Application Modernization & Development Platform documentation
- `examples/apim/`: API Management platform documentation and tutorials
- Each directory contains MDX files with embedded images and comprehensive documentation
Document Processing Pipeline
Supported File Types and Processing Methods
The ZMP Knowledge Store supports multiple input formats with specialized processing for each type:
1. MDX Files (.md, .mdx) - Primary Format
Pipeline: MDX → PDF → Images → SmolDocling → Chunks → Vector Store
Processing Steps:
- Frontmatter Extraction: YAML frontmatter parsed for metadata (title, solution, keywords)
- Asset Resolution: Referenced images downloaded from S3 and mapped to local paths
- PDF Conversion: Pandoc converts MDX to PDF using XeLaTeX engine with Noto Sans Symbols font
- Page Rendering: PDF converted to individual PNG images using pdf2image
- SmolDocling Analysis: Each page processed by vision-language model for layout understanding
- Element Extraction: Text, tables, images, and structure extracted with page numbers
- Content Validation: OCR output validated against original MDX for accuracy correction
- Chunking: Elements grouped and split using character-based chunking (1024 chars, 128 overlap)
- Embedding: Dense (all-mpnet-base-v2) + sparse (SPLADE) vectors computed
- Storage: Chunks stored in Qdrant with metadata and S3 image references
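The character-based chunking step (1024 characters with 128 overlap) can be sketched as follows. Real splitters usually also respect semantic boundaries, which this simplified version omits:

```python
def chunk_text(text: str, size: int = 1024, overlap: int = 128) -> list[str]:
    """Split text into overlapping character chunks (simplified sketch)."""
    chunks = []
    step = size - overlap  # advance by 896 chars per chunk with the defaults
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last chunk reached the end of the text
    return chunks
```

Consecutive chunks share their trailing/leading 128 characters, so a sentence cut at a chunk boundary is still fully contained in one of the two chunks.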
SmolDocling Backend Selection:
- MLX Backend (`SMOLDOCLING_BACKEND=mlx`): Apple Silicon optimized, uses `ds4sd/SmolDocling-256M-preview-mlx-bf16`
- Transformers Backend (`SMOLDOCLING_BACKEND=transformers`): Cross-platform, uses `ds4sd/SmolDocling-256M-preview`
- Auto-Detection (`SMOLDOCLING_BACKEND=auto`): Platform-based selection (default)
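The auto-detection amounts to a platform check. A minimal sketch under that assumption (the project's actual selection code may differ):

```python
import platform

def select_smoldocling_backend(env_value: str = "auto") -> str:
    """Resolve SMOLDOCLING_BACKEND: explicit values win; 'auto' prefers
    MLX on Apple Silicon and Transformers everywhere else (illustrative)."""
    if env_value in ("mlx", "transformers"):
        return env_value
    # Apple Silicon reports Darwin/arm64
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "mlx"
    return "transformers"
```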
2. PDF Files (.pdf) - Direct Processing
Pipeline: PDF → Images → SmolDocling → Chunks → Vector Store
Processing Steps:
- Page Extraction: PDF pages converted directly to PNG images
- SmolDocling Analysis: Same vision-language processing as MDX pipeline
- Element Processing: Text, tables, images extracted with page metadata
- Chunking & Storage: Same as MDX pipeline without MDX validation
3. Image Files (.png, .jpg, .jpeg) - Single Page
Pipeline: Image → SmolDocling → Chunks → Vector Store
Processing Steps:
- Image Loading: Single image loaded as PIL Image object
- SmolDocling Processing: Vision-language model extracts content and structure
- Element Extraction: All content treated as single page (page_no=1)
- Chunking & Storage: Standard pipeline for extracted elements
Element Types and Processing
Text Elements
- Types: `text`, `paragraph`, `heading`, `title`
- Processing: Direct text extraction with markdown formatting preservation
- Chunking: Character-based splitting with semantic boundaries
- Metadata: Page number, chunk order, content type
Table Elements
- Types: `table`, `otsl` (OTSL: Optimized Table Structure Language)
- Processing: OTSL serialized to markdown format for better searchability
- Fallback: Raw OTSL XML if markdown serialization fails
- Enhancement: Caption grouping with adjacent caption elements
Image Elements
- Type: `picture`
- Processing: Mapped to S3 URLs from asset metadata
- Validation: MDX image references validated against detected pictures
- Format: Markdown image syntax ``
- Metadata: S3 keys stored in the `assets_s3_keys` field
List Elements
- Types: `list_item`, `unordered_list`, `ordered_list`
- Processing: Items grouped and buffered until a size threshold is reached
- Chunking: Multiple list items combined into single chunks (up to 1024 chars)
- Structure: Nested lists flattened with proper indentation
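The grouping of list items into buffered chunks might look like this simplified sketch, reusing the 1024-character limit from the chunking step (the project's buffering logic may differ):

```python
def buffer_list_items(items: list[str], max_chars: int = 1024) -> list[str]:
    """Combine consecutive list items into chunks of up to max_chars."""
    chunks: list[str] = []
    buf = ""
    for item in items:
        line = f"- {item}\n"
        # Flush the buffer when adding this item would exceed the limit
        if buf and len(buf) + len(line) > max_chars:
            chunks.append(buf)
            buf = ""
        buf += line
    if buf:
        chunks.append(buf)
    return chunks
```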
Advanced Processing Features
Image Reference Validation
```python
import re

# Extract ordered image references from the MDX source
image_refs = re.findall(r"!\[[^\]]*\]\(([^)]+)\)", content)

# Map each reference to an S3 key and validate it against SmolDocling's
# detected pictures; assignments are tracked to prevent false positives.
ordered_image_ref_to_s3 = []
for ref in image_refs:
    ...  # match ref against asset metadata and record its S3 key
```
Content Correction Pipeline
```python
import difflib

def word_level_correction(chunk, mdx):
    """Correct OCR hallucinations using the original MDX content."""
    mdx_words = set(mdx.split())
    chunk_words = chunk.split()
    corrected_words = []
    for word in chunk_words:
        # Keep exact (case-insensitive) matches; otherwise substitute the
        # closest MDX word above the similarity cutoff.
        if any(word.lower() == w.lower() for w in mdx_words):
            corrected_words.append(word)
        else:
            matches = difflib.get_close_matches(word, mdx_words, n=1, cutoff=0.8)
            corrected_words.append(matches[0] if matches else word)
    return " ".join(corrected_words)
```
Hybrid Vector Embeddings
- Dense Vectors: 768-dimensional using `sentence-transformers/all-mpnet-base-v2`
- Sparse Vectors: SPLADE-based with separate document and query models:
  - Document: `naver/efficient-splade-VI-BT-large-doc`
  - Query: `naver/efficient-splade-VI-BT-large-query`
- Fusion: Manual RRF (Reciprocal Rank Fusion) with k=60 constant
Error Handling and Fallbacks
SmolDocling Model Loading
- Failure Handling: Graceful degradation with disabled OCR functionality
- Backend Switching: Auto-fallback from MLX to Transformers on compatibility issues
- Retry Logic: Multiple prompt attempts for DocTags extraction
Asset Management
- S3 Download: Automatic retry with error logging
- Local Caching: Images cached in timestamped temp directories
- Cleanup: Temp files removed after processing (configurable by LOG_LEVEL)
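The timestamped temp-directory lifecycle can be sketched as follows; the directory name is illustrative, not the project's actual path scheme:

```python
import os
import shutil
import tempfile
import time

# Create a timestamped temp directory for downloaded assets...
cache_dir = os.path.join(tempfile.gettempdir(), f"zmp_assets_{int(time.time())}")
os.makedirs(cache_dir, exist_ok=True)
try:
    # ... download S3 assets and render pages into cache_dir here ...
    pass
finally:
    # ... and remove it after processing, even if processing failed
    shutil.rmtree(cache_dir, ignore_errors=True)
```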
Content Validation
- MDX Validation: OCR output compared against source content
- Image Mapping: False positive detection for picture elements
- Metadata Preservation: Page numbers and structure maintained throughout pipeline
Running the Tests
The project includes 80+ comprehensive tests and analysis scripts:
```bash
# Run all tests
pytest tests/

# Run specific test categories
pytest tests/test_client.py          # MCP client integration tests
pytest tests/test_hybrid_search.py   # Hybrid search validation
pytest tests/test_collection_*.py    # Configurable collection tests
pytest tests/test_*_chat_history.py  # Chat history tests

# Run image analysis scripts
python tests/enhance_pictures.py     # Image enhancement validation
python tests/analyze_*.py            # Various analysis scripts
```
Key test files:
test_client.py: MCP client integration and tool validationtest_hybrid_search.py: Qdrant hybrid search functionalitytest_collection_*.py: Configurable collection parameter testingtest_*_chat_history.py: Chat history logging and retrievalenhance_pictures.py: Vision model and image enhancement validationanalyze_*.py: Image analysis and validation scripts
Configurable Collections
The MCP server supports configurable collections for multi-tenant knowledge organization:
- Auto-Creation: Collections are automatically created when specified if they don't exist
- Template-Based: New collections inherit vector configurations from existing collections
- Backward Compatibility: Tools work without collection parameter (defaults to "solution-docs")
- Collection Parameter: Optional `collection` parameter available for `ingest_documents` and `search_knowledge`
Platform Compatibility & Configuration
SmolDocling Backend Selection
Controlled by the SMOLDOCLING_BACKEND environment variable:
- `mlx`: MLX backend (Apple Silicon optimized)
- `transformers`: HuggingFace Transformers backend (cross-platform)
- `auto`: Auto-select based on platform (recommended)
Vision Model Integration
- Supports vision language models for automatic image description generation
- Uses transformers library with vision models for AI-generated descriptions
- Fallback mechanisms when vision models are unavailable
Vector Store Configuration
- Qdrant: High-performance hybrid search with dense + sparse vectors
- Dynamic collection management with configurable parameters
- Template-based collection creation and automatic configuration inheritance
- Advanced metadata filtering and semantic search capabilities
Environment Configuration
Key environment variables (set in .env file):
```bash
# Vector Store Configuration
QDRANT_URL=http://localhost:6333
DOCUMENT_COLLECTION=solution-docs

# Model Configuration
SMOLDOCLING_BACKEND=auto   # auto, mlx, or transformers
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# S3 Configuration (for asset storage)
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
S3_BUCKET_NAME=your_bucket_name

# Server Configuration
PORT=5371
HOST=0.0.0.0
```
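Reading these variables in Python, with the defaults implied above (an illustrative sketch; the project's `config.py` may structure this differently):

```python
import os

# Fall back to the documented defaults when a variable is unset
QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6333")
DOCUMENT_COLLECTION = os.getenv("DOCUMENT_COLLECTION", "solution-docs")
SMOLDOCLING_BACKEND = os.getenv("SMOLDOCLING_BACKEND", "auto")
PORT = int(os.getenv("PORT", "5371"))  # numeric values need explicit conversion
HOST = os.getenv("HOST", "0.0.0.0")
```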
Installation & Deployment
Installation
```bash
# Clone the repository
git clone <repository-url>
cd zmp-knowledge-store-mcp-server

# Install dependencies with Poetry
poetry install

# Or install with pip
pip install -e .
```
Running the Server
```bash
# Set up environment variables
cp .env.example .env
# Edit .env with your configuration

# Run the FastMCP server
python zmp_knowledge_store/server_main.py
```
Docker Deployment
Optimized Docker build with CPU-only PyTorch for faster builds and smaller images:
```bash
# Build Docker image (optimized for production)
docker build -t zmp-knowledge-store-mcp-server .

# Run with Docker
docker run -p 5371:5371 --env-file .env zmp-knowledge-store-mcp-server

# Build and push (for CI/CD)
./k8s/build-and-push.sh
```
Build Optimizations:
- CPU-only PyTorch: PyTorch 2.6.0+cpu and torchvision 0.21.0+cpu for security and compatibility
- Multi-stage build: Smaller production images with optimized layers
- SSL certificate handling: Secure package downloads with trusted hosts
- Network resilience: Enhanced timeout and retry configurations
- Manual RRF: Self-contained hybrid search without external dependencies
Security Features:
- PyTorch 2.6.0+: Resolves CVE-2025-32434 security vulnerability
- CPU-only builds: Eliminates CUDA dependencies for faster, more secure deployments
- Comprehensive error handling: Robust error handling and logging throughout
Version History
v0.3.9.2 (Latest)
- ✅ Security Fix: PyTorch upgraded to 2.6.0+cpu (CVE-2025-32434)
- ✅ Compatibility: Torchvision 0.21.0+cpu compatibility
- ✅ Performance: Manual RRF implementation (no `ranx` dependency)
- ✅ Cleanup: Removed `get_collection_stats` warning
- ✅ Optimization: Enhanced Docker build with CPU-only ML stack
v0.3.9.1
- ✅ Security: Initial PyTorch security vulnerability fixes
- ✅ Compatibility: NumPy/SciPy compatibility improvements
- ✅ Stability: Enhanced error handling and logging
v0.3.0
- ✅ Features: Configurable collections and enhanced search
- ✅ Performance: Optimized hybrid search implementation
v0.2.9
- ✅ Foundation: Initial release with core MCP server functionality