Solr MCP
A Python package for accessing Apache Solr indexes via Model Context Protocol (MCP). This integration allows AI assistants like Claude to perform powerful search queries against your Solr indexes, combining both keyword and vector search capabilities.
Features
- MCP Server: Implements the Model Context Protocol for integration with AI assistants
- Hybrid Search: Combines keyword search precision with vector search semantic understanding
- Vector Embeddings: Generates embeddings for documents using Ollama with nomic-embed-text
- Unified Collections: Store both document content and vector embeddings in the same collection
- Docker Integration: Easy setup with Docker and docker-compose
- Optimized Vector Search: Pushes SQL filters down to the vector search stage, so combined vector and SQL queries stay efficient even with large result sets and pagination
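As a rough illustration of how the embedding and vector search features fit together, the sketch below builds a request for Ollama's `/api/embeddings` endpoint and then formats the resulting vector as a Solr `{!knn}` query with optional filter queries. It is not taken from this package's source: the field name `embedding`, the Ollama URL (its usual local default), and the helper names are assumptions made for the example.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its usual default port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_embedding_request(text, model="nomic-embed-text"):
    """Build (but do not send) a POST request for Ollama's embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text):
    """Send the request and return the embedding vector (needs a running Ollama)."""
    with urllib.request.urlopen(build_embedding_request(text)) as response:
        return json.loads(response.read())["embedding"]


def build_knn_params(vector, field="embedding", top_k=10, filters=()):
    """Format Solr request parameters for a knn query.

    `field` is a hypothetical dense-vector field name; `filters` are plain
    Solr filter queries (fq) that restrict which documents may be returned.
    """
    vector_literal = "[" + ", ".join(str(x) for x in vector) + "]"
    return {
        "q": f"{{!knn f={field} topK={top_k}}}{vector_literal}",
        "fq": list(filters),
        "fl": "id,score",
    }
```

With Ollama running, `build_knn_params(embed("proof of work"), filters=["section:intro"])` would produce parameters that can be sent to a collection's `/select` handler. How filter queries interact with knn scoring depends on the Solr version, so check the Dense Vector Search documentation for the release you run.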
Architecture
Vector Search Optimization
The system employs an important optimization for combined vector and SQL queries. When executing a query that includes both vector similarity search and SQL filters:
- SQL filters (WHERE clauses) are pushed down to the vector search stage
- This ensures that vector similarity calculations are only performed on documents that will match the final SQL criteria
- Significantly improves performance for queries with:
  - Selective WHERE clauses
  - Pagination (LIMIT/OFFSET)
  - Large result sets
This optimization reduces computational overhead and network transfer by minimizing the number of vector similarity calculations needed.
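The effect of filter pushdown can be seen in a toy, in-memory comparison. The sketch below is not the package's implementation; it uses brute-force cosine similarity over a handful of made-up documents (the `Doc` type and both search functions are hypothetical) and counts how many similarity calculations each strategy performs.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    id: str
    section: str
    vector: tuple


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search_post_filter(docs, query, predicate, limit):
    """Score every document first, then apply the filter to the scored list."""
    scored = [(cosine(d.vector, query), d) for d in docs]
    hits = [pair for pair in scored if predicate(pair[1])]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [d.id for _, d in hits[:limit]], len(scored)


def search_pushdown(docs, query, predicate, limit):
    """Apply the filter first, so only matching documents are scored."""
    candidates = [d for d in docs if predicate(d)]
    scored = [(cosine(d.vector, query), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d.id for _, d in scored[:limit]], len(candidates)


# Made-up sample data: two documents per section.
docs = [
    Doc("a", "intro", (1.0, 0.0)),
    Doc("b", "intro", (0.6, 0.8)),
    Doc("c", "mining", (0.9, 0.1)),
    Doc("d", "mining", (0.0, 1.0)),
]
query = (1.0, 0.0)


def is_intro(doc):
    return doc.section == "intro"


post_ids, post_cost = search_post_filter(docs, query, is_intro, limit=1)
push_ids, push_cost = search_pushdown(docs, query, is_intro, limit=1)
print(post_ids, post_cost)  # ['a'] 4
print(push_ids, push_cost)  # ['a'] 2
```

Both strategies return the same document here, but the pushdown variant scores only the two documents that satisfy the filter. The gap widens as the filter becomes more selective relative to the size of the collection.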
Quick Start
- Clone this repository
- Start SolrCloud with Docker:
docker-compose up -d
- Install dependencies:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install poetry
poetry install
- Process and index the sample document:
python scripts/process_markdown.py data/bitcoin-whitepaper.md --output data/processed/bitcoin_sections.json
python scripts/create_unified_collection.py unified
python scripts/unified_index.py data/processed/bitcoin_sections.json --collection unified
- Run the MCP server:
poetry run python -m solr_mcp.server
For more detailed setup and usage instructions, see the QUICKSTART.md guide.
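Before indexing or starting the server, it can help to confirm that Solr and Ollama are reachable. The following is a minimal sketch and not part of this package; it assumes the usual default ports (8983 for Solr, 11434 for Ollama), which may differ if your docker-compose.yml or local setup maps them elsewhere.

```python
import socket

# Assumed defaults; adjust to match your docker-compose.yml and Ollama setup.
SERVICES = {
    "solr": ("localhost", 8983),
    "ollama": ("localhost", 11434),
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services=SERVICES):
    """Map each service name to whether its port accepted a connection."""
    return {name: is_port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```

An open port only shows that something is listening, not that the service is healthy, so treat this as a quick first check.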
Requirements
- Python 3.10 or higher
- Docker and Docker Compose
- SolrCloud 9.x
- Ollama (for embedding generation)
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.