rag-mem
RAG and Memory tools exposed via Model Context Protocol (MCP).
Features
- RAG (Retrieval-Augmented Generation): Semantic search over your documents
  - Hybrid retrieval (vector + BM25)
  - CrossEncoder reranking (optional)
  - Support for PDF, Markdown, Python, JSON, Jupyter notebooks
- Memory System: Persistent fact/memory storage
  - BM25-based fast search
  - ChromaDB vector fallback
  - Simple CRUD operations
- Multiple Embedding Providers:
  - Ollama (default, requires Ollama running)
  - SentenceTransformers (pip install rag-mem[local])
  - OpenAI (pip install rag-mem[openai])
  - Anthropic/Voyage (pip install rag-mem[anthropic])
  - Cohere (pip install rag-mem[cohere])
- LLM-Agnostic: Works with any LLM client that supports MCP
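To make "hybrid retrieval" concrete: one common way to merge a vector ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only. This README does not say which fusion method rag-mem uses, and the function name `reciprocal_rank_fusion`, the constant `k=60`, and the sample document IDs are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Hypothetical helper, not part of rag-mem.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["doc_a", "doc_b", "doc_c"]  # made-up vector-search order
bm25_hits = ["doc_b", "doc_d", "doc_a"]    # made-up BM25 order
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise above documents that appear in only one, which is the usual motivation for combining lexical and semantic search.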
Installation
# Fast install with uv (recommended)
uv pip install rag-mem
# Or with pip
pip install rag-mem
Default: Uses Ollama for embeddings (free, local, private).
# One-time Ollama setup:
ollama pull nomic-embed-text
No Ollama? Use offline embeddings instead:
pip install rag-mem[local]
export MEMORY_MCP_EMBED_PROVIDER=sentence-transformers
Quick Start
1. Initialize Configuration
memory-mcp init
This creates ~/.memory-mcp/config.toml with default settings.
2. Start the Server
# Basic server
memory-mcp serve
# With document paths for RAG
memory-mcp serve --docs ./documents ./notes
# With specific embedding provider
memory-mcp serve --embed-provider openai --embed-model text-embedding-3-small
3. Connect from Claude Desktop
Add to your Claude Desktop config (~/.config/claude/claude_desktop_config.json; on macOS the file is usually ~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp",
      "args": ["serve", "--docs", "/path/to/your/documents"]
    }
  }
}
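If the Claude Desktop config already lists other servers, the "memory" entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the helper name `add_mcp_server` and the pre-existing "other" server are hypothetical, and reading or writing the actual file is left out.

```python
import json


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with one entry added under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": args}
    merged["mcpServers"] = servers
    return merged


# Made-up existing config with one unrelated server already registered.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_mcp_server(
    existing, "memory", "memory-mcp", ["serve", "--docs", "/path/to/your/documents"]
)
print(json.dumps(updated, indent=2))
```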
Configuration
Configuration is loaded from (in order of precedence):
- Environment variables (prefixed with MEMORY_MCP_)
- Config file (~/.memory-mcp/config.toml)
- Default values
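The precedence order above can be sketched as a three-step lookup. This is not rag-mem's implementation (the package layout mentions Pydantic settings, which handle this internally); `resolve_setting`, the `DEFAULTS` table, and the plain-dict stand-in for the parsed config file are assumptions made for illustration.

```python
import os

# Assumed defaults for illustration; only "ollama" is stated as the default in this README.
DEFAULTS = {"embed_provider": "ollama", "rag_top_k": "5"}


def resolve_setting(key: str, file_values: dict, env=None) -> str:
    """Look a setting up in the environment, then the config file, then defaults."""
    env = os.environ if env is None else env
    env_key = "MEMORY_MCP_" + key.upper()
    if env_key in env:
        return env[env_key]
    if key in file_values:
        return file_values[key]
    return DEFAULTS[key]


file_values = {"embed_provider": "sentence-transformers"}  # as if read from config.toml
env = {"MEMORY_MCP_EMBED_PROVIDER": "openai"}              # as if exported in the shell

print(resolve_setting("embed_provider", file_values, env))  # environment wins
print(resolve_setting("embed_provider", file_values, {}))   # falls back to the file
print(resolve_setting("embed_provider", {}, {}))            # falls back to the default
```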
Environment Variables
export MEMORY_MCP_EMBED_PROVIDER=openai
export MEMORY_MCP_OPENAI_API_KEY=sk-...
export MEMORY_MCP_QDRANT_MODE=cloud
export MEMORY_MCP_QDRANT_URL=https://your-cluster.qdrant.io
export MEMORY_MCP_QDRANT_API_KEY=...
Config File Example
# ~/.memory-mcp/config.toml
# Default: Ollama (free, local)
embed_provider = "ollama"
embed_model = "nomic-embed-text"
# Alternative: Offline (pip install rag-mem[local])
# embed_provider = "sentence-transformers"
# embed_model = "all-MiniLM-L6-v2"
# RAG settings
qdrant_mode = "local"
rag_chunk_size = 700
rag_top_k = 5
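As a rough picture of what a chunk size setting controls, the sketch below splits text into pieces of at most 700 characters on word boundaries. It is an assumption-laden illustration: this README does not say whether rag_chunk_size counts characters or tokens, nor how rag-mem actually splits documents, and `chunk_text` is a hypothetical helper.

```python
def chunk_text(text: str, chunk_size: int = 700) -> list[str]:
    """Greedily pack whitespace-separated words into chunks.

    Each chunk is at most chunk_size characters, except that a single word
    longer than chunk_size is kept whole in its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


text = " ".join(["word"] * 300)  # made-up input: 300 four-letter words
chunks = chunk_text(text, chunk_size=700)
print(len(chunks), [len(c) for c in chunks])
```

At query time, a setting like rag_top_k = 5 would then cap how many of the best-matching chunks are returned.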
Docker
# Build
docker build -t memory-mcp .
# Run with OpenAI embeddings
docker run -it --rm \
  -v "$(pwd)/documents:/docs:ro" \
  -v "$(pwd)/data:/data" \
  -e MEMORY_MCP_EMBED_PROVIDER=openai \
  -e MEMORY_MCP_OPENAI_API_KEY=sk-... \
  memory-mcp serve --docs /docs
# Run with Ollama (requires host network for Ollama access)
docker run -it --rm \
  --network host \
  -v "$(pwd)/documents:/docs:ro" \
  -v "$(pwd)/data:/data" \
  memory-mcp serve --docs /docs
Available Tools
When connected via MCP, these tools are available:
RAG Tools
- query_knowledge_base: Search indexed documents
  - query: Search query
  - doc_path: Optional specific document path
  - top_k: Number of results
Memory Tools
- save_memory: Store text content
- save_fact: Store structured fact with metadata
- search_memories: Search stored memories
- delete_memory: Delete by ID
- list_all_memories: List all stored memories
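MCP clients invoke these tools with a JSON-RPC 2.0 request whose method is "tools/call". A client library normally builds this message for you; the sketch below only shows its shape for query_knowledge_base. The helper name `build_tool_call` and the argument types (for example, top_k as an integer) are assumptions.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


request = build_tool_call(
    1,
    "query_knowledge_base",
    {"query": "How does authentication work?", "top_k": 3},
)
print(request)
```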
CLI Commands
# Initialize config
memory-mcp init
# Show current config
memory-mcp config
# Start MCP server
memory-mcp serve [--docs PATH...] [--embed-provider PROVIDER]
# Index documents (without starting server)
memory-mcp index PATH... [--force]
Python API
from memory_mcp import Settings, create_server
from memory_mcp.rag import RAGPipeline
from memory_mcp.memory import MemoryStore
# Custom settings
settings = Settings(
    embed_provider="openai",
    openai_api_key="sk-...",
)
# Use RAG directly
pipeline = RAGPipeline(
    settings=settings,
    document_paths=["./docs"],
)
pipeline.index()
results = pipeline.search("How does authentication work?")
# Use memory directly
memory = MemoryStore(settings)
memory.add("User prefers dark mode")
memories = memory.search("preferences")
Custom Embedding Providers
Implement the EmbeddingProvider interface:
from memory_mcp.embeddings.base import EmbeddingProvider

class MyEmbeddings(EmbeddingProvider):
    @property
    def dimension(self) -> int:
        return 768

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # Your implementation
        pass

    def embed_query(self, text: str) -> list[float]:
        # Your implementation
        pass
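For a sense of what a filled-in provider might look like, here is a toy implementation of the same three members using word hashing. It is only useful for tests, since hashed bag-of-words vectors carry no semantic meaning. The class name `HashEmbeddings` and the 64-dimension default are assumptions, and the class is deliberately left standalone (not subclassing EmbeddingProvider) so the sketch runs without rag-mem installed.

```python
import hashlib
import math


class HashEmbeddings:
    """Toy provider: deterministic bag-of-words hashing, for tests only.

    In a real project this would subclass
    memory_mcp.embeddings.base.EmbeddingProvider.
    """

    def __init__(self, dimension: int = 64) -> None:
        self._dimension = dimension

    @property
    def dimension(self) -> int:
        return self._dimension

    def embed_query(self, text: str) -> list[float]:
        vector = [0.0] * self._dimension
        for token in text.lower().split():
            # Map each token to a fixed bucket via a stable hash.
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % self._dimension
            vector[index] += 1.0
        # L2-normalize so dot products behave like cosine similarity.
        norm = math.sqrt(sum(v * v for v in vector))
        return [v / norm for v in vector] if norm else vector

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_query(t) for t in texts]


provider = HashEmbeddings()
docs = provider.embed_documents(["dark mode", "light mode"])
query = provider.embed_query("dark mode")
print(provider.dimension, len(docs), len(query))
```

The one contract worth noting is that every vector returned by embed_documents and embed_query has length equal to dimension; a vector store would typically rely on that.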
Architecture
memory-mcp/
├── embeddings/ # Pluggable embedding providers
├── rag/ # RAG pipeline (Qdrant + BM25 + reranking)
├── memory/ # Memory store (ChromaDB + BM25)
├── server.py # FastMCP server
├── config.py # Pydantic settings
└── cli.py # CLI entry point
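BM25 appears in both the rag/ and memory/ components above. For readers unfamiliar with it, the following is a compact sketch of standard Okapi BM25 scoring over whitespace-tokenized text. It is not rag-mem's code: `bm25_scores`, the parameter values `k1=1.5` and `b=0.75`, and the sample memories other than "User prefers dark mode" are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query: str, documents: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25.

    Assumes at least one non-empty document.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores


memories = [
    "User prefers dark mode",
    "User lives in Berlin",        # made-up memory
    "Project deadline is Friday",  # made-up memory
]
scores = bm25_scores("dark mode", memories)
best = memories[scores.index(max(scores))]
print(best)
```

Because BM25 only matches exact terms, a query such as "preferences" would score zero against "User prefers dark mode", which is the kind of case where a vector fallback helps.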
License
MIT