MCP server with FAISS local vector database for RAG

Local FAISS MCP Server

License: MIT Python 3.10+ Tests

A Model Context Protocol (MCP) server that provides local vector database functionality using FAISS for Retrieval-Augmented Generation (RAG) applications.

Features

  • Local Vector Storage: Uses FAISS for efficient similarity search without external dependencies
  • Document Ingestion: Automatically chunks and embeds documents for storage
  • Semantic Search: Query documents using natural language with sentence embeddings
  • Persistent Storage: Indexes and metadata are saved to disk
  • MCP Compatible: Works with any MCP-compatible AI agent or client

Quickstart

pip install local-faiss-mcp

Then configure your MCP client (see Configuration) and try your first query in Claude:

Use the query_rag_store tool to search for: "How does FAISS perform similarity search?"

Claude will retrieve relevant document chunks from your vector store and use them to answer your question.

Installation

From PyPI (Recommended)

pip install local-faiss-mcp

From Source

git clone https://github.com/nonatofabio/local_faiss_mcp.git
cd local_faiss_mcp
pip install -e .

Usage

Running the Server

After installation, you can run the server in three ways:

1. Using the installed command (easiest):

local-faiss-mcp --index-dir /path/to/index/directory

2. As a Python module:

python -m local_faiss_mcp --index-dir /path/to/index/directory

3. For development/testing:

python local_faiss_mcp/server.py --index-dir /path/to/index/directory

Command-line Arguments:

  • --index-dir: Directory to store FAISS index and metadata files (default: current directory)
  • --embed: Hugging Face embedding model name (default: all-MiniLM-L6-v2)
  • --rerank: Enable re-ranking; optionally takes a cross-encoder model name (default: BAAI/bge-reranker-base)

Using a Custom Embedding Model:

# Use a larger, more accurate model
local-faiss-mcp --index-dir ./.vector_store --embed all-mpnet-base-v2

# Use a multilingual model
local-faiss-mcp --index-dir ./.vector_store --embed paraphrase-multilingual-MiniLM-L12-v2

# Use any Hugging Face sentence-transformers model
local-faiss-mcp --index-dir ./.vector_store --embed sentence-transformers/model-name

Using Re-ranking for Better Results:

Re-ranking uses a cross-encoder model to reorder FAISS results for improved relevance. This two-stage "retrieve and rerank" approach is common in production search systems.

# Enable re-ranking with default model (BAAI/bge-reranker-base)
local-faiss-mcp --index-dir ./.vector_store --rerank

# Use a specific re-ranking model
local-faiss-mcp --index-dir ./.vector_store --rerank cross-encoder/ms-marco-MiniLM-L-6-v2

# Combine custom embedding and re-ranking
local-faiss-mcp --index-dir ./.vector_store --embed all-mpnet-base-v2 --rerank BAAI/bge-reranker-base

How Re-ranking Works:

  1. FAISS retrieves top candidates (10x more than requested)
  2. Cross-encoder scores each candidate against the query
  3. Results are re-sorted by relevance score
  4. Top-k most relevant results are returned
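The four steps above can be sketched in plain Python. This is an illustrative sketch, not the server's actual implementation: `faiss_search` and `score_fn` are hypothetical stand-ins for the FAISS index lookup and the cross-encoder (e.g. BAAI/bge-reranker-base), which in practice would come from faiss and sentence-transformers.

```python
def retrieve_and_rerank(query, faiss_search, score_fn, top_k=3, oversample=10):
    """Two-stage 'retrieve and rerank' flow.

    faiss_search(query, n) -> n candidate chunks (stage 1, fast vector search).
    score_fn(query, chunk) -> relevance score (stage 2, cross-encoder in practice).
    """
    # Stage 1: retrieve 10x more candidates than requested
    candidates = faiss_search(query, top_k * oversample)
    # Stage 2: score each candidate against the query and re-sort by relevance
    ranked = sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)
    # Return only the top-k most relevant results
    return ranked[:top_k]

# Toy demonstration with stand-in retrieval and scoring functions
docs = ["faiss index", "faiss similarity search", "unrelated text", "search engine"]
fake_search = lambda q, n: docs[:n]
overlap_score = lambda q, c: len(set(q.split()) & set(c.split()))
print(retrieve_and_rerank("faiss search", fake_search, overlap_score, top_k=2))
# → ['faiss similarity search', 'faiss index']
```

The key point is the oversampling: the cheap vector search casts a wide net, and the expensive cross-encoder only scores that shortlist.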

Popular re-ranking models:

  • BAAI/bge-reranker-base - Good balance (default)
  • cross-encoder/ms-marco-MiniLM-L-6-v2 - Fast and efficient
  • cross-encoder/ms-marco-TinyBERT-L-2-v2 - Very fast, smaller model

The server will:

  • Create the index directory if it doesn't exist
  • Load existing FAISS index from {index-dir}/faiss.index (or create a new one)
  • Load document metadata from {index-dir}/metadata.json (or create new)
  • Listen for MCP tool calls via stdin/stdout
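The load-or-create startup behavior can be sketched as follows. This is a stdlib-only illustration of the metadata side; the real server also loads the binary FAISS index, and the `{"chunks": []}` schema here is an assumption, not the server's actual metadata format.

```python
import json
import os
import tempfile

def load_or_create_metadata(index_dir):
    """Ensure index_dir exists and return its metadata, creating the file if missing."""
    os.makedirs(index_dir, exist_ok=True)          # create the directory if needed
    path = os.path.join(index_dir, "metadata.json")
    if os.path.exists(path):                       # load existing metadata
        with open(path) as f:
            return json.load(f)
    metadata = {"chunks": []}                      # otherwise start a fresh store
    with open(path, "w") as f:
        json.dump(metadata, f)
    return metadata

# First call creates metadata.json; the second call loads the persisted copy
store_dir = tempfile.mkdtemp()
created = load_or_create_metadata(store_dir)
reloaded = load_or_create_metadata(store_dir)
print(created == reloaded)  # → True
```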

Available Tools

The server provides two tools for document management:

1. ingest_document

Ingest a document into the vector store.

Parameters:

  • document (required): The text content to ingest
  • source (optional): Identifier for the document source (default: "unknown")

Example:

{
  "document": "FAISS is a library for efficient similarity search...",
  "source": "faiss_docs.txt"
}

2. query_rag_store

Query the vector store for relevant document chunks.

Parameters:

  • query (required): The search query text
  • top_k (optional): Number of results to return (default: 3)

Example:

{
  "query": "How does FAISS perform similarity search?",
  "top_k": 5
}
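Conceptually, `query_rag_store` embeds the query and runs an exact L2 nearest-neighbor search over the stored chunk vectors (see IndexFlatL2 under Architecture). A minimal pure-Python sketch of that search, with toy 2-dimensional vectors standing in for real embeddings:

```python
def l2_search(query_vec, index_vecs, top_k=3):
    """Exact L2 (Euclidean) nearest-neighbor search, as IndexFlatL2 performs it."""
    def l2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    # Rank every stored vector by distance to the query and keep the closest top_k
    ranked = sorted(range(len(index_vecs)), key=lambda i: l2(query_vec, index_vecs[i]))
    return [(i, l2(query_vec, index_vecs[i])) for i in ranked[:top_k]]

# Toy store: three 2-d vectors; real embeddings have 384+ dimensions
vectors = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]]
print(l2_search([0.0, 0.1], vectors, top_k=2))
```

The returned `(index, distance)` pairs map back to chunk text and source via the stored metadata, which is how the tool can report a `distance` field per result.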

Available Prompts

The server provides MCP prompts to help extract answers and summarize information from retrieved documents:

1. extract-answer

Extract the most relevant answer from retrieved document chunks with proper citations.

Arguments:

  • query (required): The original user query or question
  • chunks (required): Retrieved document chunks as JSON array with fields: text, source, distance

Use Case: After querying the RAG store, use this prompt to get a well-formatted answer that cites sources and explains relevance.

Example workflow in Claude:

  1. Use query_rag_store tool to retrieve relevant chunks
  2. Use extract-answer prompt with the query and results
  3. Get a comprehensive answer with citations

2. summarize-documents

Create a focused summary from multiple document chunks.

Arguments:

  • topic (required): The topic or theme to summarize
  • chunks (required): Document chunks to summarize as JSON array
  • max_length (optional): Maximum summary length in words (default: 200)

Use Case: Synthesize information from multiple retrieved documents into a concise summary.

Example Usage:

In Claude Code, after retrieving documents with query_rag_store, you can use the prompts like:

Use the extract-answer prompt with:
- query: "What is FAISS?"
- chunks: [the JSON results from query_rag_store]

The prompts will guide the LLM to provide structured, citation-backed answers based on your vector store data.

Configuration with MCP Clients

Claude Code

Add this server to your Claude Code MCP configuration (.mcp.json):

User-wide configuration (~/.claude/.mcp.json):

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp"
    }
  }
}

With custom index directory:

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp",
      "args": [
        "--index-dir",
        "/home/user/vector_indexes/my_project"
      ]
    }
  }
}

With custom embedding model:

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp",
      "args": [
        "--index-dir",
        "./.vector_store",
        "--embed",
        "all-mpnet-base-v2"
      ]
    }
  }
}

With re-ranking enabled:

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp",
      "args": [
        "--index-dir",
        "./.vector_store",
        "--rerank"
      ]
    }
  }
}

Full configuration with embedding and re-ranking:

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp",
      "args": [
        "--index-dir",
        "./.vector_store",
        "--embed",
        "all-mpnet-base-v2",
        "--rerank",
        "BAAI/bge-reranker-base"
      ]
    }
  }
}

Project-specific configuration (./.mcp.json in your project):

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp",
      "args": [
        "--index-dir",
        "./.vector_store"
      ]
    }
  }
}

Alternative: Using Python module (if the command isn't in PATH):

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "python",
      "args": ["-m", "local_faiss_mcp", "--index-dir", "./.vector_store"]
    }
  }
}

Claude Desktop

Add this server to your Claude Desktop configuration:

{
  "mcpServers": {
    "local-faiss-mcp": {
      "command": "local-faiss-mcp",
      "args": ["--index-dir", "/path/to/index/directory"]
    }
  }
}

Architecture

  • Embedding Model: Configurable via --embed flag (default: all-MiniLM-L6-v2 with 384 dimensions)
    • Supports any Hugging Face sentence-transformers model
    • Automatically detects embedding dimensions
    • Model choice persisted with the index
  • Index Type: FAISS IndexFlatL2 for exact L2 distance search
  • Chunking: Documents are split into ~500 word chunks with 50 word overlap
  • Storage: Index saved as faiss.index, metadata saved as metadata.json
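The chunking strategy described above can be sketched as a sliding word window. The exact boundary handling in the server may differ; this illustrates the ~500-word chunks with 50 words shared between neighbors.

```python
def chunk_words(text, chunk_size=500, overlap=50):
    """Split text into word chunks of chunk_size, with `overlap` words shared between neighbors."""
    words = text.split()
    step = chunk_size - overlap                # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):   # final chunk reached
            break
    return chunks

# 1000 words -> windows starting at word 0, 450, 900 (the last one shorter)
demo = " ".join(f"w{i}" for i in range(1000))
print([len(c.split()) for c in chunk_words(demo)])  # → [500, 500, 100]
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, which keeps retrieval from missing answers that span two chunks.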

Choosing an Embedding Model

Different models offer different trade-offs:

| Model | Dimensions | Speed | Quality | Use Case |
|---|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | Fast | Good | Default, balanced performance |
| all-mpnet-base-v2 | 768 | Medium | Better | Higher quality embeddings |
| paraphrase-multilingual-MiniLM-L12-v2 | 384 | Fast | Good | Multilingual support |
| all-MiniLM-L12-v2 | 384 | Medium | Better | Better quality at same size |

Important: Once you create an index with a specific model, you must use the same model for subsequent runs. The server will detect dimension mismatches and warn you.
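The mismatch detection amounts to comparing the stored index's embedding dimension with the configured model's output dimension. A minimal sketch (the warning wording is illustrative, not the server's actual message):

```python
def check_dimensions(index_dim, model_dim, model_name):
    """Return a warning string when the model's embedding size differs from the stored index."""
    if index_dim != model_dim:
        return (f"Warning: index was built with {index_dim}-dim embeddings but "
                f"'{model_name}' produces {model_dim}-dim vectors; "
                f"re-create the index or switch back to the original model.")
    return None  # dimensions match, safe to proceed

# e.g. a 384-dim all-MiniLM-L6-v2 index queried with 768-dim all-mpnet-base-v2
print(check_dimensions(384, 768, "all-mpnet-base-v2"))
```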

Development

Standalone Test

Test the FAISS vector store functionality without MCP infrastructure:

source venv/bin/activate
python tests/test_standalone.py

This test:

  • Initializes the vector store
  • Ingests sample documents
  • Performs semantic search queries
  • Tests persistence and reload
  • Cleans up test files

Unit Tests

Run the complete test suite:

pytest tests/ -v

Run specific test files:

# Test embedding model functionality
pytest tests/test_embedding_models.py -v

# Run standalone integration test
python tests/test_standalone.py

The test suite includes:

  • test_embedding_models.py: Comprehensive tests for custom embedding models, dimension detection, and compatibility
  • test_standalone.py: End-to-end integration test without MCP infrastructure

License

MIT

Download files

Source Distribution

  • local_faiss_mcp-0.2.0rc3.tar.gz (14.1 kB)

Built Distribution

  • local_faiss_mcp-0.2.0rc3-py3-none-any.whl (11.6 kB)

File details

local_faiss_mcp-0.2.0rc3.tar.gz

  • Size: 14.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.4

Hashes:

  • SHA256: da119e2567fed11b1d7f9d6b53bd2a7d67fbd8a4a2f97799776e19fdbd93bd61
  • MD5: 5816fa18d4ee016fc557783abd9f16d0
  • BLAKE2b-256: 802c682f3a799e4ee0dbce4c8042775499cd42ee61a00d3572d6c3e0e4a9c786

File details

local_faiss_mcp-0.2.0rc3-py3-none-any.whl

Hashes:

  • SHA256: 11266576f4da3aba1fc50d15354623c4f02d87eb9cb53b0e9297750755ca96a7
  • MD5: f33320f38027901f271eb13947980f12
  • BLAKE2b-256: 55ac659d4cefe0d0d789b7c6dbb5821ef2decd36ee85a3d48ad9f225177e6445
