
MCP server for reading LlamaIndex documents stored in Qdrant vector database


qdrant-llamaindex-mcp-server: LlamaIndex-Compatible Qdrant MCP Server

The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. Whether you're building an AI-powered IDE, enhancing a chat interface, or creating custom AI workflows, MCP provides a standardized way to connect LLMs with the context they need.

This repository is a fork of qdrant/mcp-server-qdrant specifically designed to work with documents stored by LlamaIndex in Qdrant vector databases.

⚠️ Important Differences from Official Server

This fork has breaking changes compared to the official qdrant/mcp-server-qdrant:

  • 🔧 Many More Tools: Provides 10+ tools vs. the official server's basic find/store tools
  • 🎯 Dynamic Collection Selection: Collection names are specified at runtime by MCP clients, not hardcoded in configuration
  • 🤖 Dynamic Embedding Model Detection: Automatically detects and loads the correct embedding model for each collection
  • 📚 LlamaIndex Compatibility: Adapts to different content field names and metadata structures used by LlamaIndex
  • 🔒 Enhanced Security: Built-in embedding model whitelist to prevent accidental loading of large models

These changes make configurations incompatible with the official server. You cannot simply swap this server for the official one without updating your configuration and workflow.

Overview

A comprehensive Model Context Protocol server for working with documents stored by LlamaIndex in Qdrant vector databases. Unlike the original server which provides basic functionality with a fixed document structure, this version offers extensive tooling and automatically adapts to different payload formats used by LlamaIndex.

Key Features

  • LlamaIndex Compatibility: Automatically detects and adapts to different content field names (text, document, _node_content, etc.)
  • Dynamic Embedding Model Detection: Automatically detects and uses the correct embedding model for each collection based on its vector configuration
  • Embedding Model Whitelist: Built-in safety mechanism to prevent accidentally loading large models
  • Flexible Metadata Handling: Works with both flat and nested metadata structures
  • Optional Read-Only Mode: Set QDRANT_READ_ONLY=true to disable all write tools for safe querying of existing LlamaIndex-indexed data
  • Smart Content Detection: Automatically identifies the most likely content field when standard names aren't found

Tools

Read-Only Tools (Available even when QDRANT_READ_ONLY=true)

  1. qdrant-find - Search and retrieve documents stored by LlamaIndex in Qdrant

    • query (string): Semantic search query
    • collection_name (string): Name of the collection to search
    • Returns: Relevant documents with content and metadata
  2. qdrant-get-point - Get a specific point by its ID

    • point_id (string): The ID of the point to retrieve
    • collection_name (string): The collection to get the point from
    • Returns: Point information with content and metadata
  3. qdrant-get-collections - Get a list of all collections

    • Returns: Array of collection names in the Qdrant server
  4. qdrant-get-collection-details - Get detailed information about a collection

    • collection_name (string): The name of the collection
    • Returns: Collection configuration, statistics, and status
  5. qdrant-get-collection-count - Get the number of points in a collection

    • collection_name (string): The name of the collection
    • Returns: Number of points in the collection
  6. qdrant-peek-collection - Preview sample points from a collection

    • collection_name (string): The name of the collection
    • limit (int, optional): Maximum number of points to return (default: 10)
    • Returns: Sample points from the collection
  7. qdrant-get-documents - Retrieve multiple documents by their IDs

    • point_ids (array of strings): List of point IDs to retrieve
    • collection_name (string): The collection to get documents from
    • Returns: Array of found documents
  8. qdrant-search-by-vector - Search using a raw vector instead of text query

    • vector (array of floats): The query vector to search with
    • collection_name (string): The collection to search in
    • limit (int, optional): Maximum number of results to return (default: 10)
    • Returns: Relevant documents based on vector similarity
  9. qdrant-list-document-ids - List document IDs with pagination

    • collection_name (string): The collection to list IDs from
    • limit (int, optional): Maximum number of IDs to return (default: 100)
    • offset (int, optional): Number of IDs to skip for pagination (default: 0)
    • Returns: Array of document IDs
  10. qdrant-scroll-points - Paginated retrieval of points using scroll

    • collection_name (string): The collection to scroll through
    • limit (int, optional): Maximum number of points to return (default: 10)
    • offset (int, optional): Offset for pagination
    • Returns: Points with pagination info
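The limit/offset parameters of qdrant-list-document-ids support a simple client-side pagination loop. In this sketch, call_tool is a hypothetical stand-in for your MCP client's tool-invocation call; the loop logic is the point:

```python
# Client-side pagination over qdrant-list-document-ids.
# `call_tool` is a hypothetical stand-in for an MCP client call.

def list_all_ids(call_tool, collection: str, page_size: int = 100) -> list[str]:
    """Collect every document ID by paging with limit/offset."""
    ids: list[str] = []
    offset = 0
    while True:
        page = call_tool("qdrant-list-document-ids",
                         collection_name=collection,
                         limit=page_size, offset=offset)
        ids.extend(page)
        if len(page) < page_size:  # short page -> no more results
            break
        offset += page_size
    return ids

# Fake client returning 250 IDs, just to exercise the loop:
all_ids = [f"doc-{i}" for i in range(250)]
fake = lambda name, collection_name, limit, offset: all_ids[offset:offset + limit]
print(len(list_all_ids(fake, "my-collection")))  # -> 250
```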

Write Tools (Available when QDRANT_READ_ONLY=false)

When read-only mode is disabled, additional tools become available for modifying data:

  • qdrant-store - Store new documents in Qdrant
  • qdrant-delete-point - Delete a specific point by ID
  • qdrant-update-point-payload - Update point metadata
  • qdrant-create-collection - Create new collections
  • qdrant-delete-collection - Delete entire collections
  • qdrant-add-documents - Batch add multiple documents
  • qdrant-delete-documents - Batch delete multiple documents

Environment Variables

The server is configured using environment variables:

  • QDRANT_URL: URL of the Qdrant server (default: none)
  • QDRANT_API_KEY: API key for the Qdrant server (default: none)
  • COLLECTION_NAME: Deprecated; collection names are now specified dynamically by MCP clients at runtime (default: none)
  • QDRANT_READ_ONLY: Enable read-only mode, which disables write tools for safety (default: false)
  • QDRANT_LOCAL_PATH: Path to the local Qdrant database, an alternative to QDRANT_URL (default: none)
  • EMBEDDING_PROVIDER: Embedding provider to use; currently only "fastembed" is supported (default: fastembed)
  • EMBEDDING_MODEL: Default embedding model, used as a fallback when auto-detection fails or the detected model is not in the whitelist (default: sentence-transformers/all-MiniLM-L6-v2)
  • EMBEDDING_ALLOWED_MODELS: JSON array of embedding models allowed for dynamic loading (default: ["sentence-transformers/all-MiniLM-L6-v2", "BAAI/bge-small-en-v1.5", "snowflake/snowflake-arctic-embed-xs", "jinaai/jina-embeddings-v2-small-en"])
  • TOOL_FIND_DESCRIPTION: Custom description for the find tool (default: see settings.py)

Note: You cannot provide both QDRANT_URL and QDRANT_LOCAL_PATH at the same time.

[!IMPORTANT] Command-line arguments are no longer supported (with the exception of the --transport flag described below). Please use environment variables for all other configuration.

FastMCP Environment Variables

Since qdrant-llamaindex-mcp-server is based on FastMCP, it also supports all the FastMCP environment variables. The most important ones are listed below:

  • FASTMCP_DEBUG: Enable debug mode (default: false)
  • FASTMCP_LOG_LEVEL: Logging level: DEBUG, INFO, WARNING, ERROR, or CRITICAL (default: INFO)
  • FASTMCP_HOST: Host address to bind the server to (default: 127.0.0.1)
  • FASTMCP_PORT: Port to run the server on (default: 8000)
  • FASTMCP_WARN_ON_DUPLICATE_RESOURCES: Show warnings for duplicate resources (default: true)
  • FASTMCP_WARN_ON_DUPLICATE_TOOLS: Show warnings for duplicate tools (default: true)
  • FASTMCP_WARN_ON_DUPLICATE_PROMPTS: Show warnings for duplicate prompts (default: true)
  • FASTMCP_DEPENDENCIES: List of dependencies to install in the server environment (default: [])

Dynamic Embedding Model Detection

This server automatically detects which embedding model was used for each collection and uses the appropriate model for queries. This is especially useful when you have multiple collections created with different embedding models.

How It Works

  1. Collection Creation: When LlamaIndex creates a collection, the full model name is stored as the vector name (e.g., "BAAI/bge-small-en-v1.5")
  2. Query Time: When searching a collection, the server:
    • Inspects the collection's vector configuration
    • Extracts the model name from the vector name
    • Loads the appropriate embedding model (with caching for performance)
    • Uses that model to embed the query
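The query-time steps above can be sketched as follows. The shape of the vector config and the cache are simplified assumptions, and load_embedder is a hypothetical stand-in for FastEmbed model loading:

```python
# Sketch of query-time model detection: the vector name stored by
# LlamaIndex *is* the model name, so it can be read from the
# collection's vector config, and one embedder cached per model.
# `load_embedder` is a hypothetical stand-in for FastEmbed loading.

_embedder_cache: dict[str, object] = {}

def model_for_collection(vectors_config: dict) -> str:
    """Extract the embedding model name from a collection's vector config."""
    # Named-vector configs map vector name -> params; LlamaIndex uses
    # the full model name (e.g. "BAAI/bge-small-en-v1.5") as that name.
    return next(iter(vectors_config))

def embedder_for_collection(vectors_config: dict, load_embedder) -> object:
    model = model_for_collection(vectors_config)
    if model not in _embedder_cache:  # load once, then reuse
        _embedder_cache[model] = load_embedder(model)
    return _embedder_cache[model]

config = {"BAAI/bge-small-en-v1.5": {"size": 384, "distance": "Cosine"}}
e1 = embedder_for_collection(config, load_embedder=lambda m: f"embedder:{m}")
e2 = embedder_for_collection(config, load_embedder=lambda m: f"embedder:{m}")
print(e1, e1 is e2)  # second call hits the cache: same object
```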

Embedding Model Whitelist

For security and resource management, the server includes a built-in whitelist of allowed embedding models. By default, only small, efficient models are permitted:

  • sentence-transformers/all-MiniLM-L6-v2 (384 dims, ~90MB)
  • BAAI/bge-small-en-v1.5 (384 dims, ~67MB)
  • snowflake/snowflake-arctic-embed-xs (384 dims, ~90MB)
  • jinaai/jina-embeddings-v2-small-en (512 dims, ~120MB)

Customizing the Whitelist

Using Environment Variables

# Allow only specific models
export EMBEDDING_ALLOWED_MODELS='["sentence-transformers/all-MiniLM-L6-v2", "BAAI/bge-small-en-v1.5"]'

# Allow all models (removes safety protection)
export EMBEDDING_ALLOWED_MODELS='null'

In Claude Desktop Config

{
  "mcpServers": {
    "qdrant": {
      "command": "uvx",
      "args": ["qdrant-llamaindex-mcp-server"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "COLLECTION_NAME": "your-collection",
        "EMBEDDING_ALLOWED_MODELS": "[\"sentence-transformers/all-MiniLM-L6-v2\", \"BAAI/bge-small-en-v1.5\"]"
      }
    }
  }
}

Behavior with Blocked Models

When the server encounters a collection using a model not in the whitelist:

  • ⚠️ Logs a warning message
  • 🔄 Falls back to the default configured model (EMBEDDING_MODEL)
  • ✅ Continues operating normally

This ensures your system remains stable while preventing accidental downloads of large models.

Installation

Using uvx (Recommended)

From PyPI

QDRANT_URL="http://localhost:6333" \
uvx qdrant-llamaindex-mcp-server

From GitHub Repository (Development)

QDRANT_URL="http://localhost:6333" \
uvx --from git+https://github.com/azhang/qdrant-llamaindex-mcp-server.git qdrant-llamaindex-mcp-server

From Local Directory (Development)

# Clone and run locally
git clone https://github.com/azhang/qdrant-llamaindex-mcp-server.git
cd qdrant-llamaindex-mcp-server

QDRANT_URL="http://localhost:6333" \
uvx --from . qdrant-llamaindex-mcp-server

Transport Protocols

The server supports different transport protocols that can be specified using the --transport flag:

QDRANT_URL="http://localhost:6333" \
uvx qdrant-llamaindex-mcp-server --transport sse

Supported transport protocols:

  • stdio (default): standard input/output transport; can only be used by local MCP clients
  • sse: Server-Sent Events transport, well suited to remote clients
  • streamable-http: streamable HTTP transport, also suited to remote clients and more recent than SSE

When the SSE transport is used, the server listens on the specified port for incoming connections. The default port is 8000; it can be changed using the FASTMCP_PORT environment variable.

QDRANT_URL="http://localhost:6333" \
FASTMCP_PORT=1234 \
uvx qdrant-llamaindex-mcp-server --transport sse

Using Docker

A Dockerfile is available for building and running the MCP server:

# Build the container
docker build -t mcp-server-qdrant .

# Run the container
docker run -p 8000:8000 \
  -e FASTMCP_HOST="0.0.0.0" \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e QDRANT_API_KEY="your-api-key" \
  -e COLLECTION_NAME="your-collection" \
  mcp-server-qdrant

[!TIP] Please note that we set FASTMCP_HOST="0.0.0.0" to make the server listen on all network interfaces. This is necessary when running the server in a Docker container.

Installing via Smithery

To install Qdrant MCP Server for Claude Desktop automatically via Smithery:

npx @smithery/cli install mcp-server-qdrant --client claude

Manual configuration of Claude Desktop

To use this server with the Claude Desktop app, add the following configuration to the "mcpServers" section of your claude_desktop_config.json:

{
  "qdrant": {
    "command": "uvx",
    "args": ["qdrant-llamaindex-mcp-server"],
    "env": {
      "QDRANT_URL": "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333",
      "QDRANT_API_KEY": "your_api_key",
      "QDRANT_READ_ONLY": "true"
    }
  }
}

For local Qdrant mode:

{
  "qdrant": {
    "command": "uvx",
    "args": ["qdrant-llamaindex-mcp-server"],
    "env": {
      "QDRANT_LOCAL_PATH": "/path/to/qdrant/database",
      "QDRANT_READ_ONLY": "true"
    }
  }
}

[!NOTE] Collection Names: Collection names are now specified dynamically when using the tools (e.g., when calling qdrant-find, you specify which collection to search). This provides more flexibility than the previous approach of hardcoding a single collection name.

By default, the server uses the sentence-transformers/all-MiniLM-L6-v2 embedding model to encode documents. For the time being, only FastEmbed models are supported.

Support for other tools

This MCP server can be used with any MCP-compatible client. For example, you can use it with Cursor and VS Code, which provide built-in support for the Model Context Protocol.

Using with Cursor/Windsurf

You can configure this MCP server to work as a code search tool for Cursor or Windsurf by customizing the tool descriptions:

QDRANT_URL="http://localhost:6333" \
TOOL_STORE_DESCRIPTION="Store reusable code snippets for later retrieval. \
The 'information' parameter should contain a natural language description of what the code does, \
while the actual code should be included in the 'metadata' parameter as a 'code' property. \
The value of 'metadata' is a Python dictionary with strings as keys. \
Use this whenever you generate some code snippet." \
TOOL_FIND_DESCRIPTION="Search for relevant code snippets based on natural language descriptions. \
The 'query' parameter should describe what you're looking for, \
and the tool will return the most relevant code snippets. \
Use this when you need to find existing code snippets for reuse or reference." \
uvx qdrant-llamaindex-mcp-server --transport sse # Enable SSE transport

In Cursor/Windsurf, you can then configure the MCP server in your settings by pointing to this running server over the SSE transport protocol. Instructions for adding an MCP server to Cursor can be found in the Cursor documentation. If you are running Cursor/Windsurf locally, you can use the following URL:

http://localhost:8000/sse

[!TIP] We suggest SSE transport as a preferred way to connect Cursor/Windsurf to the MCP server, as it can support remote connections. That makes it easy to share the server with your team or use it in a cloud environment.

This configuration transforms the Qdrant MCP server into a specialized code search tool that can:

  1. Store code snippets, documentation, and implementation details
  2. Retrieve relevant code examples based on semantic search
  3. Help developers find specific implementations or usage patterns

You can populate the database by storing natural language descriptions of code snippets (in the information parameter) along with the actual code (in the metadata.code property), and then search for them using natural language queries that describe what you're looking for.
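The store-then-find round trip can be sketched with a hypothetical call_tool helper standing in for an MCP client; the collection name "code-snippets" is an assumption for illustration:

```python
# Sketch of the code-search workflow. `call_tool` and the collection
# name "code-snippets" are illustrative assumptions.

def store_snippet(call_tool, description: str, code: str) -> None:
    """Store the description as searchable text and the code as metadata."""
    call_tool("qdrant-store",
              collection_name="code-snippets",
              information=description,
              metadata={"code": code})

def find_snippets(call_tool, need: str):
    """Search stored snippets by a natural-language description."""
    return call_tool("qdrant-find",
                     collection_name="code-snippets",
                     query=need)

# Minimal in-memory fake (keyword match instead of semantic search),
# just to show the round trip:
_db = []
def fake(name, **kw):
    if name == "qdrant-store":
        _db.append({"information": kw["information"], "metadata": kw["metadata"]})
    else:
        return [d for d in _db if any(w in d["information"] for w in kw["query"].split())]

store_snippet(fake, "retry decorator with exponential backoff", "def retry(...): ...")
print(find_snippets(fake, "exponential backoff")[0]["metadata"]["code"])
```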

[!NOTE] The tool descriptions provided above are examples and may need to be customized for your specific use case. Consider adjusting the descriptions to better match your team's workflow and the specific types of code snippets you want to store and retrieve.

If you have successfully installed mcp-server-qdrant but still can't get it to work with Cursor, consider creating Cursor rules so the MCP tools are always used when the agent produces a new code snippet. You can restrict the rules to certain file types, to avoid using the MCP server for documentation or other kinds of content.

Using with Claude Code

You can enhance Claude Code's capabilities by connecting it to this MCP server, enabling semantic search over your existing codebase.

Setting up qdrant-llamaindex-mcp-server

  1. Add the MCP server to Claude Code:

    # Add qdrant-llamaindex-mcp-server configured for code search
    claude mcp add code-search \
    -e QDRANT_URL="http://localhost:6333" \
    -e QDRANT_READ_ONLY="true" \
    -e TOOL_STORE_DESCRIPTION="Store code snippets with descriptions. The 'information' parameter should contain a natural language description of what the code does, while the actual code should be included in the 'metadata' parameter as a 'code' property." \
    -e TOOL_FIND_DESCRIPTION="Search for relevant code snippets using natural language. The 'query' parameter should describe the functionality you're looking for." \
    -- uvx qdrant-llamaindex-mcp-server
    
  2. Verify the server was added:

    claude mcp list
    

Using Semantic Code Search in Claude Code

Tool descriptions, specified via TOOL_STORE_DESCRIPTION and TOOL_FIND_DESCRIPTION, guide Claude Code in how to use the MCP server. The ones provided above are examples and may need to be customized for your specific use case. Out of the box, however, Claude Code should already be able to:

  1. Use the qdrant-store tool to store code snippets with descriptions.
  2. Use the qdrant-find tool to search for relevant code snippets using natural language.

Run MCP server in Development Mode

The MCP server can be run in development mode using the fastmcp dev command. This starts the server and opens the MCP Inspector in your browser.

fastmcp dev src/mcp_server_qdrant/server.py

Using with VS Code

For one-click installation, use the install buttons in the project repository (available for UVX and Docker, in both VS Code and VS Code Insiders).

Manual Installation

Add the following JSON block to your User Settings (JSON) file in VS Code. You can do this by pressing Ctrl + Shift + P and typing Preferences: Open User Settings (JSON).

{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "qdrantUrl",
        "description": "Qdrant URL"
      },
      {
        "type": "promptString",
        "id": "qdrantApiKey",
        "description": "Qdrant API Key",
        "password": true
      },
      {
        "type": "promptString",
        "id": "collectionName",
        "description": "Collection Name"
      }
    ],
    "servers": {
      "qdrant": {
        "command": "uvx",
        "args": ["qdrant-llamaindex-mcp-server"],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}"
        }
      }
    }
  }
}

Or if you prefer using Docker, add this configuration instead:

{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "qdrantUrl",
        "description": "Qdrant URL"
      },
      {
        "type": "promptString",
        "id": "qdrantApiKey",
        "description": "Qdrant API Key",
        "password": true
      },
      {
        "type": "promptString",
        "id": "collectionName",
        "description": "Collection Name"
      }
    ],
    "servers": {
      "qdrant": {
        "command": "docker",
        "args": [
          "run",
          "-p", "8000:8000",
          "-i",
          "--rm",
          "-e", "QDRANT_URL",
          "-e", "QDRANT_API_KEY",
          "-e", "COLLECTION_NAME",
          "qdrant-llamaindex-mcp-server"
        ],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}"
        }
      }
    }
  }
}

Alternatively, you can create a .vscode/mcp.json file in your workspace with the following content:

{
  "inputs": [
    {
      "type": "promptString",
      "id": "qdrantUrl",
      "description": "Qdrant URL"
    },
    {
      "type": "promptString",
      "id": "qdrantApiKey",
      "description": "Qdrant API Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "collectionName",
      "description": "Collection Name"
    }
  ],
  "servers": {
    "qdrant": {
      "command": "uvx",
      "args": ["qdrant-llamaindex-mcp-server"],
      "env": {
        "QDRANT_URL": "${input:qdrantUrl}",
        "QDRANT_API_KEY": "${input:qdrantApiKey}",
        "COLLECTION_NAME": "${input:collectionName}"
      }
    }
  }
}

For workspace configuration with Docker, use this in .vscode/mcp.json:

{
  "inputs": [
    {
      "type": "promptString",
      "id": "qdrantUrl",
      "description": "Qdrant URL"
    },
    {
      "type": "promptString",
      "id": "qdrantApiKey",
      "description": "Qdrant API Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "collectionName",
      "description": "Collection Name"
    }
  ],
  "servers": {
    "qdrant": {
      "command": "docker",
      "args": [
        "run",
        "-p", "8000:8000",
        "-i",
        "--rm",
        "-e", "QDRANT_URL",
        "-e", "QDRANT_API_KEY",
        "-e", "COLLECTION_NAME",
        "qdrant-llamaindex-mcp-server"
      ],
      "env": {
        "QDRANT_URL": "${input:qdrantUrl}",
        "QDRANT_API_KEY": "${input:qdrantApiKey}",
        "COLLECTION_NAME": "${input:collectionName}"
      }
    }
  }
}

Contributing

If you have suggestions for how qdrant-llamaindex-mcp-server could be improved, or want to report a bug, open an issue! Any and all contributions are welcome.

Testing qdrant-llamaindex-mcp-server locally

The MCP inspector is a developer tool for testing and debugging MCP servers. It runs both a client UI (default port 5173) and an MCP proxy server (default port 3000). Open the client UI in your browser to use the inspector.

QDRANT_URL=":memory:" \
fastmcp dev src/mcp_server_qdrant/server.py

Once started, open your browser to http://localhost:5173 to access the inspector interface.

License

This MCP server is licensed under the Apache License 2.0. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the Apache License 2.0. For more details, please see the LICENSE file in the project repository.
