Skip to main content

Add your description here

Project description

MCP Server: Elasticsearch semantic search tool

Demo repo for: https://j.blaszyk.me/tech-blog/mcp-server-elasticsearch-semantic-search/

Table of Contents


Overview

This repository provides a Python implementation of an MCP server for semantic search through Search Labs blog posts indexed in Elasticsearch.

It assumes you've crawled the blog posts and stored them in the search-labs-posts index using Elastic Open Crawler.


Running the MCP Server

Add ES_URL and ES_AP_KEY into .env file, (take a look here for generating api key with minimum permissions)

Start the server in MCP Inspector:

make dev

Once running, access the MCP Inspector at: http://localhost:5173


Integrating with Claude Desktop

To add the MCP server to Claude Desktop:

make install-claude-config

This updates claude_desktop_config.json in your home directory. On the next restart, the Claude app will detect the server and load the declared tool.


Crawling Search Labs Blog Posts

1. Verify Crawler Setup

To check if the Elastic Open Crawler works, run:

docker run --rm \
  --entrypoint /bin/bash \
  -v "$(pwd)/crawler-config:/app/config" \
  --network host \
  docker.elastic.co/integrations/crawler:latest \
  -c "bin/crawler crawl config/test-crawler.yml"

This should print crawled content from a single page.


2. Configure Elasticsearch

Set up Elasticsearch URL and API Key.

Generate an API key with minimum crawler permissions:

POST /_security/api_key
{
  "name": "crawler-search-labs",
  "role_descriptors": {
    "crawler-search-labs-role": {
      "cluster": ["monitor"],
      "indices": [
        {
          "names": ["search-labs-posts"],
          "privileges": ["all"]
        }
      ]
    }
  },
  "metadata": {
    "application": "crawler"
  }
}

Copy the encoded value from the response and set it as API_KEY.


3. Update Index Mapping for Semantic Search

Ensure the search-labs-posts index exists. If not, create it:

PUT search-labs-posts

Update the mapping to enable semantic search:

PUT search-labs-posts/_mappings
{
  "properties": {
    "body": {
      "type": "text",
      "copy_to": "semantic_body"
    },
    "semantic_body": {
      "type": "semantic_text",
      "inference_id": ".elser-2-elasticsearch"
    }
  }
}

The body field is indexed as semantic text using Elasticsearch’s ELSER model.


4. Start Crawling

Run the crawler to populate the index:

docker run --rm \
  --entrypoint /bin/bash \
  -v "$(pwd)/crawler-config:/app/config" \
  --network host \
  docker.elastic.co/integrations/crawler:latest \
  -c "bin/crawler crawl config/elastic-search-labs-crawler.yml"

[!TIP] If using a fresh Elasticsearch cluster, wait for the ELSER model to start before indexing.


5. Verify Indexed Documents

Check if the documents were indexed:

GET search-labs-posts/_count

This will return the total document count in the index. You can also verify in Kibana.


Done! You can now perform semantic searches on Search Labs blog posts

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file mseep_elastic_semantic_search_mcp_server-0.1.0.tar.gz.

File metadata

File hashes

Hashes for mseep_elastic_semantic_search_mcp_server-0.1.0.tar.gz
Algorithm Hash digest
SHA256 d9fdd5cf34146b62d1b9085579985e362d522a1b0e1d34032087ea75c2e31804
MD5 c2395965c4aed3752c26aa400ab493b5
BLAKE2b-256 acc3d15944f13f1e9a8bdf35e0c85134a521290160a3eed7ee26241dfcbd8ff7

See more details on using hashes here.

File details

Details for the file mseep_elastic_semantic_search_mcp_server-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for mseep_elastic_semantic_search_mcp_server-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d865baa8cbad7d0ffebdb2f11fb4119fd5349c5f34fc5e84f4a399fd04136cf1
MD5 b52c50ae428cc344d636694ddf6aa96e
BLAKE2b-256 196bd7c3b641f2e7e84cc647497e2d524bde1b4eca15b1689a21fcc177f9dbc8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page