An MCP server for crawling documentation, writing it to Elasticsearch for vector search retrieval by an LLM.

Elasticsearch Knowledge Base MCP Server

Overview

This MCP server empowers your AI Assistant to ASK, LEARN, and REMEMBER:

  • ASK: Ask questions of the gathered knowledge bases in plain language, such as "What's the best way to use local_example in Ruby RSpec tests?"
  • LEARN: Obtain and index entire documentation sets (e.g., every word of every page of https://docs.pytest.org/en/stable/contents.html) from the web, Git repositories, or the local filesystem.
  • REMEMBER: Store working information, user preferences, and rules as "memories" for future reference.

This MCP Server is powered by Elasticsearch Serverless Search (a free trial is available) for inference and vector search, and by Elastic Crawler for crawling, parsing, and indexing.

Benefits

This MCP Server reduces the AI Assistant's token usage by letting it look up the specific documentation for the task at hand instead of relying on the model's internal knowledge. Because the assistant does not need to guess parameter names, types, or usage, it can often complete complex tasks in a single attempt. It can also consult documentation it was never trained on.

Demo

Searching Documentation

See how the AI Assistant can autonomously search documentation stored in a knowledge base to gather the details needed for a task:

https://github.com/user-attachments/assets/64b5fee1-a983-4a92-9485-bfc54f879374

Crawling Documentation

Watch how the AI Assistant can identify project dependencies and automatically crawl the relevant web documentation to build a knowledge base:

https://github.com/user-attachments/assets/c7226aa9-9b40-45fb-877b-8721550e0576

Configuration

To use this server, configure your MCP host (e.g., the Roo VS Code extension, Cline, or VS Code) with the connection details for the target Elasticsearch cluster: the host URL and authentication credentials (such as an API key). These settings go directly in the MCP host's settings file (e.g., mcp_settings.json for the Roo VS Code extension).

The recommended way to run this server is using uvx, which handles fetching and running the code directly from GitHub.

VS Code

  1. Open the command palette (Ctrl+Shift+P or Cmd+Shift+P).
  2. Type "Settings" and select "Preferences: Open User Settings (JSON)".
  3. Add the following MCP Server configuration:
{
    "mcp": {
        "inputs": [
            {
                "type": "promptString",
                "id": "es-host",
                "description": "Elasticsearch Host",
                "password": false
            },
            {
                "type": "promptString",
                "id": "es-api-key",
                "description": "Elasticsearch API Key",
                "password": true
            }
        ],
        "servers": {
            "es_knowledge_base_mcp": {
                "command": "uvx",
                "args": [
                    "git+https://github.com/strawgate/es-knowledge-base-mcp"
                ],
                "env": {
                    "ES_HOST": "${input:es-host}",
                    "ES_API_KEY": "${input:es-api-key}"
                }
            }
        }
    }
}
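
Before wiring the values into an MCP host, it can help to confirm that the host URL and API key are well formed. The following is a minimal sketch, not part of this project: the function name build_cluster_request is hypothetical, and it assumes the API key is supplied in the encoded form that Elasticsearch accepts in an "Authorization: ApiKey ..." header. It only builds the request, so nothing is sent until you pass the result to urllib.request.urlopen.

```python
import urllib.request


def build_cluster_request(env: dict) -> urllib.request.Request:
    """Build (but do not send) a GET request to the cluster root endpoint.

    `env` is a mapping holding ES_HOST and ES_API_KEY, for example
    dict(os.environ). This helper is illustrative only.
    """
    host = env.get("ES_HOST", "").rstrip("/")
    api_key = env.get("ES_API_KEY", "")
    if not host.startswith(("http://", "https://")):
        raise ValueError("ES_HOST must be a full URL, e.g. https://host:443")
    if not api_key:
        raise ValueError("ES_API_KEY is empty")
    # Elasticsearch accepts encoded API keys via the "ApiKey" scheme.
    return urllib.request.Request(
        host + "/",
        headers={"Authorization": f"ApiKey {api_key}"},
    )
```

To perform the actual check, pass the returned request to urllib.request.urlopen(request, timeout=10) and inspect the response status; a 401 would suggest the key was rejected.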

Cline / Roo Code

Add the following configuration block to your mcpServers object. Provide either ES_API_KEY or the ES_USERNAME/ES_PASSWORD pair, not both, and remove the // comment lines if your host requires strict JSON:

  "Knowledge Base": {
      "command": "uvx",
      "args": [
        "git+https://github.com/strawgate/es-knowledge-base-mcp"
      ],
      "env": {
        "ES_HOST": "https://YOUR_ELASTICSEARCH_HOST_URL:443",
        // --- Authentication: Provide EITHER API Key
        "ES_API_KEY": "YOUR_BASE64_ENCODED_API_KEY",
        // OR Username/Password
        "ES_USERNAME": "YOUR_ELASTICSEARCH_USERNAME",
        "ES_PASSWORD": "YOUR_ELASTICSEARCH_PASSWORD"
      },
      "alwaysAllow": [],
      "disabled": false
    }
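
Since only one authentication method should be supplied, one way to avoid hand-editing mistakes is to generate the entry. This is a small sketch under assumptions: build_server_entry is a hypothetical helper, not something this project ships, and treating "both methods supplied" as an error is a choice made here for illustration. The environment variable names and the uvx arguments are taken from the configuration above.

```python
import json


def build_server_entry(host, api_key=None, username=None, password=None):
    """Return an mcpServers entry that uses exactly one auth method."""
    has_key = api_key is not None
    has_basic = username is not None and password is not None
    if has_key == has_basic:
        # Either nothing usable was supplied, or both methods were.
        raise ValueError("Provide an API key or a username/password pair, not both")
    env = {"ES_HOST": host}
    if has_key:
        env["ES_API_KEY"] = api_key
    else:
        env["ES_USERNAME"] = username
        env["ES_PASSWORD"] = password
    return {
        "command": "uvx",
        "args": ["git+https://github.com/strawgate/es-knowledge-base-mcp"],
        "env": env,
        "alwaysAllow": [],
        "disabled": False,
    }


entry = build_server_entry(
    "https://YOUR_ELASTICSEARCH_HOST_URL:443",
    api_key="YOUR_BASE64_ENCODED_API_KEY",
)
# json.dumps emits strict JSON: no comments and no trailing commas.
print(json.dumps({"Knowledge Base": entry}, indent=2))
```

The printed object can be pasted into the mcpServers section of the host's settings file.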

Available Tools

The es_knowledge_base_mcp server provides the following tools:

Knowledge Base Management

  • knowledge_base_create: Create a new knowledge base.
  • knowledge_base_get: Get a list of all knowledge bases.
  • knowledge_base_get_by_backend_id: Get a knowledge base by its backend ID.
  • knowledge_base_get_by_name: Get a knowledge base by its name.
  • knowledge_base_delete_by_backend_id: Delete a knowledge base by its backend ID.
  • knowledge_base_delete_by_name: Delete a knowledge base by its name.
  • knowledge_base_update_by_backend_id: Update the metadata of an existing knowledge base by its backend ID.
  • knowledge_base_update_by_name: Update the description of an existing knowledge base by its name.
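
Normally the MCP host invokes these tools on the assistant's behalf, but it can be useful to see what such a call looks like on the wire. The sketch below builds a JSON-RPC "tools/call" request as defined by the Model Context Protocol. It assumes knowledge_base_get takes no arguments, which this page does not confirm; the argument schemas of the other tools are not documented here, so none are shown. The helper name tool_call_request is hypothetical.

```python
import json
from itertools import count

# Simple incrementing request IDs; JSON-RPC only requires them to be unique.
_request_ids = count(1)


def tool_call_request(tool_name, arguments=None):
    """Build an MCP tools/call request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


# Assumption: listing all knowledge bases needs no arguments.
print(json.dumps(tool_call_request("knowledge_base_get"), indent=2))
```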

Memory

  • memory_encodings: Encode multiple memories into the memory knowledge base.
  • memory_encoding: Encode a single memory into the memory knowledge base.
  • memory_recall: Search the memory knowledge base using questions.
  • memory_recall_last: Retrieve the most recent memories from the memory knowledge base.

Ask

  • ask_questions: Ask questions of the knowledge base.
  • ask_questions_for_kb: Ask questions of a specific knowledge base.

Learn

  • learn_extract_urls_from_webpage: Extracts all unique URLs from a given webpage.
  • learn_from_web_documentation: Starts a crawl job based on a seed page and creates a knowledge base entry for it.
  • learn_active_documentation_requests: Returns a list of active documentation requests.
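
To make the idea behind learn_extract_urls_from_webpage concrete, here is a minimal standard-library sketch of extracting unique URLs from a page. It is an illustration of the general technique only, not the server's implementation: the names LinkCollector and extract_unique_urls are hypothetical, and the choices to resolve relative links, drop fragments, and keep only http(s) links are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect unique absolute http(s) URLs from <a href> tags, in page order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []
        self._seen = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and strip "#fragment" so anchors on the
        # same page do not count as separate URLs.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in self._seen:
            self._seen.add(absolute)
            self.urls.append(absolute)


def extract_unique_urls(html, base_url):
    """Return the unique URLs linked from an HTML document."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.urls
```

A list produced this way is the kind of input an assistant could review to pick a seed page before starting a crawl with learn_from_web_documentation.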

Fetch

  • fetch_webpage: Fetches a webpage and converts it to Markdown format.

Bulk Operations

  • call_tool_bulk: Call a single tool multiple times in a single request.
  • call_tools_bulk: Call multiple tools in a single request.

Resources

  • kb://entry: Access the details (Title, Source, Description) of a specific knowledge base entry using its unique ID or assigned name.

Contributing

For details on local development, setup, and contributing to this project, please see the Contributing Guide.

Download files

Source Distribution

es_knowledge_base_mcp-0.1.0.tar.gz (29.9 kB, source)

  • SHA256: c6341eca876a9f9b11d91f311e9c418ce9131f43b6ab3fb25933361780bd859d
  • MD5: 2c0a6ad2a3df4209119ac8463b1678a8
  • BLAKE2b-256: 9c696be2fa69b3f60683bcc6a9a733fbaa62e09621adb44a6e719095c4a4c77f

Built Distribution

es_knowledge_base_mcp-0.1.0-py3-none-any.whl (37.9 kB, Python 3)

  • SHA256: 4a94e4332911326a28e80558fa488b48023e12929befb689c72326526766d296
  • MD5: aa884380dafa639b34c2a0916a3a8b02
  • BLAKE2b-256: 2eb617f44dcf2cf2ae209ec673bbb51a08db34c6a62d4b61c79a7d2d659c190a
