Skip to main content

Knowledge Base MCP

Project description

Knowledge Base MCP

An MCP Server for creating Knowledge Bases and searching them!

100% Local, Fast, and Easy to Use

The MCP Server uses DuckDB by default which runs in-process and persists to disk. Everything from the vector store, to the embeddings, to the reranker, crawler, etc is all local. It does not require any cloud services or external dependencies.

Even more important, the Knowledge Base MCP Server is very fast at indexing, embedding, searching, and reranking compared to other solutions.

Quick Start

Just add the following to your MCP Server configuration!

"Knowledge Base": {
    "command": "uvx",
    "args": [
        "knowledge_base_mcp",
        "duckdb",
        "persistent",
        "run"
    ]
}

Features

Search Features

  • Knowledge Base Management: Organize your data into knowledge bases.
  • Semantic Search: Query across one or more knowledge bases using semantic search.
  • Multiple Vector Store Backends: Works with DuckDB and Elasticsearch.

Search results are automatically reranked using a reranker model to try to achieve the highest quality results.

Document Ingestion Features

  • Website Ingestion: Crawl websites into a knowledge base.
  • Git Repository Ingestion: Load and ingest a Git repository into a knowledge base.
  • Directory Ingestion: Load and ingest a directory into a knowledge base.

The Knowledge Base MCP Server uses advanced hierarchical + semantic chunking techniques to create embeddings for your data.

Usage

Command-Line Interface

The CLI supports both DuckDB (in-memory or persistent) and Elasticsearch as vector store backends.

DuckDB (Persistent)

To run the server with a persistent DuckDB store, which will save your knowledge bases to disk:

uv run knowledge_base_mcp duckdb persistent --docs-db-dir ./storage --docs-db-name documents.duckdb --code-db-dir ./storage --code-db-name code.duckdb run
  • --docs-db-dir: Directory to store document knowledge base. Defaults to ./storage.
  • --docs-db-name: Filename for the document knowledge base. Defaults to documents.duckdb.
  • --code-db-dir: Directory to store code knowledge base. Defaults to ./storage.
  • --code-db-name: Filename for the code knowledge base. Defaults to code.duckdb.

DuckDB (In-Memory)

To run the server with an in-memory DuckDB store (data will not be persisted after the server stops):

uv run knowledge_base_mcp duckdb memory run

Elasticsearch

To run the server with Elasticsearch as the backend, ensure you have an Elasticsearch instance running and accessible.

uv run knowledge_base_mcp elasticsearch --url http://localhost:9200 --index-docs-vectors kbmcp-docs-vectors --index-docs-kv kbmcp-docs-kv --index-code-vectors kbmcp-code-vectors --index-code-kv kbmcp-code-kv run

You can also set Elasticsearch options via environment variables:

  • ES_URL: Elasticsearch instance URL (e.g., http://localhost:9200).
  • ES_INDEX_DOCS_VECTORS: Index name for document vectors. Defaults to kbmcp-docs-vectors.
  • ES_INDEX_DOCS_KV: Index name for document key-value store. Defaults to kbmcp-docs-kv.
  • ES_INDEX_CODE_VECTORS: Index name for code vectors. Defaults to kbmcp-code-vectors.
  • ES_INDEX_CODE_KV: Index name for code key-value store. Defaults to kbmcp-code-kv.
  • ES_USERNAME: Username for Elasticsearch authentication.
  • ES_PASSWORD: Password for Elasticsearch authentication.
  • ES_API_KEY: API Key for Elasticsearch authentication.

Main Server Tools/Endpoints

When running, the MCP server exposes the following tools:

Ingestion Tools (from IngestServer)

  • load_website: Create a new knowledge base from a website by crawling seed URLs. If the knowledge base already exists, it will be replaced.
    • Parameters: knowledge_base (name for the new KB), seed_urls (list of URLs to start crawling), url_exclusions (optional list of URLs to exclude), max_pages (optional maximum number of pages to crawl).
  • load_directory: Create a new knowledge base from files within a local directory.
    • Parameters: knowledge_base (name for the new KB), path (path to the directory), exclude (optional file path globs to exclude), extensions (optional list of file extensions to include, defaults to Markdown and AsciiDoc), recursive (optional, whether to recursively gather files, defaults to True).
  • load_git_repository: Create a new knowledge base from a Git repository by cloning it and ingesting its contents.
    • Parameters: knowledge_base (name for the new KB), repository_url (URL of the Git repository), branch (branch to clone), path (path within the repository to ingest), exclude (optional file path globs to exclude), extensions (optional list of file extensions to include, defaults to Markdown and AsciiDoc).

Search Tools (from KnowledgeBaseSearchServer)

  • query: Search across one or more knowledge bases using a natural language question.
    • Parameters: query (the plain language query string), knowledge_bases (optional list of knowledge base names to restrict the search to; searches all if not provided).
  • get_document: Retrieve a specific document from a knowledge base by its title.
    • Parameters: knowledge_base (the name of the knowledge base), title (the title of the document).

Management Tools (from KnowledgeBaseManagementServer)

  • get_knowledge_bases: List all available knowledge bases and their document counts.
  • delete_knowledge_base: Remove a specific knowledge base by its name.
    • Parameters: knowledge_base (the name of the knowledge base to delete).
  • delete_all_knowledge_bases: Remove all knowledge bases from the vector store.
  • get_knowledge_base_stats: Get detailed statistics for a specific knowledge base.
    • Parameters: knowledge_base (the name of the knowledge base).

VS Code McpServer Usage

  1. Open the command palette (Ctrl+Shift+P or Cmd+Shift+P).
  2. Type "Settings" and select "Preferences: Open User Settings (JSON)".
  3. Add the following MCP Server configuration:
{
    "mcp": {
        "servers": {
            "Knowledge Base": {
                "command": "uvx",
                "args": [
                    "knowledge_base_mcp",
                    "duckdb",
                    "persistent",
                    "run"
                ]
            }
        }
    }
}

Roo Code

{
    "mcpServers": {
      "Knowledge Base": {
          "command": "uvx",
          "args": [
              "knowledge_base_mcp",
              "duckdb",
              "persistent",
              "run"
          ]
      }
    }
}

License

See LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

knowledge_base_mcp-0.6.3.tar.gz (1.5 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

knowledge_base_mcp-0.6.3-py3-none-any.whl (81.8 kB view details)

Uploaded Python 3

File details

Details for the file knowledge_base_mcp-0.6.3.tar.gz.

File metadata

  • Download URL: knowledge_base_mcp-0.6.3.tar.gz
  • Upload date:
  • Size: 1.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.8.4

File hashes

Hashes for knowledge_base_mcp-0.6.3.tar.gz
Algorithm Hash digest
SHA256 d22866ccbe6aab50da2a8deb8d8ad40de00f2007485613919d7bb105b36fe41f
MD5 645e2cf092c9a67a009ff432f19513dc
BLAKE2b-256 73f80f9684a333e1eb4e9ead628f0865c11afcca6ae5bf9cbf8aa6a6947709fe

See more details on using hashes here.

File details

Details for the file knowledge_base_mcp-0.6.3-py3-none-any.whl.

File metadata

File hashes

Hashes for knowledge_base_mcp-0.6.3-py3-none-any.whl
Algorithm Hash digest
SHA256 7ded5caefe5e0742a7143a20fdb7ee127d17ae07d26c7ff707fe70d1d1360730
MD5 40c433c76eb7ab014b04a02e7934bd79
BLAKE2b-256 6677fb665380641630f6842da61541b55d747983ab972263f858b154b99fcaa6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page