Skip to main content

Knowledge Base MCP

Project description

Knowledge Base MCP

An MCP Server for creating Knowledge Bases and searching them!

100% Local, Fast, and Easy to Use

The MCP Server uses DuckDB by default which runs in-process and persists to disk. Everything from the vector store, to the embeddings, to the reranker, crawler, etc is all local. It does not require any cloud services or external dependencies.

Even more important, the Knowledge Base MCP Server is very fast at indexing, embedding, searching, and reranking compared to other solutions.

Quick Start

Just add the following to your MCP Server configuration!

"Knowledge Base": {
    "command": "uvx",
    "args": [
        "knowledge_base_mcp",
        "duckdb",
        "persistent",
        "run"
    ]
}

Features

Search Features

  • Knowledge Base Management: Organize your data into knowledge bases.
  • Semantic Search: Query across one or more knowledge bases using semantic search.
  • Multiple Vector Store Backends: Works with DuckDB and Elasticsearch.

Search results are automatically reranked using a reranker model to try to achieve the highest quality results.

Document Ingestion Features

  • Website Ingestion: Crawl websites into a knowledge base.
  • Git Repository Ingestion: Load and ingest a Git repository into a knowledge base.
  • Directory Ingestion: Load and ingest a directory into a knowledge base.

The Knowledge Base MCP Server uses advanced hierarchical + semantic chunking techniques to create embeddings for your data.

Usage

Command-Line Interface

The CLI supports both DuckDB (in-memory or persistent) and Elasticsearch as vector store backends.

DuckDB (Persistent)

To run the server with a persistent DuckDB store, which will save your knowledge bases to disk:

uv run knowledge_base_mcp duckdb persistent --docs-db-dir ./storage --docs-db-name documents.duckdb --code-db-dir ./storage --code-db-name code.duckdb run
  • --docs-db-dir: Directory to store document knowledge base. Defaults to ./storage.
  • --docs-db-name: Filename for the document knowledge base. Defaults to documents.duckdb.
  • --code-db-dir: Directory to store code knowledge base. Defaults to ./storage.
  • --code-db-name: Filename for the code knowledge base. Defaults to code.duckdb.

DuckDB (In-Memory)

To run the server with an in-memory DuckDB store (data will not be persisted after the server stops):

uv run knowledge_base_mcp duckdb memory run

Elasticsearch

To run the server with Elasticsearch as the backend, ensure you have an Elasticsearch instance running and accessible.

uv run knowledge_base_mcp elasticsearch --url http://localhost:9200 --index-docs-vectors kbmcp-docs-vectors --index-docs-kv kbmcp-docs-kv --index-code-vectors kbmcp-code-vectors --index-code-kv kbmcp-code-kv run

You can also set Elasticsearch options via environment variables:

  • ES_URL: Elasticsearch instance URL (e.g., http://localhost:9200).
  • ES_INDEX_DOCS_VECTORS: Index name for document vectors. Defaults to kbmcp-docs-vectors.
  • ES_INDEX_DOCS_KV: Index name for document key-value store. Defaults to kbmcp-docs-kv.
  • ES_INDEX_CODE_VECTORS: Index name for code vectors. Defaults to kbmcp-code-vectors.
  • ES_INDEX_CODE_KV: Index name for code key-value store. Defaults to kbmcp-code-kv.
  • ES_USERNAME: Username for Elasticsearch authentication.
  • ES_PASSWORD: Password for Elasticsearch authentication.
  • ES_API_KEY: API Key for Elasticsearch authentication.

Main Server Tools/Endpoints

When running, the MCP server exposes the following tools:

Ingestion Tools (from IngestServer)

  • load_website: Create a new knowledge base from a website by crawling seed URLs. If the knowledge base already exists, it will be replaced.
    • Parameters: knowledge_base (name for the new KB), seed_urls (list of URLs to start crawling), url_exclusions (optional list of URLs to exclude), max_pages (optional maximum number of pages to crawl).
  • load_directory: Create a new knowledge base from files within a local directory.
    • Parameters: knowledge_base (name for the new KB), path (path to the directory), exclude (optional file path globs to exclude), extensions (optional list of file extensions to include, defaults to Markdown and AsciiDoc), recursive (optional, whether to recursively gather files, defaults to True).
  • load_git_repository: Create a new knowledge base from a Git repository by cloning it and ingesting its contents.
    • Parameters: knowledge_base (name for the new KB), repository_url (URL of the Git repository), branch (branch to clone), path (path within the repository to ingest), exclude (optional file path globs to exclude), extensions (optional list of file extensions to include, defaults to Markdown and AsciiDoc).

Search Tools (from KnowledgeBaseSearchServer)

  • query: Search across one or more knowledge bases using a natural language question.
    • Parameters: query (the plain language query string), knowledge_bases (optional list of knowledge base names to restrict the search to; searches all if not provided).
  • get_document: Retrieve a specific document from a knowledge base by its title.
    • Parameters: knowledge_base (the name of the knowledge base), title (the title of the document).

Management Tools (from KnowledgeBaseManagementServer)

  • get_knowledge_bases: List all available knowledge bases and their document counts.
  • delete_knowledge_base: Remove a specific knowledge base by its name.
    • Parameters: knowledge_base (the name of the knowledge base to delete).
  • delete_all_knowledge_bases: Remove all knowledge bases from the vector store.
  • get_knowledge_base_stats: Get detailed statistics for a specific knowledge base.
    • Parameters: knowledge_base (the name of the knowledge base).

VS Code McpServer Usage

  1. Open the command palette (Ctrl+Shift+P or Cmd+Shift+P).
  2. Type "Settings" and select "Preferences: Open User Settings (JSON)".
  3. Add the following MCP Server configuration:
{
    "mcp": {
        "servers": {
            "Knowledge Base": {
                "command": "uvx",
                "args": [
                    "knowledge_base_mcp",
                    "duckdb",
                    "persistent",
                    "run"
                ]
            }
        }
    }
}

Roo Code

{
    "mcpServers": {
      "Knowledge Base": {
          "command": "uvx",
          "args": [
              "knowledge_base_mcp",
              "duckdb",
              "persistent",
              "run"
          ]
      }
    }
}

License

See LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

knowledge_base_mcp-0.6.2.tar.gz (1.5 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

knowledge_base_mcp-0.6.2-py3-none-any.whl (81.8 kB view details)

Uploaded Python 3

File details

Details for the file knowledge_base_mcp-0.6.2.tar.gz.

File metadata

  • Download URL: knowledge_base_mcp-0.6.2.tar.gz
  • Upload date:
  • Size: 1.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.8.4

File hashes

Hashes for knowledge_base_mcp-0.6.2.tar.gz
Algorithm Hash digest
SHA256 4c75f868736c17c9d52c1805f034916c4e86d3b9c6b51ef8b7e22f967238fc27
MD5 92aaf965b527bf10c56eaf41977919cf
BLAKE2b-256 a7338e419468cd0b1b5f8dd6c5bbb19fdf8d21b5dd6b8007f1906b75e6e2f1b4

See more details on using hashes here.

File details

Details for the file knowledge_base_mcp-0.6.2-py3-none-any.whl.

File metadata

File hashes

Hashes for knowledge_base_mcp-0.6.2-py3-none-any.whl
Algorithm Hash digest
SHA256 e8976aa9e28b8d6b9c5beba3ea38d4f83c066bfc4d329e256f05d733da33e120
MD5 673ab968baca9f0e14f8d1c1a2f1fa59
BLAKE2b-256 f0e35968b6fbfecb00348b08b1e489c8ab0cf88461f323579fdeb44e76e89d4b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page