MCP server providing offline search capabilities through ZIM files

MCP ZIM Server

An MCP (Model Context Protocol) server that provides offline search and content extraction capabilities for Large Language Models (LLMs) using ZIM files. The server lets LLMs carry out research and retrieve information in offline environments, with no need for live web access.

Features

  • Offline Search: Full-text search across millions of articles within ZIM files.
  • Content Extraction: Extract content from ZIM entries and return it as text, HTML, or raw data.
  • ZIM File Discovery: Automatically discover ZIM files in a specified directory.
  • Caching: In-memory caching for archives, search results, and file info to improve performance.
  • Configurable: Easily configurable through environment variables.

Requirements

  • Python 3.10+
  • pip or uv for package installation

Getting Started

Run the Python package as a CLI command using uv:

uvx zim-mcp # see --help for more options

Build/Install from GitHub

  1. Clone the repository or download the source code:

    git clone https://github.com/mobilemutex/zim-mcp.git
    cd zim-mcp
    
  2. Run with uv:

    uv run zim-mcp
    

    or with pipx:

    pipx install .
    zim-mcp
    

    or with pip and venv:

    python3 -m venv .venv
    source .venv/bin/activate
    pip install .
    zim-mcp
    

Configuration

The server can be configured using the following environment variables:

  • ZIM_FILES_DIRECTORY: The directory where your ZIM files are stored. (Default: ./zim_files)
  • MAX_SEARCH_RESULTS: The maximum number of search results to return per query. (Default: 100)
  • SEARCH_TIMEOUT: The timeout for search operations in seconds. (Default: 30)
  • LOG_LEVEL: The logging level for the server. (Default: INFO)
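
As a rough illustration of how these variables and their defaults fit together, the sketch below reads them from the environment. The `ZimServerConfig` class and `load_config` function are hypothetical names used only for this example; the server's own configuration code may look different.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ZimServerConfig:
    """Hypothetical container for the four documented settings."""
    zim_files_directory: str
    max_search_results: int
    search_timeout: int
    log_level: str


def load_config(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ZimServerConfig(
        zim_files_directory=env.get("ZIM_FILES_DIRECTORY", "./zim_files"),
        max_search_results=int(env.get("MAX_SEARCH_RESULTS", "100")),
        search_timeout=int(env.get("SEARCH_TIMEOUT", "30")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


# Override two settings and keep the defaults for the rest.
config = load_config({"ZIM_FILES_DIRECTORY": "/data/zim", "SEARCH_TIMEOUT": "10"})
print(config)
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to try without touching the real environment.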

Tools

list_zim_files

Lists all available ZIM files in the configured directory.

  • list_zim_files()
    • Parameters: None
    • Returns: A dictionary containing a list of ZIM files with their metadata.

get_zim_metadata

Gets detailed metadata about a specific ZIM file.

  • get_zim_metadata(zim_file: str)
    • Parameters:
      • zim_file (str): The name of the ZIM file.
    • Returns: A dictionary containing detailed metadata for the specified ZIM file.

search_zim_files

Searches for content across one or multiple ZIM files.

  • search_zim_files(query: str, zim_files: Optional[List[str]], max_results: int, start_offset: int)
    • Parameters:
      • query (str): The search query.
      • zim_files (Optional[List[str]]): A list of ZIM files to search. If not provided, all files are searched.
      • max_results (int): The maximum number of results to return. (Default: 20)
      • start_offset (int): The pagination offset. (Default: 0)
    • Returns: A dictionary containing the search results.
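
To make the parameters above concrete, here is a minimal sketch of the JSON-RPC `tools/call` message an MCP client would send for this tool. The `build_search_request` helper and the file name `example.zim` are placeholders for illustration, and the sketch assumes that `start_offset` counts results rather than pages, which this document does not state explicitly.

```python
import json


def build_search_request(request_id, query, zim_files=None, max_results=20, start_offset=0):
    """Build a JSON-RPC 2.0 `tools/call` message for search_zim_files."""
    arguments = {
        "query": query,
        "max_results": max_results,
        "start_offset": start_offset,
    }
    # Leave zim_files out entirely so that all files are searched.
    if zim_files is not None:
        arguments["zim_files"] = list(zim_files)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_zim_files", "arguments": arguments},
    }


# Two consecutive pages of 20 results each (assumes a result-based offset).
page_size = 20
pages = [
    build_search_request(i + 1, "solar energy", ["example.zim"], page_size, i * page_size)
    for i in range(2)
]
print(json.dumps(pages[1], indent=2))
```

In practice an MCP client library builds this message for you; the sketch only shows which arguments end up on the wire.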

read_zim_entry

Reads the content of a specific entry from a ZIM file.

  • read_zim_entry(zim_file: str, entry_path: str, format: str)
    • Parameters:
      • zim_file (str): The name of the ZIM file.
      • entry_path (str): The path to the entry.
      • format (str): The output format (text, html, raw). (Default: text)
    • Returns: A dictionary containing the entry's content.

search_and_extract_content

Performs a search and returns the full content of the matching entries.

  • search_and_extract_content(query: str, ...)
    • Parameters: Similar to search_zim_files, with additional content formatting options.
    • Returns: A dictionary containing the search results with their full content.

browse_zim_entries

Browses entries by path or title patterns.

  • browse_zim_entries(zim_file: str, ...)
    • Parameters:
      • zim_file (str): The ZIM file to browse.
      • path_pattern (Optional[str]): A pattern to match against entry paths.
      • title_pattern (Optional[str]): A pattern to match against entry titles.
      • limit (int): The maximum number of entries to return.
    • Returns: A list of matching entries.

get_random_entries

Gets a specified number of random entries from ZIM files.

  • get_random_entries(zim_files: Optional[List[str]], count: int)
    • Parameters:
      • zim_files (Optional[List[str]]): A list of ZIM files to get entries from.
      • count (int): The number of random entries to return.
    • Returns: A list of random entries.

Resource Endpoints

The server also exposes the following resource endpoints:

  • zim://files: Lists all available ZIM files.
  • zim://file/{filename}/metadata: Provides metadata for a specific ZIM file.
  • zim://file/{filename}/entry/{path}: Provides the content of a specific entry.
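
The sketch below shows one way a client might fill in these URI templates and wrap the result in a JSON-RPC `resources/read` message. The helper names and the file name `example.zim` are made up for this example, and percent-encoding the filename and path is an assumption; this document does not say how the server expects special characters to be written.

```python
from urllib.parse import quote


def files_uri():
    """URI that lists all available ZIM files."""
    return "zim://files"


def metadata_uri(filename):
    """URI for one file's metadata (percent-encoding is an assumption)."""
    return f"zim://file/{quote(filename, safe='')}/metadata"


def entry_uri(filename, path):
    """URI for one entry; slashes inside the entry path are kept as-is."""
    return f"zim://file/{quote(filename, safe='')}/entry/{quote(path, safe='/')}"


def build_read_request(request_id, uri):
    """Build a JSON-RPC 2.0 `resources/read` message for the given URI."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


request = build_read_request(1, entry_uri("example.zim", "A/Solar power"))
print(request["params"]["uri"])
```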

Usage

This Python package is published to PyPI as zim-mcp and can be installed and run with pip, pipx, uv, poetry, or any Python package manager.

$ pipx install zim-mcp
$ zim-mcp --help

usage: zim-mcp [-h] [--transport {stdio,streamable-http,sse}] [--port PORT]

MCP ZIM Server

options:
  -h, --help            show this help message and exit
  --transport {stdio,streamable-http,sse}
                        Transport type (default: stdio)
  --port PORT           Port for SSE transport (default: 8000)

Running the Server

  1. Place your ZIM files in the directory specified by the ZIM_FILES_DIRECTORY environment variable (or the default ./zim_files directory).

  2. Run the server using one of the supported transports:

    • Standard I/O (stdio):

      zim-mcp --transport stdio
      
    • Server-Sent Events (SSE):

      zim-mcp --transport sse --port 8000
      
    • Streamable HTTP:

      zim-mcp --transport streamable-http
      

Using with OpenWeb-UI and MCPO

You can integrate zim-mcp with OpenWeb-UI using MCPO, an MCP-to-OpenAPI proxy. This allows you to expose zim-mcp's tools through a standard RESTful API, making them accessible to web interfaces and other tools.

With uvx

You can run zim-mcp and mcpo together using uvx:

uvx mcpo -- zim-mcp

Standard Input/Output (stdio)

The stdio transport enables communication through standard input and output streams. This is particularly useful for local integrations and command-line tools. See the spec for more details.

Python

zim-mcp

By default, the Python package will run in stdio mode. Because it's using the standard input and output streams, it will look like the tool is hanging without any output, but this is expected.
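
The apparent hang happens because the server is waiting for messages on standard input. In the MCP stdio transport, each JSON-RPC message is written as a single line terminated by a newline. The following sketch shows that framing in isolation; `encode_message` and `decode_messages` are illustrative helper names, not part of zim-mcp.

```python
import json


def encode_message(message):
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    line = json.dumps(message, separators=(",", ":"))
    # json.dumps escapes newlines inside strings, so the line itself has none.
    assert "\n" not in line
    return (line + "\n").encode("utf-8")


def decode_messages(data):
    """Split a byte buffer into complete messages plus any trailing partial line."""
    *lines, remainder = data.split(b"\n")
    return [json.loads(line) for line in lines if line.strip()], remainder


list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
wire = encode_message(list_tools)
messages, leftover = decode_messages(wire + b'{"jsonrpc"')
print(messages, leftover)
```

A real client would write `wire` to the server process's standard input and feed whatever arrives on its standard output through something like `decode_messages`, keeping the leftover bytes for the next read.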

Streamable HTTP

Streamable HTTP carries JSON-RPC messages over HTTP POST requests and lets the server stream its responses. See the spec for more details.

By default, the server listens on 127.0.0.1:8000/mcp for client connections. To change any of this, set FASTMCP_* environment variables. The server must be running for clients to connect to it.

Python

zim-mcp --transport streamable-http

By default, the Python package will run in stdio mode, so you will have to include --transport streamable-http.
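
For a sense of what a client sends to the default 127.0.0.1:8000/mcp endpoint, here is a minimal sketch that builds an MCP `initialize` request with the standard library. The protocol version string and the client name are assumptions chosen for the example; a real client library negotiates these for you.

```python
import json
import urllib.request

MCP_URL = "http://127.0.0.1:8000/mcp"  # default address stated above


def build_initialize_request(url=MCP_URL):
    """Build (but do not send) the first JSON-RPC request of an MCP session."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: adjust to your client
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The client must accept both plain JSON and an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def send(request, timeout=10):
    """Send the request; only works while the server is running."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


request = build_initialize_request()
print(request.get_method(), request.full_url)
```

Calling `send(request)` only succeeds once the server has been started with the streamable-http transport.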

Server-sent events (SSE)

Warning: The MCP community considers SSE a legacy transport protocol, intended mainly for backwards compatibility. Streamable HTTP is the recommended replacement.

SSE transport uses Server-Sent Events for server-to-client streaming and HTTP POST requests for client-to-server communication. See the spec for more details.

By default, the server listens on 127.0.0.1:8000/sse for client connections. To change any of this, set FASTMCP_* environment variables. The server must be running for clients to connect to it.

Python

zim-mcp --transport sse

By default, the Python package will run in stdio mode, so you will have to include --transport sse.

Integrations

Claude Desktop, Roo Code, etc.

Add the following JSON block to your claude_desktop_config.json or mcp.json file:

{
  "mcpServers": {
    "zim-mcp": {
      "command": "uvx",
      "args": ["zim-mcp", "--transport", "stdio"],
      "env": {
        "LOG_LEVEL": "INFO",
        "ZIM_FILES_DIRECTORY": "~/zim_files"
      }
    }
  }
}
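
Nothing in this document says whether the client or the server expands `~` in the `ZIM_FILES_DIRECTORY` value, so an absolute path may be the safer choice. The sketch below generates the same block with the path resolved; `build_client_config` is a hypothetical helper, and it uses the `zim-mcp` command name from the Getting Started section.

```python
import json
import os


def build_client_config(zim_dir="~/zim_files", log_level="INFO"):
    """Return the mcpServers block with the ZIM directory made absolute."""
    resolved_dir = os.path.abspath(os.path.expanduser(zim_dir))
    return {
        "mcpServers": {
            "zim-mcp": {
                "command": "uvx",
                "args": ["zim-mcp", "--transport", "stdio"],
                "env": {
                    "LOG_LEVEL": log_level,
                    "ZIM_FILES_DIRECTORY": resolved_dir,
                },
            }
        }
    }


# Print the block so it can be pasted into the client's config file.
print(json.dumps(build_client_config(), indent=2))
```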

Development

Contributions are welcome! If you want to contribute to the development of the MCP ZIM Server, please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Make your changes and write tests.
  4. Submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.
