OpenZIM MCP - ZIM MCP Server that enables AI models to access and search ZIM format knowledge bases offline

OpenZIM MCP Server

Transform static ZIM archives into dynamic knowledge engines for AI models


🆕 8-tool advanced surface. Phase F (v2.0.0) consolidated 22 advanced tools into 8 (zim_query, zim_search, zim_get, zim_get_section, zim_browse, zim_metadata, zim_links, zim_health). New in v2.1: native libzim archive validation via zim_health(zim_file_path=...), plus archive identity / index introspection in zim_metadata. Release notes → Docs →

OpenZIM MCP is a modern, secure, high-performance Model Context Protocol server that gives AI models structured, offline access to ZIM format knowledge archives — Wikipedia, Wiktionary, Stack Exchange, and the rest of the Kiwix Library.

Built for research assistants, knowledge chatbots, and content-analysis systems that need intelligent access to vast knowledge repositories rather than a raw text dump. The server provides smart navigation by namespace (articles, metadata, media), structure-aware retrieval (sections, tables of contents, related articles), full-text search with suggestions and multi-archive search_all, and link-graph extraction to map content relationships. Cached, paginated operations keep things responsive across massive archives; comprehensive input validation and path-traversal protection keep things safe.
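
To make the path-traversal protection mentioned above concrete, here is a minimal sketch of the general technique: resolve the requested path, then confirm it still sits under the allowed ZIM directory. This is not OpenZIM MCP's actual implementation; the function name resolve_inside and its parameters are hypothetical.

```python
from pathlib import Path


def resolve_inside(allowed_root: str, requested: str) -> Path:
    """Resolve `requested` and refuse anything outside `allowed_root`.

    Hypothetical helper for illustration only.
    """
    root = Path(allowed_root).resolve()
    # Joining and then resolving collapses ".." segments and follows symlinks,
    # so the containment check below sees the real target path.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"path escapes allowed directory: {requested!r}")
    return candidate
```

An absolute path such as /etc/passwd replaces the root when joined, so it fails the same containment check as a ../ sequence does.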

Streamable HTTP transport, per-entry MCP resources with subscriptions, and dual Simple / Advanced modes ship in v2.0.0.

Install

# uv (recommended — isolated CLI tool)
uv tool install openzim-mcp

# pip
pip install openzim-mcp

# Docker (multi-arch image, ghcr.io)
docker pull ghcr.io/cameronrye/openzim-mcp:2.1.1
docker run --rm -v /path/to/zim/files:/zim ghcr.io/cameronrye/openzim-mcp:2.1.1 /zim

Verify the install:

openzim-mcp --help

Download ZIM files from the Kiwix Library into a directory of your choice before running the server.
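
Before starting the server, it can help to confirm that the directory actually contains archives. The sketch below is a hypothetical pre-flight check, not part of openzim-mcp; it assumes archives use the .zim extension and sit directly in the directory.

```python
from pathlib import Path


def list_zim_files(directory: str) -> list[str]:
    """Return the sorted names of *.zim files directly inside `directory`."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(directory)
    # Assumption: archives end in ".zim"; subdirectories are not searched.
    return sorted(p.name for p in root.glob("*.zim") if p.is_file())
```

An empty list means there is nothing for the server to serve yet.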

Quick start

Run the server in Simple mode (default — exposes one natural-language tool, zim_query):

openzim-mcp /path/to/zim/files

Wire it into your MCP client. Example for Claude Desktop's claude_desktop_config.json (any MCP client that speaks stdio works the same way):

{
  "mcpServers": {
    "openzim-mcp": {
      "command": "openzim-mcp",
      "args": ["/path/to/zim/files"]
    }
  }
}

Once the client connects, ask your LLM something like "summarize the article on Photosynthesis"; zim_query dispatches to the right underlying tool automatically.

For full control, run in Advanced mode to expose all 8 specialized tools:

{
  "mcpServers": {
    "openzim-mcp-advanced": {
      "command": "openzim-mcp",
      "args": ["--mode", "advanced", "/path/to/zim/files"]
    }
  }
}
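
The same two launch configurations can be exercised from a script. The sketch below assumes the official MCP Python SDK (the mcp package) is installed and that openzim-mcp is on PATH; it starts the server over stdio and lists the tool names it exposes. The helper names server_args and list_exposed_tools are illustrative, not part of either project.

```python
def server_args(zim_dir: str, mode: str = "simple") -> list[str]:
    """Mirror the "args" arrays from the two JSON configs above."""
    if mode == "advanced":
        return ["--mode", "advanced", zim_dir]
    return [zim_dir]


async def list_exposed_tools(zim_dir: str, mode: str = "simple") -> list[str]:
    """Launch openzim-mcp over stdio and return the names of its tools."""
    # Imported lazily so this module still loads where the SDK is absent.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(
        command="openzim-mcp", args=server_args(zim_dir, mode)
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]
```

Run it with asyncio.run(list_exposed_tools("/path/to/zim/files")), and pass mode="advanced" to compare the two tool surfaces.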

For HTTP transport (long-running service with bearer auth, CORS, and health endpoints) see HTTP & Docker deployment.

Highlights

  • 8-tool advanced surface: zim_query, zim_search, zim_get, zim_get_section, zim_browse, zim_metadata, zim_links, zim_health. Down from 22; advanced-mode schema drops from ~36KB to ~23.5KB, clearing the MCP Tax pain band. API reference →
  • Streamable HTTP transport — bearer-token auth, CORS, health endpoints, multi-arch Docker image. HTTP & Docker deployment →
  • Per-entry MCP resources + subscriptions: zim://{name}/entry/{path} with native MIME types; clients subscribe and receive notifications/resources/updated when archives change. Resources, prompts & subscriptions →
  • Simple-mode zim_query — one natural-language tool that dispatches to the right operation, tuned for small-model deployment targets. Quick start →
  • Native libzim introspection (v2.1): zim_health(zim_file_path=...) validates an archive's integrity (Archive.check() + checksum), and zim_metadata now reports archive identity, full-text / title index capabilities, and an M/Counter mimetype breakdown. API reference →
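
The per-entry resource template zim://{name}/entry/{path} from the list above can be illustrated with a small builder and parser. This is a sketch under assumptions: the function names are hypothetical, and whether the server percent-encodes names and paths this way is not stated in this README.

```python
from urllib.parse import quote, unquote

SCHEME = "zim://"
MARKER = "/entry/"


def build_entry_uri(archive_name: str, entry_path: str) -> str:
    """Fill in the zim://{name}/entry/{path} template."""
    # Assumption: both parts are percent-encoded; "/" stays literal in the path.
    name = quote(archive_name, safe="")
    path = quote(entry_path, safe="/")
    return f"{SCHEME}{name}{MARKER}{path}"


def parse_entry_uri(uri: str) -> tuple[str, str]:
    """Split a resource URI back into (archive name, entry path)."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"not a zim:// URI: {uri!r}")
    # The encoded name contains no "/", so the first marker is the separator.
    name, sep, path = uri[len(SCHEME):].partition(MARKER)
    if not sep or not name or not path:
        raise ValueError(f"expected zim://{{name}}/entry/{{path}}: {uri!r}")
    return unquote(name), unquote(path)
```

For example, build_entry_uri("wikipedia_en", "A/Photosynthesis") yields zim://wikipedia_en/entry/A/Photosynthesis; both the archive name and the entry path here are made up for illustration.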

Modes

OpenZIM MCP ships two modes; pick one per client.

Simple mode (default) exposes a single intelligent tool, zim_query, that parses natural-language requests and dispatches to the right underlying operation. Built for small-model deployment targets — the wire footprint is minimal and the dispatch happens server-side, not in the LLM context. Start here unless you have a specific reason not to.
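
As a rough picture of what server-side dispatch means, the toy router below maps a natural-language request to one of the tool names listed in this README. It is a deliberately simplified sketch: the keyword patterns, the default, and the mapping of intents to tools are all invented for illustration and say nothing about how zim_query is actually implemented.

```python
import re

# Hypothetical routing table, checked in order. Tool names come from the
# README; every pattern and pairing here is an assumption.
ROUTES = [
    (re.compile(r"\b(health|validate|integrity)\b", re.I), "zim_health"),
    (re.compile(r"\b(metadata|index)\b", re.I), "zim_metadata"),
    (re.compile(r"\b(links?|related)\b", re.I), "zim_links"),
    (re.compile(r"\b(section|table of contents)\b", re.I), "zim_get_section"),
    (re.compile(r"\b(browse|namespace)\b", re.I), "zim_browse"),
    (re.compile(r"\b(article|summarize|show|read)\b", re.I), "zim_get"),
]
DEFAULT_TOOL = "zim_search"  # assumed fallback when nothing matches


def route(request: str) -> str:
    """Return the first tool whose pattern matches the request."""
    for pattern, tool in ROUTES:
        if pattern.search(request):
            return tool
    return DEFAULT_TOOL
```

Because the routing runs on the server, the client only ever sees a single tool schema, which is the property Simple mode relies on for small models.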

Advanced mode exposes all 8 specialized tools (zim_query, zim_search, zim_get, zim_get_section, zim_browse, zim_metadata, zim_links, zim_health) plus 3 MCP prompts (/research, /summarize, /explore) and per-entry resources. Built for larger models that can reliably dispatch over the full schema, and for clients that want fine-grained control over pagination, namespace browsing, and link-graph extraction.

Rule of thumb: models ≤ 13B parameters benefit from Simple mode; larger models (Claude Sonnet/Opus, GPT-4o-class, Llama 70B+) can dispatch Advanced mode directly. See LLM integration patterns for guidance on choosing.

Documentation

Full documentation lives at https://cameronrye.github.io/openzim-mcp/docs/.

  • Get started: Introduction · Installation · Quick start
  • Reference: API reference · Configuration · Resources, prompts & subscriptions
  • Guides: LLM integration patterns · Smart retrieval · HTTP & Docker deployment · Performance optimization · Security best practices · Worked examples
  • Operations: Troubleshooting · FAQ · Architecture overview

Project status

v2.0.0 GA shipped 2026-05-27. v1.x is in maintenance mode — security fixes, data-corruption fixes, and pre-v2.0.0 crash fixes accepted through 2026-11-27 or until v2.5.0 ships, whichever comes first. Full release history: CHANGELOG.md.

Contributing

See CONTRIBUTING.md for development setup, test commands, code style, and the release process.

Security

See SECURITY.md for the vulnerability disclosure policy. No known CVEs.

License

MIT. See LICENSE.

Acknowledgments


Made with ❤️ by Cameron Rye

