
Progressive-retrieval MCP server for Wikipedia

mcp-server-wikipedia 📚

This project exposes Wikipedia as an MCP server using a Progressive Retrieval Strategy. It is designed to minimize token usage by allowing LLMs to "scout" information before fetching large bodies of text.

The Problem: Token Waste

Wikipedia integrations often fetch multiple full pages up front, then decide what mattered. This fills the context window with irrelevant data and increases latency and cost.

The Solution: The Librarian Philosophy

This server implements a "Progressive Retrieval Ladder." Like a librarian helping you find a specific book, it encourages the model to:

  1. Search for several candidate titles.
  2. Summarize the candidates to find the right one.
  3. Inspect the TOC to find the relevant section.
  4. Fetch only the relevant section, falling back to the full page only if necessary.

graph TD
    A[Search Articles] --> B[Get Summaries]
    B --> C{Correct Page?}
    C -- No --> A
    C -- Yes --> D[Get TOC]
    D --> E[Get Section / Page]
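
The ladder above can be sketched in a few lines of Python. This is a minimal illustration, not the server's implementation: the in-memory `FAKE_WIKI` data, the return shapes of the five functions, and the keyword-matching helper `climb_ladder` are all assumptions made so the example runs without network access.

```python
# Hypothetical in-memory stand-ins for the server's tools. The real tools
# query Wikipedia, and their return shapes may differ from these.
FAKE_WIKI = {
    "Photosynthesis": {
        "summary": "Process by which plants convert light into chemical energy.",
        "sections": {
            "Overview": "Photosynthesis occurs in chloroplasts.",
            "Light-dependent reactions": "These reactions produce ATP, NADPH and oxygen.",
        },
    },
    "Photosynthesis (album)": {
        "summary": "A music album.",
        "sections": {"Track listing": "Ten tracks."},
    },
}


def search_articles(query, limit=5):
    """Return titles that contain any word of the query."""
    words = query.lower().split()
    hits = [t for t in FAKE_WIKI if any(w in t.lower() for w in words)]
    return hits[:limit]


def get_summaries(titles):
    return {t: FAKE_WIKI[t]["summary"] for t in titles}


def get_toc(title):
    return list(FAKE_WIKI[title]["sections"])


def get_section(title, section):
    return FAKE_WIKI[title]["sections"][section]


def get_page(title):
    return "\n".join(FAKE_WIKI[title]["sections"].values())


def climb_ladder(query, page_keyword, section_keyword):
    """Search -> summarize -> TOC -> section, with the full page as a last resort."""
    candidates = search_articles(query)
    summaries = get_summaries(candidates)
    # Pick the first candidate whose summary mentions the page keyword.
    title = next((t for t, s in summaries.items() if page_keyword in s.lower()), None)
    if title is None:
        return None  # a caller would refine the query and search again
    for heading in get_toc(title):
        if section_keyword in heading.lower():
            return get_section(title, heading)
    return get_page(title)


print(climb_ladder("photosynthesis", "plants", "light-dependent"))
```

In practice the model, not a keyword match, decides which candidate and which section are relevant; the sketch only shows the order of the calls and where the full-page fallback sits.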

Tools

  • search_articles(query, limit=5): Top matching pages with snippets.
  • get_summaries(titles): Compact summaries for multiple candidate pages.
  • get_toc(title): Table of contents / section map for a page.
  • get_section(title, section): Retrieve a single section by index or title.
  • get_page(title): Retrieve the full plain-text page.
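
As a rough illustration of how a client invokes one of these tools, MCP sends a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request with the standard library only. The helper name `tool_call` is hypothetical, and the argument keys are assumed to match the parameter names in the list above; the server's actual input schema may differ.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator for this sketch


def tool_call(name, **arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumed argument names, taken from the tool signatures listed above.
request = tool_call(
    "get_section",
    title="Photosynthesis",
    section="Light-dependent reactions",
)
print(json.dumps(request, indent=2))
```

An MCP client library normally builds and sends this message for you; the payload is shown here only to make the tool names and parameters concrete.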

Token Efficiency Benchmark

In deterministic testing, this progressive strategy achieves up to 80% token reduction compared to naive full-page retrieval. Detailed results can be found in BENCHMARK.md.

| Strategy | Token Usage (Avg) |
|---|---|
| Naive (Full Page) | ~100% |
| MCP (Progressive) | ~20% |
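
To make the relationship between relative usage and reduction explicit, here is a small worked calculation. The function name `token_reduction` and the token counts are illustrative assumptions, not figures taken from BENCHMARK.md.

```python
def token_reduction(naive_tokens: int, progressive_tokens: int) -> float:
    """Fraction of tokens saved relative to naive full-page retrieval."""
    if naive_tokens <= 0:
        raise ValueError("naive_tokens must be positive")
    return 1 - progressive_tokens / naive_tokens


# Illustrative numbers only: 2,000 tokens instead of 10,000 is ~20% usage,
# which corresponds to an 80% reduction.
print(f"{token_reduction(10_000, 2_000):.0%}")
```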

Quick Start

Installation

From PyPI:

pip install mcp-server-wikipedia

Or run it directly via npx (if using the JS wrapper) or via the Python module entry point:

python -m mcp_server_wikipedia

For development:

git clone https://github.com/surendranb/wikipedia-mcp-server.git
cd wikipedia-mcp-server
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Run

wikipedia-mcp-server

MCP Client Configuration

Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "wikipedia": {
      "command": "wikipedia-mcp-server"
    }
  }
}
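
If the config file already lists other servers, the entry needs to be merged rather than pasted over them. The following is a minimal sketch of that merge on an in-memory dictionary; the helper `add_wikipedia_server` and the `other` server entry are hypothetical, and reading or writing the actual file is left out because its location depends on your platform.

```python
import json


def add_wikipedia_server(config: dict) -> dict:
    """Return a copy of config with the wikipedia server registered."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers
    servers["wikipedia"] = {"command": "wikipedia-mcp-server"}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_wikipedia_server(existing), indent=2))
```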

Cursor / VS Code

Specify the wikipedia-mcp-server command in your MCP settings.

Example Prompts

  • "Search for 'photosynthesis light dependent reactions' and summarize the top 3 candidates."
  • "What molecules are produced during the light-dependent reactions of photosynthesis? Search first, then fetch only the relevant section."

Development

Run tests:

python -m unittest discover -s tests -p "test_*.py" -v

Run benchmarks:

pip install -e ".[benchmark]"
python scripts/benchmark_token_efficiency.py

Contributing

We value simplicity and surgical efficiency. If you have an improvement that maintains the single-file architecture and enhances retrieval precision, we welcome your input. See CONTRIBUTING.md.

License

MIT License. See LICENSE for details.
