Context-aware web fetching MCP server that respects token limits

Smart WebFetch MCP Server

Context-aware web fetching for LLMs. Prevents context window flooding by checking page size before fetching and providing surgical extraction tools.

The Problem

Standard web fetch tools dump entire pages into the context window, often:

  • Exceeding token limits
  • Wasting context on navigation, footers, ads
  • Flooding the model with irrelevant content

The Solution

Smart WebFetch provides five tools for intelligent web fetching:

Tool                Purpose
web_preflight       Check page size before fetching
web_smart_fetch     Fetch with automatic truncation
web_fetch_code      Extract only code blocks
web_fetch_section   Fetch specific heading/section
web_fetch_chunked   Paginated fetching for large docs

Installation

# Install from PyPI
pip install smart-webfetch-mcp

# Or with uvx (recommended for MCP)
uvx smart-webfetch-mcp

Configuration

OpenCode

Add to your opencode.json:

{
  "mcp": {
    "smart-webfetch": {
      "type": "local",
      "command": ["uvx", "smart-webfetch-mcp"],
      "enabled": true
    }
  }
}

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "smart-webfetch": {
      "command": "uvx",
      "args": ["smart-webfetch-mcp"]
    }
  }
}

Usage Examples

Check before fetching

Use web_preflight to check https://docs.python.org/3/library/asyncio.html

Response:

{
  "url": "https://docs.python.org/3/library/asyncio.html",
  "estimated_tokens": 45000,
  "safe_for_context": false,
  "recommendation": "Very large page (~45,000 tokens). Use web_fetch_section or web_fetch_chunked."
}
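
A client can use the preflight response to decide which tool to call next. The following is a minimal sketch of that routing, not part of the package: the function name choose_tool and the have_heading flag are hypothetical, and only the safe_for_context field is taken from the response above.

```python
def choose_tool(preflight: dict, have_heading: bool = False) -> str:
    """Pick a follow-up tool name from a web_preflight response.

    Hypothetical helper: the routing rules below are an assumption,
    loosely following the recommendation text in the response.
    """
    if preflight.get("safe_for_context"):
        # Small page: a plain fetch fits in the context window.
        return "web_smart_fetch"
    if have_heading:
        # Large page, but the caller knows which section it wants.
        return "web_fetch_section"
    # Large page with no known target: read it page by page.
    return "web_fetch_chunked"


response = {
    "url": "https://docs.python.org/3/library/asyncio.html",
    "estimated_tokens": 45000,
    "safe_for_context": False,
    "recommendation": "Very large page (~45,000 tokens). "
                      "Use web_fetch_section or web_fetch_chunked.",
}
print(choose_tool(response))
```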

Fetch with automatic truncation

Use web_smart_fetch on https://example.com/docs with max_tokens=4000

Extract only code examples

Use web_fetch_code on https://docs.python.org/3/library/asyncio-task.html

Get specific section

Use web_fetch_section on https://docs.python.org/3/library/asyncio.html 
with heading="Running an asyncio Program"

Paginated reading

Use web_fetch_chunked on https://large-docs.com/api with chunk=0, chunk_size=4000

Then continue with chunk=1, chunk=2, etc.

Tool Reference

web_preflight

Check page metadata before fetching.

Parameters:

  • url (required): URL to check

Returns:

  • estimated_tokens: Approximate token count
  • content_type: MIME type
  • is_html: Whether content is HTML
  • title: Page title (if HTML)
  • safe_for_context: Boolean (true if estimated_tokens is below 8000)
  • recommendation: Human-readable advice
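
To illustrate how estimated_tokens and safe_for_context relate, here is a rough sketch. It assumes a heuristic of about 4 characters per token, which is a common rule of thumb and not necessarily what the server uses; the names estimate_tokens and preflight_summary are hypothetical. Only the 8000-token threshold comes from the reference above.

```python
SAFE_TOKEN_LIMIT = 8000  # threshold stated in the reference above
CHARS_PER_TOKEN = 4      # assumption: rough rule of thumb for English text


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length (heuristic only)."""
    return len(text) // CHARS_PER_TOKEN


def preflight_summary(url: str, text: str) -> dict:
    """Build a reduced preflight-style result for already-downloaded text."""
    tokens = estimate_tokens(text)
    return {
        "url": url,
        "estimated_tokens": tokens,
        "safe_for_context": tokens < SAFE_TOKEN_LIMIT,
    }


print(preflight_summary("https://example.com/docs", "word " * 1000))
```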

web_smart_fetch

Fetch with automatic truncation for large pages.

Parameters:

  • url (required): URL to fetch
  • max_tokens (optional, default 8000): Maximum tokens to return
  • strategy (optional, default "auto"): "auto" cuts at a natural break point; "truncate" cuts hard at the limit

Returns: Markdown content with metadata header
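
The difference between the two strategies can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: it treats paragraph breaks and sentence ends as "natural break points" and converts max_tokens to characters with an assumed 4 characters per token. The function name truncate is hypothetical.

```python
def truncate(text: str, max_tokens: int = 8000, strategy: str = "auto",
             chars_per_token: int = 4) -> str:
    """Cut text to roughly max_tokens (assumed chars-per-token heuristic)."""
    budget = max_tokens * chars_per_token
    if len(text) <= budget:
        return text  # already within the limit
    cut = text[:budget]
    if strategy == "auto":
        # Prefer the last paragraph break, then the last sentence end.
        for marker in ("\n\n", ". "):
            pos = cut.rfind(marker)
            if pos > 0:
                return cut[:pos + len(marker)].rstrip()
    # "truncate" strategy, or no break point found: hard cut.
    return cut


text = "First paragraph.\n\nSecond paragraph that is long."
print(repr(truncate(text, max_tokens=6, strategy="auto")))
print(repr(truncate(text, max_tokens=6, strategy="truncate")))
```

With "auto" the result ends on a complete paragraph, at the cost of returning somewhat less than the budget allows.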

web_fetch_code

Extract only code blocks from a page.

Parameters:

  • url (required): URL to extract code from

Returns: Code blocks with language annotations and context
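
As a rough idea of what code-only extraction involves, the sketch below collects the text of every <pre> element using the standard library's html.parser. It assumes code blocks live in <pre> tags and leaves out the language annotations and surrounding context that the tool returns; the names PreBlockExtractor and extract_code_blocks are hypothetical.

```python
from html.parser import HTMLParser


class PreBlockExtractor(HTMLParser):
    """Collect the text content of <pre> elements (assumed code blocks)."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._depth = 0    # nesting depth of open <pre> tags
        self._buffer = []  # text pieces of the block being read

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.blocks.append("".join(self._buffer).strip())
                self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside a <pre> element.
        if self._depth > 0:
            self._buffer.append(data)


def extract_code_blocks(html: str) -> list:
    parser = PreBlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


page = (
    "<nav>Menu</nav>"
    "<pre><code>import asyncio\nasyncio.run(main())</code></pre>"
    "<p>Some prose.</p>"
    "<pre>x &lt; 1</pre>"
)
print(extract_code_blocks(page))
```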

web_fetch_section

Fetch content under a specific heading.

Parameters:

  • url (required): URL to fetch from
  • heading (required): Heading text to find (case-insensitive)

Returns: Section content, or a list of available sections if the heading is not found
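
The lookup described above can be sketched on Markdown text. This is an assumption-laden illustration rather than the server's code: it matches "#"-style headings only, compares titles case-insensitively, and treats a section as everything up to the next heading of the same or higher level. The function name fetch_section and the result keys are hypothetical.

```python
import re

# Matches Markdown ATX headings such as "## Install".
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def fetch_section(markdown: str, heading: str) -> dict:
    """Return the section under a heading, or the list of known headings."""
    lines = markdown.splitlines()
    headings = []  # (line index, level, title)
    for i, line in enumerate(lines):
        match = HEADING.match(line)
        if match:
            headings.append((i, len(match.group(1)), match.group(2)))

    wanted = heading.strip().lower()
    for pos, (start, level, title) in enumerate(headings):
        if title.lower() == wanted:
            end = len(lines)
            # The section ends at the next heading of equal or higher level.
            for next_start, next_level, _ in headings[pos + 1:]:
                if next_level <= level:
                    end = next_start
                    break
            return {"found": True,
                    "content": "\n".join(lines[start:end]).strip()}
    return {"found": False,
            "available_sections": [title for _, _, title in headings]}


doc = ("# Guide\n\nIntro.\n\n## Install\n\npip install x\n\n"
       "### Extras\n\nmore\n\n## Usage\n\nrun it\n")
print(fetch_section(doc, "install")["content"])
print(fetch_section(doc, "missing"))
```

Note that this simple version would also treat a "#" comment line inside a code block as a heading, so a real implementation needs to skip code blocks.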

web_fetch_chunked

Fetch large documents in chunks.

Parameters:

  • url (required): URL to fetch
  • chunk (optional, default 0): Chunk index (0-based)
  • chunk_size (optional, default 4000): Tokens per chunk

Returns: Chunk content with navigation metadata
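
Chunked reading with a 0-based index can be sketched like this. The metadata keys (total_chunks, has_next) are hypothetical stand-ins for the navigation metadata, and the 4 characters per token conversion is an assumption; the chunk and chunk_size parameters and their defaults follow the reference above.

```python
import math


def get_chunk(text: str, chunk: int = 0, chunk_size: int = 4000,
              chars_per_token: int = 4) -> dict:
    """Return one chunk of text plus simple navigation metadata."""
    span = chunk_size * chars_per_token  # characters per chunk (heuristic)
    total = max(1, math.ceil(len(text) / span))
    if not 0 <= chunk < total:
        raise IndexError(f"chunk {chunk} out of range (0..{total - 1})")
    start = chunk * span
    return {
        "chunk": chunk,
        "total_chunks": total,
        "has_next": chunk + 1 < total,
        "content": text[start:start + span],
    }


text = "a" * 10 + "b" * 10 + "c" * 5
index = 0
while True:
    part = get_chunk(text, chunk=index, chunk_size=10, chars_per_token=1)
    print(part["chunk"], part["total_chunks"], part["content"])
    if not part["has_next"]:
        break
    index += 1
```

The loop mirrors the usage pattern of requesting chunk=0, then chunk=1, and so on until no further chunk remains.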

Development

# Clone and install dev dependencies
git clone https://github.com/mathisto/smart-webfetch-mcp
cd smart-webfetch-mcp
pip install -e ".[dev]"

# Run tests
pytest

# Format code
ruff format .
ruff check --fix .

License

MIT
