Context-aware web fetching MCP server that respects token limits
Smart WebFetch MCP Server
Context-aware web fetching for LLMs. Prevents context window flooding by checking page size before fetching and providing surgical extraction tools.
The Problem
Standard web fetch tools dump entire pages into the context window, often:
- Exceeding token limits
- Wasting context on navigation, footers, ads
- Flooding the model with irrelevant content
The Solution
Smart WebFetch provides five tools for intelligent web fetching:
| Tool | Purpose |
|---|---|
| web_preflight | Check page size before fetching |
| web_smart_fetch | Fetch with automatic truncation |
| web_fetch_code | Extract only code blocks |
| web_fetch_section | Fetch specific heading/section |
| web_fetch_chunked | Paginated fetching for large docs |
Installation
# Install from PyPI
pip install smart-webfetch-mcp
# Or with uvx (recommended for MCP)
uvx smart-webfetch-mcp
Configuration
OpenCode
Add to your opencode.json:
{
  "mcp": {
    "smart-webfetch": {
      "type": "local",
      "command": ["uvx", "smart-webfetch-mcp"],
      "enabled": true
    }
  }
}
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "smart-webfetch": {
      "command": "uvx",
      "args": ["smart-webfetch-mcp"]
    }
  }
}
Usage Examples
Check before fetching
Use web_preflight to check https://docs.python.org/3/library/asyncio.html
Response:
{
  "url": "https://docs.python.org/3/library/asyncio.html",
  "estimated_tokens": 45000,
  "safe_for_context": false,
  "recommendation": "Very large page (~45,000 tokens). Use web_fetch_section or web_fetch_chunked."
}
Fetch with automatic truncation
Use web_smart_fetch on https://example.com/docs with max_tokens=4000
Extract only code examples
Use web_fetch_code on https://docs.python.org/3/library/asyncio-task.html
Get specific section
Use web_fetch_section on https://docs.python.org/3/library/asyncio.html
with heading="Running an asyncio Program"
Paginated reading
Use web_fetch_chunked on https://large-docs.com/api with chunk=0, chunk_size=4000
Then continue with chunk=1, chunk=2, etc.
Tool Reference
web_preflight
Check page metadata before fetching.
Parameters:
- url (required): URL to check
Returns:
- estimated_tokens: Approximate token count
- content_type: MIME type
- is_html: Whether content is HTML
- title: Page title (if HTML)
- safe_for_context: Boolean (true if < 8000 tokens)
- recommendation: Human-readable advice
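The relationship between estimated_tokens, safe_for_context, and recommendation can be sketched as below. This is only an illustration: the 4-characters-per-token heuristic, the function name preflight_from_text, and the wording of the "safe" message are assumptions, not the server's actual implementation. Only the 8000-token threshold comes from the reference above.

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4      # assumed heuristic, not the server's real estimator
SAFE_TOKEN_LIMIT = 8000  # threshold stated in the tool reference


@dataclass
class Preflight:
    estimated_tokens: int
    safe_for_context: bool
    recommendation: str


def preflight_from_text(text: str) -> Preflight:
    """Estimate tokens from character count and derive the safety flag."""
    tokens = len(text) // CHARS_PER_TOKEN
    safe = tokens < SAFE_TOKEN_LIMIT
    if safe:
        # Hypothetical wording for the small-page case
        advice = f"Small page (~{tokens:,} tokens). Safe to fetch with web_smart_fetch."
    else:
        advice = (
            f"Very large page (~{tokens:,} tokens). "
            "Use web_fetch_section or web_fetch_chunked."
        )
    return Preflight(tokens, safe, advice)
```

Under this heuristic, a page of roughly 180,000 characters would be estimated at 45,000 tokens and flagged as unsafe.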
web_smart_fetch
Fetch with automatic truncation for large pages.
Parameters:
- url (required): URL to fetch
- max_tokens (optional, default 8000): Maximum tokens to return
- strategy (optional, default "auto"): "auto" finds natural break points, "truncate" hard cuts
Returns: Markdown content with metadata header
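To illustrate the difference between the two strategies, here is a minimal sketch. It assumes that a "natural break point" means the last paragraph break that fits within the budget and that tokens are approximated as 4 characters each. Both are assumptions, and truncate_markdown is a hypothetical helper rather than part of the package.

```python
CHARS_PER_TOKEN = 4  # assumed heuristic for converting tokens to characters


def truncate_markdown(text: str, max_tokens: int = 8000, strategy: str = "auto") -> str:
    """Trim text to roughly max_tokens using the chosen strategy."""
    budget = max_tokens * CHARS_PER_TOKEN
    if len(text) <= budget:
        return text  # already within budget, nothing to cut
    cut = budget
    if strategy == "auto":
        # Prefer the last blank-line paragraph break inside the budget
        para = text.rfind("\n\n", 0, budget)
        if para > 0:
            cut = para
    # strategy == "truncate" (or no break found) falls through to a hard cut
    return text[:cut].rstrip()
```

With "auto" the output ends on a complete paragraph where possible, while "truncate" may stop mid-sentence but uses the full budget.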
web_fetch_code
Extract only code blocks from a page.
Parameters:
- url (required): URL to extract code from
Returns: Code blocks with language annotations and context
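One plausible way to pull code blocks out of a page that has already been converted to Markdown is a fenced-block regex, sketched below. The function name extract_code_blocks, the returned dictionary keys, and the "text" fallback label are illustrative assumptions; the server may work on the HTML directly instead.

```python
import re

TICKS = "`" * 3  # built this way to avoid a literal fence inside this example
FENCE = re.compile(
    rf"^{TICKS}([\w+-]*)[ \t]*\n(.*?)^{TICKS}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_code_blocks(markdown: str) -> list[dict[str, str]]:
    """Return each fenced block with its language tag ("text" if untagged)."""
    return [
        {"language": m.group(1) or "text", "code": m.group(2)}
        for m in FENCE.finditer(markdown)
    ]
```

This sketch ignores indented code blocks and tilde fences, which a fuller implementation would likely need to handle.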
web_fetch_section
Fetch content under a specific heading.
Parameters:
- url (required): URL to fetch from
- heading (required): Heading text to find (case-insensitive)
Returns: Section content or list of available sections if not found
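The behavior described here (case-insensitive match, with a fallback list of headings) might look roughly like the following sketch over Markdown text. It assumes ATX-style headings and exact title matching; find_section is a hypothetical name, and the real tool may match more loosely.

```python
import re
from typing import Union

# ATX headings only: one to six '#' characters followed by the title
HEADING = re.compile(r"^(#{1,6})[ \t]+(.*?)[ \t]*$", re.MULTILINE)


def find_section(markdown: str, heading: str) -> Union[str, list[str]]:
    """Return the section under `heading`, or all heading titles if absent."""
    matches = list(HEADING.finditer(markdown))
    wanted = heading.strip().casefold()
    for i, m in enumerate(matches):
        if m.group(2).casefold() != wanted:
            continue
        level = len(m.group(1))
        end = len(markdown)
        # The section runs until the next heading of the same or higher level
        for nxt in matches[i + 1:]:
            if len(nxt.group(1)) <= level:
                end = nxt.start()
                break
        return markdown[m.start():end].strip()
    return [m.group(2) for m in matches]
```

Nested subsections stay with their parent, so requesting a level-2 heading also returns the level-3 content beneath it.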
web_fetch_chunked
Fetch large documents in chunks.
Parameters:
- url (required): URL to fetch
- chunk (optional, default 0): Chunk index (0-based)
- chunk_size (optional, default 4000): Tokens per chunk
Returns: Chunk content with navigation metadata
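As a rough sketch of how 0-based chunk indexing and navigation metadata could fit together, consider the helper below. The metadata field names (total_chunks, has_next), the get_chunk name, and the 4-characters-per-token approximation are all assumptions made for illustration, not the tool's documented output.

```python
CHARS_PER_TOKEN = 4  # assumed heuristic for converting tokens to characters


def get_chunk(text: str, chunk: int = 0, chunk_size: int = 4000) -> dict:
    """Return one fixed-size slice of text plus hypothetical navigation fields."""
    span = chunk_size * CHARS_PER_TOKEN
    total = max(1, -(-len(text) // span))  # ceiling division, at least one chunk
    if not 0 <= chunk < total:
        raise IndexError(f"chunk {chunk} out of range (0..{total - 1})")
    start = chunk * span
    return {
        "chunk": chunk,
        "total_chunks": total,
        "has_next": chunk + 1 < total,
        "content": text[start:start + span],
    }
```

A caller would start at chunk=0 and keep incrementing the index while has_next is true.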
Development
# Clone and install dev dependencies
git clone https://github.com/mathisto/smart-webfetch-mcp
cd smart-webfetch-mcp
pip install -e ".[dev]"
# Run tests
pytest
# Format code
ruff format .
ruff check --fix .
License
MIT