Zero-dependency MCP server for unobstructed web reading: direct browser-header fetch past robots.txt/bot blocks, full-page HTML->Markdown with ad/boilerplate stripping, plus web search
NetLens
An MCP server for unobstructed web reading. It fetches any URL directly with
browser-like headers — past robots.txt and naive bot blocks — and returns the
full page as clean, ad-stripped Markdown, not a summary. Plus web search that
returns real links. Zero dependencies: pure Python standard library.
Built for AI Agents
AI agents constantly hit pages their built-in tools can't read. NetLens addresses the usual reasons a fetch comes back empty or useless:
| Native web tools | NetLens |
|---|---|
| Honor robots.txt, so crawler-disallowed pages return nothing | Reads like the browser you'd open yourself; doesn't consult robots.txt |
| Blocked by header/User-Agent bot filters (403/202 to non-browser clients) | Sends real browser headers via the system `curl`; commonly turns 403 → 200 |
| Return a summary of the page | Returns the full page content as Markdown |
| Leave ads, cookie banners, nav, and related-links chrome in the output | Strips boilerplate locally so only the content reaches your context |
It does not try to defeat JavaScript/Cloudflare challenge pages or CAPTCHAs — that's out of scope by design. When a page is a hard block, the HTTP status is surfaced honestly rather than faked.
Installation
npm (via npx):
{
  "mcpServers": {
    "netlens": {
      "command": "npx",
      "args": ["-y", "netlens-mcp"]
    }
  }
}
PyPI (via uvx):
{
  "mcpServers": {
    "netlens": {
      "command": "uvx",
      "args": ["netlens-mcp"]
    }
  }
}
Add either to your MCP client config (e.g. .mcp.json for Claude Code), then
restart the session so the tools load.
Tools
web_search
Search the web and return real result links (title, URL, snippet), parsed locally —
links, not summaries. Follow up with web_fetch to read a result.
| Argument | Type | Description |
|---|---|---|
| `query` | string (required) | The search query |
| `limit` | integer | Optional cap; default returns the full first page (~10) |
| `engine` | string | `auto` (default), `duckduckgo`, `bing`, `mojeek`, `searxng` |
A search fetches a single result page (~10 results), returned in full by default so nothing at position 9/10 is dropped. There's no deep pagination — if the answer isn't in the first page, refine the query.
web_fetch
Fetch any page and return its full content as clean Markdown.
| Argument | Type | Description |
|---|---|---|
| `url` | string (required) | URL to fetch (scheme optional; https assumed) |
| `mode` | string | `article` (main content only, default), `full` (whole body), `raw` (unconverted HTML) |
| `max_chars` | integer | Optional cap on returned characters (truncates with a note) |
Workflow: web_search to find pages, then web_fetch to read them.
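The search-then-fetch workflow can be sketched at the protocol level. The snippet below builds the two JSON-RPC `tools/call` requests an MCP client would send, using the argument names from the tables above. The `tool_call` helper, the request ids, and the chosen argument values are illustrative assumptions; in practice an MCP client library builds and sends these messages for you.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call(name, arguments):
    """Build an MCP `tools/call` request (hypothetical helper for illustration)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: find candidate pages.
search_req = tool_call("web_search", {"query": "best bg3 starting class", "limit": 5})

# Step 2: read one result in full, capped so it fits the context window.
fetch_req = tool_call(
    "web_fetch",
    {
        "url": "https://www.ign.com/wikis/baldurs-gate-3",
        "mode": "article",
        "max_chars": 20000,
    },
)

print(json.dumps(search_req, indent=2))
print(json.dumps(fetch_req, indent=2))
```

The URL passed to `web_fetch` would normally come from a result returned by the preceding `web_search` call.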
Search engines
Search is a pluggable, selectable registry. In auto mode NetLens tries engines in
order and returns the first with results, so a rate-limit/challenge page on one
falls through to the next.
| Engine | Notes |
|---|---|
| `duckduckgo` | Default; html.duckduckgo.com endpoint |
| `bing` | Automatic fallback |
| `mojeek` | Independent index; automatic fallback |
| `searxng` | Self-hosted/public SearXNG JSON API; set `NETLENS_SEARXNG_URL` |
Pick per call with the engine argument, or set a default with
NETLENS_SEARCH_ENGINE.
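The auto-mode fallthrough described in this section can be sketched as a loop over an ordered engine list. This is a minimal illustration, not NetLens's actual code: the `search_auto` function, the stub engines, and the exact engine order are assumptions, as is the choice to treat an engine that raises like one that returns no results.

```python
def search_auto(query, engines):
    """Try each (name, engine) pair in order; return the first non-empty result list."""
    for name, engine in engines:
        try:
            results = engine(query)
        except Exception:
            # Assumption: a failing engine is treated like an empty one.
            continue
        if results:
            return name, results
    return None, []


# Stub engines standing in for real backends (hypothetical, no network access).
def rate_limited(query):
    return []  # e.g. a challenge page that parsed to zero results


def broken(query):
    raise TimeoutError("engine did not respond")


def working(query):
    return [{"title": "Example", "url": "https://example.com", "snippet": query}]


ORDER = [("duckduckgo", rate_limited), ("bing", broken), ("mojeek", working)]
name, results = search_auto("test query", ORDER)
print(name, len(results))  # prints: mojeek 1
```

Returning the engine name alongside the results lets a caller see which backend actually answered.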
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| `NETLENS_SEARCH_ENGINE` | `auto` | Default search backend |
| `NETLENS_SEARXNG_URL` | (none) | SearXNG base URL for `engine=searxng` |
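How the per-call `engine` argument and the two environment variables might combine can be sketched as follows. The `resolve_engine` function is hypothetical; the precedence (argument first, then `NETLENS_SEARCH_ENGINE`, then `auto`) and the errors raised for bad input are assumptions, not documented NetLens behavior.

```python
import os

KNOWN_ENGINES = ("auto", "duckduckgo", "bing", "mojeek", "searxng")


def resolve_engine(engine_arg=None, env=None):
    """Pick the search backend: per-call argument, then environment, then 'auto'."""
    env = os.environ if env is None else env
    choice = engine_arg or env.get("NETLENS_SEARCH_ENGINE") or "auto"
    choice = choice.strip().lower()
    if choice not in KNOWN_ENGINES:
        raise ValueError(f"unknown search engine: {choice!r}")
    # Assumption: searxng cannot work without a base URL, so fail early.
    if choice == "searxng" and not env.get("NETLENS_SEARXNG_URL"):
        raise ValueError("engine=searxng requires NETLENS_SEARXNG_URL")
    return choice


print(resolve_engine())  # uses the real environment
print(resolve_engine("bing", env={}))
print(resolve_engine(env={"NETLENS_SEARCH_ENGINE": "mojeek"}))
```

Passing `env` explicitly keeps the function easy to test without touching the process environment.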
How it works
- Direct fetch. Requests go straight to the target site via the system `curl` (better TLS/HTTP-2/compression, so it looks like a real browser), falling back to `urllib`. No third-party proxy or reader is involved.
- Local conversion. HTML → Markdown happens in-process with a hand-rolled `html.parser` converter: headings, lists, links (relative URLs resolved), code blocks, and GFM tables with colspan/rowspan.
- Boilerplate stripping. Ads, cookie/consent banners, nav, footers, sidebars, social/share and related/recommended widgets, and hidden elements are removed. In `article` mode NetLens also isolates the main content region (`<main>`/`<article>`/`[role=main]`).
- Response charset is honored (from `Content-Type` or `<meta>`), so non-UTF-8 pages don't come back garbled.
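To make the local conversion and boilerplate stripping concrete, here is a deliberately small sketch built on the standard library's `html.parser`. It is not NetLens's converter: the `MiniMarkdown` class, the `to_markdown` function, and the short `SKIP_TAGS` list are assumptions, and it handles only headings, paragraphs, list items, and links with relative URLs resolved.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tags whose whole subtree is dropped (an assumed, simplified boilerplate list).
SKIP_TAGS = {"script", "style", "nav", "footer", "aside"}
HEADINGS = {"h1": "#", "h2": "##", "h3": "###"}


class MiniMarkdown(HTMLParser):
    """Tiny HTML -> Markdown converter for headings, paragraphs, lists, links."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.out = []        # output fragments, joined at the end
        self.skip_depth = 0  # > 0 while inside a skipped subtree
        self.href = None     # resolved target of the currently open link

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        if self.skip_depth:
            return
        if tag in HEADINGS:
            self.out.append("\n\n" + HEADINGS[tag] + " ")
        elif tag == "p":
            self.out.append("\n\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.href = urljoin(self.base_url, href)
                self.out.append("[")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth:
            return
        if tag == "a" and self.href is not None:
            self.out.append("](" + self.href + ")")
            self.href = None

    def handle_data(self, data):
        if not self.skip_depth:
            # Collapse whitespace runs but keep word-separating spaces.
            self.out.append(re.sub(r"\s+", " ", data))


def to_markdown(html, base_url):
    parser = MiniMarkdown(base_url)
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.out).splitlines()]
    kept = []
    for line in lines:
        # Drop leading blank lines and collapse runs of blank lines.
        if line or (kept and kept[-1]):
            kept.append(line)
    return "\n".join(kept).strip()


page = (
    '<nav><a href="/home">Home</a></nav><h1>Title</h1>'
    '<p>Read the <a href="/docs">docs</a> now.</p>'
    "<ul><li>one</li><li>two</li></ul><script>x()</script>"
)
print(to_markdown(page, "https://example.com/a/"))
```

Tracking a skip depth rather than a boolean keeps nested boilerplate containers (for example a `<nav>` inside a `<footer>`) from re-enabling output too early.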
Usage from the CLI
The server is also a plain script — handy for testing before a client loads it:
python -m netlens_mcp.server search "best bg3 starting class"
python -m netlens_mcp.server fetch https://www.ign.com/wikis/baldurs-gate-3
python -m netlens_mcp.server full https://example.com # whole body
python -m netlens_mcp.server raw https://example.com # unconverted HTML
python -m netlens_mcp runs the stdio MCP server; python -m netlens_mcp.server <cmd> runs the CLI.
Development
pip install -e ".[dev]"
python -m pytest # run the test suite
ruff check . # lint
Requirements
- Python 3.10+ (and the system `curl`, which ships with modern Windows/macOS/Linux; falls back to `urllib` if absent)
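The curl-first, `urllib`-fallback requirement above can be sketched with the standard library alone. The header values, the `build_curl_command` and `fetch` names, and the particular curl flags chosen are assumptions for illustration; NetLens's real header set and invocation may differ.

```python
import shutil
import subprocess
import urllib.request

# Assumed browser-like headers; the real set used by NetLens is not documented here.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_curl_command(url, timeout=20):
    """Assemble a curl invocation: silent, follow redirects, accept compression."""
    cmd = ["curl", "-sS", "-L", "--compressed", "--max-time", str(timeout)]
    for name, value in BROWSER_HEADERS.items():
        cmd += ["-H", f"{name}: {value}"]
    cmd.append(url)
    return cmd


def fetch(url, timeout=20):
    """Return the response body as bytes, preferring the system curl."""
    if shutil.which("curl"):
        done = subprocess.run(
            build_curl_command(url, timeout), capture_output=True, check=True
        )
        return done.stdout
    # Fallback when curl is absent: same headers through urllib.
    request = urllib.request.Request(url, headers=BROWSER_HEADERS)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


print(" ".join(build_curl_command("https://example.com")[:6]))
```

Calling `fetch("https://example.com")` performs a real network request, so the example only prints the start of the command it would run.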
License