MCP server for AI agents: high-fidelity web search (Exa) + tiered web fetch (Exa → optional local browser → Firecrawl) with an SSRF guard. A drop-in replacement for built-in WebSearch/WebFetch.

web-retrieval-mcp — MCP web search & web fetch for AI agents (Exa + Firecrawl)

web-retrieval-mcp is an open-source Model Context Protocol (MCP) server that gives AI agents two web tools — neural web search (Exa) and a tiered web fetch (Exa → optional local browser → Firecrawl) — as a drop-in replacement for built-in WebSearch/WebFetch. It preserves per-source provenance, guards against SSRF, runs cross-platform (macOS/Linux/Windows), and works with Claude Code, Claude Desktop, Cursor, and any MCP client. Runs on free API tiers.

Why replace the built-in web tools?

An agent's stock WebSearch / WebFetch tend to flatten many sources into one blurry summary, drop provenance, and silently fail on JavaScript-heavy or anti-bot pages. This server fixes that:

| | Built-in web tools | web-retrieval-mcp |
|---|---|---|
| Search results | One merged summary, sources conflated | One block per result: each keeps its own title, URL, highlights, and text, plus a `Sources` trailer |
| Fetch reliability | Single attempt, gives up on hard pages | Tiered fallback: Exa contents → optional local browser → Firecrawl, with a `[served by: …]` provenance header |
| JS / anti-bot pages | Usually fails | Opt-in real headless browser (camoufox) on demand |
| Safety | - | SSRF guard rejects loopback / private / link-local / multicast hosts before any request |
| Cost | Bundled / metered by your model vendor | Free on Exa + Firecrawl free tiers (see below) |

Runs on free API tiers — and the free tiers are more than enough

Both providers have a genuinely usable free, no-credit-card tier, and because fetches hit Exa first (Firecrawl is only the fallback), a single developer or agent rarely touches the Firecrawl quota at all:

| Provider | Free tier (verified 2026) | Role in this server |
|---|---|---|
| Exa | 1,000 requests / month, no card | Powers `web_search` and the first `web_fetch` tier |
| Firecrawl | 1,000 pages / month, no card | Fallback fetch tier only; rarely reached |
| camoufox (local browser) | Unlimited and free; runs on your machine | Opt-in `render="always"` tier for JS/anti-bot pages |

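For a personal agent, that works out to roughly 33 Exa requests per day (searches and first-tier fetches draw on the same Exa quota) plus roughly 33 Firecrawl fallback fetches per day, indefinitely, for $0/month. Heavy production workloads can upgrade either provider independently; the tiering and code don't change.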
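The per-day figure is simply the monthly quota spread over a month. A minimal sketch of that arithmetic, assuming a 30-day month (the `daily_budget` helper is illustrative and not part of the package):

```python
def daily_budget(monthly_quota: int, days_in_month: int = 30) -> int:
    """Whole requests available per day if the quota is spread evenly."""
    return monthly_quota // days_in_month


# Free-tier figures quoted in the provider table.
FREE_TIERS = {"Exa": 1000, "Firecrawl": 1000}

for provider, quota in FREE_TIERS.items():
    print(f"{provider}: about {daily_budget(quota)} requests/day")
```

Both providers come out at about 33 per day; in a 31-day month the even split drops to 32.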

Features

🔎 web_search — neural / keyword / auto search via Exa, one provenance-preserving block per result.
🌐 web_fetch — single-URL readable content through a resilient tier chain with provenance headers.
🧱 Tiered fallback — Exa contents → (opt-in) local camoufox browser → Firecrawl, so hard pages still resolve.
🛡️ SSRF guard — non-public hosts (loopback, RFC-1918, link-local, multicast, NAT64) are refused up front.
🔑 Cross-platform secrets — env vars, a key file, the keyring library, or an OS secret tool. No keys on the command line.
🚫 Hook to disable the built-ins — bundled PreToolUse hook + one-command installer so agents must use these tools.
📦 One-command install — uvx, pipx, or pip; ships two console scripts.

Quickstart

Install straight from GitHub (works today — see Publishing for the PyPI status):

```
# Run with no install: uvx fetches and runs it on demand
uvx --from git+https://github.com/VelvetSP/web-retrieval-mcp web-retrieval-mcp

# Or install the CLI (isolated, recommended):
pipx install git+https://github.com/VelvetSP/web-retrieval-mcp

# Once published to PyPI this shortens to:  pipx install web-retrieval-mcp

# Optional extras (append to the pipx/pip target):
pipx install "git+https://github.com/VelvetSP/web-retrieval-mcp#egg=web-retrieval-mcp[render]"   # local browser tier
pip  install "web-retrieval-mcp[keyring]"   # cross-platform native secret store (once on PyPI)
python -m camoufox fetch                    # one-time browser download (only if you use [render])
```

Get free API keys: Exa → https://exa.ai · Firecrawl → https://firecrawl.dev — then:

```
export EXA_API_KEY="exa-..."
export FIRECRAWL_API_KEY="fc-..."
```

Register with Claude Code

```
# After `pipx install …` above puts `web-retrieval-mcp` on your PATH:
claude mcp add web-retrieval -- web-retrieval-mcp

# Or with no prior install, straight from GitHub via uvx:
claude mcp add web-retrieval -- uvx --from git+https://github.com/VelvetSP/web-retrieval-mcp web-retrieval-mcp
```

Register with Claude Desktop / any MCP client

```
{
  "mcpServers": {
    "web-retrieval": {
      "command": "web-retrieval-mcp",
      "env": {
        "EXA_API_KEY": "exa-...",
        "FIRECRAWL_API_KEY": "fc-..."
      }
    }
  }
}
```

The `command` value above assumes `web-retrieval-mcp` is on your PATH (after `pipx install`). Otherwise, set `command` to `uvx` and add `args`: ["--from", "git+https://github.com/VelvetSP/web-retrieval-mcp", "web-retrieval-mcp"].
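To make the two variants concrete, here is a small sketch that generates either form of the client config. The `mcp_server_entry` helper is hypothetical and only assembles the JSON shape described in this section; the key values are placeholders.

```python
import json

REPO = "git+https://github.com/VelvetSP/web-retrieval-mcp"


def mcp_server_entry(use_uvx: bool, exa_key: str, firecrawl_key: str) -> dict:
    """Build the mcpServers config for either the PATH or the uvx variant."""
    if use_uvx:
        # No prior install: uvx fetches the package from GitHub on demand.
        entry = {"command": "uvx", "args": ["--from", REPO, "web-retrieval-mcp"]}
    else:
        # Assumes `web-retrieval-mcp` is already on PATH.
        entry = {"command": "web-retrieval-mcp"}
    entry["env"] = {"EXA_API_KEY": exa_key, "FIRECRAWL_API_KEY": firecrawl_key}
    return {"mcpServers": {"web-retrieval": entry}}


print(json.dumps(mcp_server_entry(True, "exa-...", "fc-..."), indent=2))
```

Paste the printed JSON into your MCP client's config file, replacing the placeholder keys.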

Tools

| Tool | Signature | What it returns |
|---|---|---|
| `web_search` | `web_search(query, num_results=8, mode="auto")` | Neural web search via Exa. One block per result, each with its own title, URL, published date, highlights, and text, plus a `Sources` list. `mode` ∈ `auto` \| `neural` \| `keyword`. |
| `web_fetch` | `web_fetch(url, render="auto", max_chars=20000, max_age_hours=None)` | One URL's readable content through the tier chain, with a `[served by: …]` provenance header. |
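Outside an agent, the tools can be exercised from any MCP client. The following is a minimal sketch using the official `mcp` Python SDK over stdio, assuming the SDK is installed and `web-retrieval-mcp` is on PATH. The helper names `build_search_args` and `run_search` are illustrative, not part of this package.

```python
import asyncio
import os

VALID_MODES = {"auto", "neural", "keyword"}


def build_search_args(query: str, num_results: int = 8, mode: str = "auto") -> dict:
    """Build the arguments dict for web_search, mirroring the signature in the table."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    return {"query": query, "num_results": num_results, "mode": mode}


async def run_search(query: str) -> None:
    # Imported lazily so the helper above works without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # The SDK may not forward your full environment to the child process,
    # so pass the API keys (and PATH) explicitly.
    wanted = ("EXA_API_KEY", "FIRECRAWL_API_KEY", "PATH")
    env = {name: os.environ[name] for name in wanted if name in os.environ}
    server = StdioServerParameters(command="web-retrieval-mcp", args=[], env=env)

    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "web_search", arguments=build_search_args(query, num_results=3)
            )
            for item in result.content:
                if item.type == "text":
                    print(item.text)


# To try it against a running install:
# asyncio.run(run_search("model context protocol specification"))
```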

`web_fetch` details

Fetch one URL's readable content through the tier chain, returned with a [served by: …] header.

```
render="auto"   (default) →  Exa /contents  →  Firecrawl                 # no local browser
render="never"            →  Exa /contents  →  Firecrawl                 # same, explicit
render="always"           →  camoufox (local browser)  →  Firecrawl      # for JS / anti-bot pages
```

max_age_hours controls Exa's freshness window (0 = force fresh; None = Exa default cache).
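The tier selection can be pictured as a lookup followed by a try-in-order loop. This is a sketch of the idea only, not the server's code: the tier labels, the `fetch_with_fallback` helper, and the simple `max_chars` truncation are all assumptions made for illustration.

```python
from typing import Callable, Optional

# Tier labels are illustrative names, not the strings the server prints.
CHAINS = {
    "auto": ["exa_contents", "firecrawl"],
    "never": ["exa_contents", "firecrawl"],
    "always": ["camoufox", "firecrawl"],
}

Fetcher = Callable[[str], Optional[str]]


def tier_chain(render: str = "auto") -> list[str]:
    """Return the ordered tiers for a render mode."""
    if render not in CHAINS:
        raise ValueError(f"render must be one of {sorted(CHAINS)}, got {render!r}")
    return list(CHAINS[render])


def fetch_with_fallback(
    url: str, render: str, fetchers: dict[str, Fetcher], max_chars: int = 20000
) -> tuple[str, str]:
    """Try each tier in order; return (tier, text) from the first that succeeds."""
    errors = []
    for tier in tier_chain(render):
        try:
            text = fetchers[tier](url)
        except Exception as exc:  # a failed tier falls through to the next one
            errors.append(f"{tier}: {exc}")
            continue
        if text:
            return tier, text[:max_chars]
        errors.append(f"{tier}: empty result")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))


# Stub fetchers stand in for the real providers.
def failing(url: str) -> str:
    raise TimeoutError("simulated timeout")


stubs = {"exa_contents": failing, "firecrawl": lambda url: "page text"}
print(fetch_with_fallback("https://example.com", "auto", stubs))  # ('firecrawl', 'page text')
```

Returning the tier name alongside the text is what makes a provenance header possible: the caller always knows which backend served the content.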

Cross-platform API keys

Keys are resolved in-process (never on the command line, which is visible via ps), cheapest/safest source first — the same code path on macOS, Linux, and Windows:

1. Environment variables: EXA_API_KEY, FIRECRAWL_API_KEY. Universal; required for headless / CI.
2. Key file: a dotenv-style KEY=value file at $WEB_RETRIEVAL_MCP_ENV_FILE or <config-dir>/keys.env (~/.config/web-retrieval-mcp/ on Linux/macOS, %APPDATA%\web-retrieval-mcp\ on Windows).
3. keyring library: the native store on each OS (macOS Keychain, Windows Credential Locker, Linux Secret Service / KWallet). Install the [keyring] extra, then store under service web-retrieval-mcp:
   ```
   keyring set web-retrieval-mcp EXA_API_KEY
   keyring set web-retrieval-mcp FIRECRAWL_API_KEY
   ```
4. OS-native secret CLI: macOS security, Linux secret-tool (libsecret), if present.

A value that is still an unexpanded ${...} config literal is treated as absent.

Block the built-in web tools (Claude Code)

So agents and subagents can't silently fall back to the lower-fidelity built-ins, this repo ships a PreToolUse hook that denies WebSearch / WebFetch and points the agent here. Install it idempotently:

```
web-retrieval-mcp-install              # patch ~/.claude/settings.json (backs it up first)
web-retrieval-mcp-install --print      # preview only, write nothing
web-retrieval-mcp-install --register-mcp   # also run `claude mcp add`
web-retrieval-mcp-install --uninstall  # remove the hook
```

Break-glass: touch ~/.claude/.web-builtins-allow re-enables the built-ins for the session; remove the file to re-arm. The hook is pure POSIX sh (no jq).
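The bundled hook is a POSIX sh script, and its source is not reproduced here. The Python sketch below only illustrates the decision it makes, under two assumptions about Claude Code's hook interface: the PreToolUse event arrives as JSON on stdin with a `tool_name` field, and exit code 2 blocks the call while stderr is shown to the agent. The `decide` function and the message wording are hypothetical.

```python
import json
import sys
from pathlib import Path

BLOCKED_TOOLS = {"WebSearch", "WebFetch"}
ALLOW_FILE = Path.home() / ".claude" / ".web-builtins-allow"


def decide(payload: dict, allow_file_exists: bool) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse event."""
    tool = payload.get("tool_name", "")
    if tool not in BLOCKED_TOOLS or allow_file_exists:
        return 0, ""  # not a built-in web tool, or break-glass is active
    message = f"{tool} is disabled; use the web-retrieval MCP tools (web_search / web_fetch) instead."
    return 2, message


def main() -> int:
    payload = json.load(sys.stdin)
    code, message = decide(payload, ALLOW_FILE.exists())
    if message:
        print(message, file=sys.stderr)
    return code


# When used as a hook command, the script would end with: sys.exit(main())
```

Checking the allow file on every event is what makes the break-glass immediate: creating or removing the file takes effect on the next tool call.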

Security — SSRF

web_fetch validates every URL before making any request: it refuses non-http(s) schemes and any host that resolves to a non-public IP (loopback, private/RFC-1918, link-local, reserved/NAT64, multicast). The only tier that runs a real browser on your machine (camoufox) is opt-in (render="always"), so the default auto/never path never runs it. Residual risk: the camoufox tier follows redirects, so the up-front check covers only the initial URL; closing that gap fully would need a validating forward proxy.
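A guard of this kind can be built from the standard library alone. The sketch below is one way to do it, assuming a resolve-then-classify approach; `is_public_ip` and `check_url` are illustrative names, and the server's actual checks may differ.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# NAT64 well-known prefix, checked explicitly because it embeds IPv4 addresses.
NAT64 = ipaddress.ip_network("64:ff9b::/96")


def is_public_ip(text: str) -> bool:
    """True only for addresses outside every non-public range."""
    ip = ipaddress.ip_address(text.split("%", 1)[0])  # drop any IPv6 zone id
    if ip.version == 6:
        if ip in NAT64:
            return False
        if ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped  # classify ::ffff:a.b.c.d as the IPv4 address
    return not (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def check_url(url: str) -> tuple[bool, str]:
    """Return (allowed, reason); every resolved address must be public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False, f"scheme {parts.scheme!r} is not http(s)"
    if not parts.hostname:
        return False, "URL has no host"
    try:
        infos = socket.getaddrinfo(parts.hostname, parts.port or 80, type=socket.SOCK_STREAM)
    except (socket.gaierror, ValueError) as exc:
        return False, f"could not resolve host: {exc}"
    for info in infos:
        address = info[4][0]
        if not is_public_ip(address):
            return False, f"{parts.hostname} resolves to non-public address {address}"
    return True, "ok"


print(check_url("http://169.254.169.254/latest/meta-data/"))
print(check_url("ftp://example.com/file"))
```

Rejecting the URL if any resolved address is non-public is the conservative choice, since a hostname can map to several addresses. A check like this still runs before the request, so it shares the redirect limitation described above.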

FAQ

What is web-retrieval-mcp? An open-source MCP (Model Context Protocol) server that gives AI agents two web tools — web_search (Exa) and web_fetch (Exa → local browser → Firecrawl) — as a drop-in replacement for built-in web access, with provenance preservation and an SSRF guard.

Does it work with Claude Code? Yes. Register with claude mcp add web-retrieval -- web-retrieval-mcp, and optionally install the bundled hook so the built-in WebSearch/WebFetch are disabled in favor of these tools.

Is it free? Yes. The code is MIT-licensed, and it runs on the free tiers of Exa (1,000 requests/month) and Firecrawl (1,000 pages/month), neither of which requires a credit card. The local browser tier is free and unlimited.

Which platforms are supported? macOS, Linux, and Windows. Key resolution and the server are cross-platform; the local browser tier needs the optional [render] extra.

How is it better than built-in WebSearch/WebFetch? It returns one result block per source (no conflated summaries), preserves provenance, falls back across multiple fetch backends so hard/JS pages still resolve, and guards against SSRF.

Do I need the browser stack? No. Search and the default fetch path need only mcp + anyio. The camoufox/playwright browser is the optional [render] extra, used only for render="always".

Publishing

Status: distributed from GitHub today; not yet on PyPI, so pip install web-retrieval-mcp / uvx web-retrieval-mcp (the short forms) don't resolve yet — use the git-install commands in Quickstart.

To publish and make the package discoverable to agents, do these in order (see PUBLISHING.md for the full runbook):

1. PyPI: run python -m build, then uv publish (or twine upload dist/*). This unlocks the short pip install web-retrieval-mcp / uvx web-retrieval-mcp forms.
2. Official MCP Registry (registry.modelcontextprotocol.io): the one high-leverage listing; aggregators (PulseMCP, Glama, mcp.so, Smithery) ingest from it. Publish server.json with mcp-publisher (GitHub OAuth, namespace io.github.velvetsp/...). This step is gated on the PyPI release.
3. Tag a release: git tag v0.1.0 && git push --tags, then cut a GitHub Release (release pages are indexed by Google and add a freshness signal).

Contributing

Issues and PRs welcome at https://github.com/VelvetSP/web-retrieval-mcp. The server is a single module (src/web_retrieval_mcp/server.py); stdout is JSON-RPC only — keep all diagnostics on stderr.

License

MIT © VelvetSP

Keywords: MCP server, Model Context Protocol, AI agent web search, LLM web fetch, Exa API, Firecrawl API, camoufox, Claude Code MCP, web scraping for agents, RAG retrieval, SSRF-safe fetch, cross-platform, free web search API.

Project details

Release history

0.1.2

Jun 2, 2026

0.1.1

Jun 2, 2026

This version

0.1.0

Jun 2, 2026

Download files

Source Distribution

web_retrieval_mcp-0.1.0.tar.gz (17.8 kB)

Uploaded Jun 2, 2026 Source

Built Distribution

web_retrieval_mcp-0.1.0-py3-none-any.whl (18.8 kB)

Uploaded Jun 2, 2026 Python 3

File details

Details for the file web_retrieval_mcp-0.1.0.tar.gz.

File metadata

Download URL: web_retrieval_mcp-0.1.0.tar.gz
Upload date: Jun 2, 2026
Size: 17.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for web_retrieval_mcp-0.1.0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | `065515e95fe3565228cf7bf0a4ce2f7d3bff33581c8e2bf7e9692c013cb2191c` |
| MD5 | `30d06a0cefdb3a99d0cfbfd7e19e6ce0` |
| BLAKE2b-256 | `74b3f8c9cd4f6ac2dc9496aa506006534bc34b946ce9f42bebba4510960359da` |
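A downloaded archive can be checked against the SHA256 digest listed for the source distribution. This is a minimal sketch using only `hashlib`; the `sha256_of` and `verify` helpers are illustrative, and the file path in the final comment assumes the archive sits in the current directory.

```python
import hashlib

# SHA256 digest listed for web_retrieval_mcp-0.1.0.tar.gz.
EXPECTED_SHA256 = "065515e95fe3565228cf7bf0a4ce2f7d3bff33581c8e2bf7e9692c013cb2191c"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.strip().lower()


# Example: verify("web_retrieval_mcp-0.1.0.tar.gz", EXPECTED_SHA256)
```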

File details

Details for the file web_retrieval_mcp-0.1.0-py3-none-any.whl.

File metadata

Download URL: web_retrieval_mcp-0.1.0-py3-none-any.whl
Upload date: Jun 2, 2026
Size: 18.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for web_retrieval_mcp-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | `9eea3538e689168db62589c395a11d5d814e77f56cf7185d37a67f378a278214` |
| MD5 | `f9a84b244fc0c13578ce14c895d07d19` |
| BLAKE2b-256 | `13344d0d98bc1d44cff94672fce1d47c4eb03c049c959e866663714d580e3dca` |
