www-search-mcp

MCP server for web search, HTTP fetch, browser fetch (Playwright), file download, API requests, and package search (PyPI, GitHub).

Optimized for batching: every tool accepts lists (multi-query / multi-URL) to reduce round-trips.

Quick Install

# Run without installing (recommended)
uvx www-search-mcp

# Or install as a global tool
uv tool install www-search-mcp

VS Code / Cursor config:

{
  "mcpServers": {
    "www-search": {
      "command": "uvx",
      "args": ["www-search-mcp"]
    }
  }
}
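The variables listed under Configuration can often be supplied through the client config as well. The sketch below generates such an entry in Python. It assumes the client honors an `env` key on the server entry (many MCP clients do, but check yours); `build_config` is a hypothetical helper and not part of www-search-mcp.

```python
from __future__ import annotations

import json


def build_config(env: dict[str, str] | None = None) -> dict:
    """Return an mcpServers entry that launches www-search-mcp via uvx."""
    server: dict = {"command": "uvx", "args": ["www-search-mcp"]}
    if env:
        # Assumption: the client forwards "env" to the server process.
        server["env"] = dict(env)
    return {"mcpServers": {"www-search": server}}


config = build_config({"WEB_DEBUG": "true", "WEB_MAX_RESULTS": "10"})
print(json.dumps(config, indent=2))
```

Environment values are kept as strings because that is how the process receives them.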

Tools Overview

Tool               What it does
web_search         General web search (DuckDuckGo)
web_search_images  Image search
web_search_github  GitHub repo search
web_search_pypi    PyPI package search
web_fetch          Fetch URL as Markdown
web_fetch_browser  Browser-rendered fetch (JS sites)
web_download       Download files to disk
web_request        REST/GraphQL API calls + load tests
web_mcp_info       Server config and tool docs
web_mcp_status     Real-time diagnostics

Batching: pass queries=["q1", "q2"] or urls=["url1", "url2"] instead of calling one-by-one.
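As a rough illustration of the batching idea, the sketch below builds a single arguments dict that covers several queries. `as_list` and `search_args` are hypothetical client-side helpers; only the parameter names `queries` and `max_results` come from the tool signatures in this README.

```python
from __future__ import annotations


def as_list(value: str | list[str]) -> list[str]:
    """Normalize a single string or a list of strings to a list."""
    return [value] if isinstance(value, str) else list(value)


def search_args(queries: str | list[str], max_results: int = 5) -> dict:
    """Build one arguments dict for web_search covering every query."""
    return {"queries": as_list(queries), "max_results": max_results}


# One tool call with three queries instead of three separate calls.
args = search_args(
    ["python asyncio", "mcp protocol", "playwright docs"], max_results=3
)
print(args)
```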


Search Tools

web_search

queries: str | list[str]
max_results: int = 5

Returns title, url, and snippet for each result. Safe search is disabled.

web_search_images

queries: str | list[str]
max_results: int = 5

Returns title, image, thumbnail, height, width, source.

web_search_github

queries: str | list[str]
max_results: int = 5

Returns title, url, stars, forks, and language. Uses the GitHub REST API, which is rate-limited to about 10 requests per minute without a token; set WEB_GITHUB_TOKEN to raise the limit.

web_search_pypi

queries: str | list[str]
max_results: int = 5

Returns name, version, summary, author, license, requires_python, dependencies.


Fetch & Download Tools

web_fetch

urls: str | list[str]
fetch_div: str = ""          # CSS selector (e.g. "article")
save_file: str = ""         # Absolute path to save
use_session: bool = False   # Reuse cookies

Returns Markdown body. Rejects binary content — use web_download for files.

web_fetch_browser

urls: str | list[str]
fetch_div: str = ""
save_file: str = ""
headless: bool = True
wait_seconds: int = 0
use_session: bool = False

Same output as web_fetch but renders JS. Use for SPAs, login walls, bot-blocked sites.

web_download

urls: str | list[str]
save_files: str | list[str]  # Required target path(s)
use_session: bool = False

Returns saved_to, bytes, content_type.
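A minimal sketch of preparing a batched web_download call. It assumes that urls and save_files are matched by position, which this README does not state explicitly; `download_args`, the example URLs, and the target paths are all hypothetical.

```python
def download_args(pairs: list[tuple[str, str]], use_session: bool = False) -> dict:
    """Build web_download arguments from (url, target_path) pairs."""
    if not pairs:
        raise ValueError("at least one (url, path) pair is required")
    # Assumption: the i-th URL is saved to the i-th path.
    urls = [url for url, _ in pairs]
    save_files = [path for _, path in pairs]
    return {"urls": urls, "save_files": save_files, "use_session": use_session}


args = download_args(
    [
        ("https://example.com/report.pdf", "/tmp/report.pdf"),
        ("https://example.com/data.csv", "/tmp/data.csv"),
    ]
)
print(args)
```

Building both lists from pairs keeps them the same length by construction.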


API Tool

web_request

queries: dict | list[dict]

Spec fields per query:

  • type: "rest" | "graphql"
  • method: "GET" | "POST" | ...
  • url: target URL
  • headers: optional dict
  • requests: repeat count (default 1)
  • concurrency: async workers (default 1)
  • time: run duration in seconds (0 = send a fixed count of requests instead)

Type-specific fields:

  • REST: body: dict | list | str
  • GraphQL: query: str, variables: dict, operationName: str

Auth: WEB_REQUEST_TOKEN is used as the default Authorization value when headers does not provide one.

Output: status_counts, http_status_counts, and latency_ms percentiles. Small runs (3 requests or fewer) also include response samples.
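The spec fields above could be assembled as follows. This is a sketch, not the server's own code: `rest_spec`, `graphql_spec`, and the api.example.com URLs are hypothetical, while the dict keys are the field names listed in this section.

```python
from __future__ import annotations


def rest_spec(method: str, url: str, body=None, headers: dict | None = None,
              requests: int = 1, concurrency: int = 1, time: int = 0) -> dict:
    """Build one REST query spec for web_request."""
    spec = {"type": "rest", "method": method.upper(), "url": url,
            "requests": requests, "concurrency": concurrency, "time": time}
    if body is not None:
        spec["body"] = body
    if headers:
        spec["headers"] = headers
    return spec


def graphql_spec(url: str, query: str, variables: dict | None = None,
                 operation_name: str | None = None) -> dict:
    """Build one GraphQL query spec for web_request."""
    spec = {"type": "graphql", "method": "POST", "url": url, "query": query}
    if variables:
        spec["variables"] = variables
    if operation_name:
        spec["operationName"] = operation_name
    return spec


queries = [
    # Small load test: 100 requests spread over 10 workers.
    rest_spec("get", "https://api.example.com/health", requests=100, concurrency=10),
    graphql_spec("https://api.example.com/graphql",
                 "query User($id: ID!) { user(id: $id) { name } }",
                 variables={"id": "42"}, operation_name="User"),
]
print(queries)
```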


Diagnostic Tools

web_mcp_info

Server configuration, tool descriptions, environment variables.

web_mcp_status

Real-time diagnostics:

  • uptime_seconds, pid, python_version
  • throttle: last request, interval
  • sessions: total, with browser, stale
  • connections: niquests version, pool size
  • resources: FD limit, memory RSS, event loop tasks
  • counters: requests, errors, timeouts
  • config: transport, timeouts
  • health: DDGS, Playwright availability
  • metrics: latency percentiles (p50/p95/p99), subtask stats
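For readers unfamiliar with the p50/p95/p99 figures, here is one common way to compute them, the nearest-rank method. How the server actually derives its percentiles is not documented here, so treat this as an illustration only; `percentile` and the sample latencies are made up for the example.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


latencies_ms = [12.0, 15.0, 11.0, 240.0, 14.0, 13.0, 18.0, 16.0, 17.0, 19.0]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)
```

With only ten samples, the single slow request dominates p95 and p99, which is why tail percentiles are more informative on larger runs.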

Diagnostic HTTP Routes

When running in HTTP transport mode (WEB_TRANSPORT=streamable-http):

  • GET /healthz — Health check
  • GET /readyz — Readiness probe (200/503)
  • GET /status — Same JSON as web_mcp_status
  • GET /tasks — Active event loop tasks (debugging)
  • GET /memory — Memory breakdown (RSS, arenas)
  • GET /error-types — Error hierarchy for client introspection
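A readiness check against these routes might look like the sketch below, using only the standard library. The base URL combines the default WEB_HTTP_HOST and WEB_HTTP_PORT values; `readiness_from_status` and `check_ready` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def readiness_from_status(status: int) -> bool:
    """Interpret a /readyz status code: 200 means ready, anything else (e.g. 503) not ready."""
    return status == 200


def check_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/readyz answers 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/readyz", timeout=timeout) as resp:
            return readiness_from_status(resp.status)
    except urllib.error.HTTPError as exc:
        # urlopen raises for non-2xx responses such as 503.
        return readiness_from_status(exc.code)
    except (urllib.error.URLError, OSError):
        # Server not reachable at all.
        return False


# Example (requires a server running in HTTP mode):
# print(check_ready("http://127.0.0.1:8000"))
```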

Configuration

Environment Variables

Variable               Default     Description
WEB_TIMEOUT_TOTAL      30          Total timeout (sec)
WEB_TIMEOUT_CONNECT    5           Connect timeout (sec)
WEB_TIMEOUT_READ       25          Read timeout (sec)
WEB_MAX_RESULTS        5           Default search results
WEB_REQUEST_LIMIT      50          Max concurrent requests
WEB_MIN_INTERVAL       1.0         Throttle gap (sec)
WEB_RETRIES            2           Retry attempts
WEB_MAX_FETCH_CHARS    200000      Max fetch body length
WEB_MAX_DOWNLOAD_MB    50          Max download size
WEB_DEBUG              false       Debug logging
WEB_LOG_FORMAT         text        text or json
WEB_SESSION_ENABLED    false       Persistent cookies
WEB_TRANSPORT          stdio       stdio or streamable-http
WEB_HTTP_HOST          127.0.0.1   HTTP bind host
WEB_HTTP_PORT          8000        HTTP bind port
WEB_MCP_IDLE_LIFETIME  300         stdio idle timeout (sec), 0 to disable
WEB_DNS_RESOLVER       system      google, cloudflare, yandex, quad9, system
WEB_DNS_STRATEGY       (none)      only_ipv4, only_ipv6, or dual-stack
WEB_PROXY              (none)      HTTP proxy URL
WEB_SSL_VERIFY         true        TLS verification
WEB_SSL_PATH           (none)      Extra CA certs
WEB_USER_AGENT         Chrome 135  Custom UA
WEB_UA_ROTATION        false       Rotate UA pool
WEB_UA_LIST            (none)      Custom UA list (||| separated)
WEB_GITHUB_TOKEN       (none)      GitHub API token
WEB_REQUEST_TOKEN      (none)      Default auth token
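To make the value formats concrete, the sketch below reads a few of these variables with their documented defaults. It is not the server's parsing code: `env_int`, `env_bool`, and `env_ua_list` are hypothetical helpers, and the set of accepted truthy strings is an assumption.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def env_int(name: str, default: int, env: Mapping[str, str] = os.environ) -> int:
    """Read an integer variable, falling back to the documented default."""
    raw = env.get(name, "").strip()
    return int(raw) if raw else default


def env_bool(name: str, default: bool, env: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean variable. Assumption: 1/true/yes/on count as true."""
    raw = env.get(name, "").strip().lower()
    return raw in {"1", "true", "yes", "on"} if raw else default


def env_ua_list(env: Mapping[str, str] = os.environ) -> list[str]:
    """Split WEB_UA_LIST on the '|||' separator, dropping empty entries."""
    raw = env.get("WEB_UA_LIST", "")
    return [ua.strip() for ua in raw.split("|||") if ua.strip()]


example_env = {"WEB_MAX_RESULTS": "10", "WEB_UA_ROTATION": "true",
               "WEB_UA_LIST": "AgentOne/1.0|||AgentTwo/2.0"}
print(env_int("WEB_MAX_RESULTS", 5, example_env))
print(env_bool("WEB_UA_ROTATION", False, example_env))
print(env_ua_list(example_env))
```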

Session Persistence

When use_session=True or WEB_SESSION_ENABLED=1:

  • web_fetch / web_download: reuse per-session niquests.AsyncSession
  • web_fetch_browser: reuse per-session Playwright BrowserContext

Sessions are scoped to the MCP session (FastMCP Context), not global.
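The difference between per-session and global scope can be pictured with a small registry keyed by session id. This is a conceptual sketch only, not how the server stores its sessions; `SessionRegistry` is hypothetical, and a plain dict stands in for the HTTP client or browser context.

```python
from __future__ import annotations

from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class SessionRegistry(Generic[T]):
    """Lazily create and reuse one resource per session id."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._by_session: dict[str, T] = {}

    def get(self, session_id: str) -> T:
        """Return the resource for this session, creating it on first use."""
        if session_id not in self._by_session:
            self._by_session[session_id] = self._factory()
        return self._by_session[session_id]

    def close(self, session_id: str) -> None:
        """Forget the resource when the session ends."""
        self._by_session.pop(session_id, None)


registry: SessionRegistry[dict] = SessionRegistry(dict)  # dict stands in for a cookie jar
registry.get("session-a")["cookie"] = "abc"
print(registry.get("session-a"), registry.get("session-b"))
```

Cookies set in one session stay invisible to the other, which is the property the scoping rule describes.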


HTTP Mode

export WEB_TRANSPORT=streamable-http
export WEB_HTTP_HOST=127.0.0.1
export WEB_HTTP_PORT=8000
www-search-mcp

Operational endpoints:

  • GET /healthz returns { "status": "ok" }
  • GET /readyz — readiness probe (200/503)
  • GET /status — same JSON as web_mcp_status
  • GET /error-types — error hierarchy

MCP endpoint:

  • POST /mcp — streamable-http
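MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below only builds such a payload for web_search; `tools_call` is a hypothetical helper. A real client also has to perform the MCP initialize handshake and handle the streamable-http response format first, so an MCP client SDK is usually the practical choice.

```python
import json
from itertools import count

_request_ids = count(1)  # simple increasing JSON-RPC ids


def tools_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


payload = tools_call("web_search", {"queries": ["mcp protocol"], "max_results": 3})
print(json.dumps(payload, indent=2))
```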

Idle Timeout (stdio)

In stdio mode, the server process terminates automatically after WEB_MCP_IDLE_LIFETIME seconds of inactivity (default 300, i.e. 5 minutes). Set it to 0 to disable the timeout.
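The idle rule can be sketched as a small timer that is reset on every request. This illustrates the described behavior and is not the server's implementation; `IdleTimer` is a hypothetical name, and the injectable clock exists only to make the example deterministic.

```python
import time
from typing import Callable


class IdleTimer:
    """Track inactivity; a lifetime of 0 disables expiry."""

    def __init__(self, lifetime: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.lifetime = lifetime
        self._clock = clock
        self._last_activity = clock()

    def touch(self) -> None:
        """Record activity, restarting the idle countdown."""
        self._last_activity = self._clock()

    def expired(self) -> bool:
        """True once the idle time reaches the configured lifetime."""
        if self.lifetime <= 0:
            return False
        return self._clock() - self._last_activity >= self.lifetime


fake_now = [0.0]  # fake clock so the example does not need to sleep
timer = IdleTimer(300, clock=lambda: fake_now[0])
fake_now[0] = 299.0
print(timer.expired())
fake_now[0] = 300.0
print(timer.expired())
```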


Development

# Setup
git clone https://github.com/naifs/www-search-mcp.git
cd www-search-mcp
uv sync --all-groups

# Format + lint
uv run ruff format src/ tests/
uv run ruff check src/ tests/

# Type check
uv run ty check src/

# Tests
uv run pytest tests/ -q          # parallel (default)
uv run pytest tests/ -q -n0      # sequential

# Security
uv run bandit -r src/

# Build + install
rm -rf dist/
uv build
uv tool install --force dist/*.whl

Troubleshooting

Problem              Fix
uv not found         Install from astral.sh
Browser not found    uv run python -m playwright install chromium
GitHub rate limit    Set WEB_GITHUB_TOKEN
Binary in web_fetch  Use web_download instead
Tools not showing    Check config JSON, reload client, enable WEB_DEBUG

License

MIT
