
MCP server providing web search (DuckDuckGo), HTTP fetch, browser fetch (Playwright), and file download.


www-search-mcp

MCP server for web search, HTTP fetch, browser fetch (Playwright), file download, API requests, and package search (PyPI, GitHub).

Optimized for batching: every tool accepts lists (multi-query / multi-URL) to reduce round-trips.

Quick Install

# Run without installing (recommended)
uvx www-search-mcp

# Or install as a global tool
uv tool install www-search-mcp

VS Code / Cursor config:

{
  "mcpServers": {
    "www-search": {
      "command": "uvx",
      "args": ["www-search-mcp"]
    }
  }
}
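Server settings from the Configuration section are read from the environment, so they have to reach the process that the client launches. As a minimal sketch, the snippet below generates the same config with an extra `env` object. The `env` key is an assumption: many MCP clients accept it next to `command` and `args`, but check your client's documentation. `build_client_config` is a hypothetical helper, not part of this package.

```python
from __future__ import annotations

import json


def build_client_config(env: dict[str, str] | None = None) -> str:
    """Return the mcpServers config as JSON text.

    The "env" key is an assumption about the MCP client, not something
    this README documents.
    """
    server: dict[str, object] = {"command": "uvx", "args": ["www-search-mcp"]}
    if env:
        server["env"] = dict(env)
    return json.dumps({"mcpServers": {"www-search": server}}, indent=2)


if __name__ == "__main__":
    # Variable names come from the Environment Variables table.
    print(build_client_config({"WEB_DEBUG": "true", "WEB_MAX_RESULTS": "10"}))
```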

Tools Overview

| Tool | What it does |
| --- | --- |
| web_search | General web search (DuckDuckGo) |
| web_search_images | Image search |
| web_search_github | GitHub repo search |
| web_search_pypi | PyPI package search |
| web_fetch | Fetch URL as Markdown |
| web_fetch_browser | Browser-rendered fetch (JS sites) |
| web_download | Download files to disk |
| web_request | REST/GraphQL API calls + load tests |
| web_mcp_info | Server config and tool docs |
| web_mcp_status | Real-time diagnostics |

Batching: pass queries=["q1", "q2"] or urls=["url1", "url2"] instead of calling one-by-one.
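To make the batching idea concrete, here is a rough sketch of the JSON-RPC 2.0 `tools/call` message an MCP client would send for one batched search. The envelope follows the MCP specification; the example queries and the `tool_call` helper are made up for illustration, and in practice your MCP client builds this message for you.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator for the sketch


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP JSON-RPC 2.0 `tools/call` request for one tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# One batched call replaces two single-query calls.
batched = tool_call(
    "web_search",
    {"queries": ["python asyncio", "rust tokio"], "max_results": 3},
)

if __name__ == "__main__":
    print(json.dumps(batched, indent=2))
```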


Search Tools

web_search

queries: str | list[str]
max_results: int = 5

Returns title, url, snippet. Safe search disabled.

web_search_images

queries: str | list[str]
max_results: int = 5

Returns title, image, thumbnail, height, width, source.

web_search_github

queries: str | list[str]
max_results: int = 5

Returns title, url, stars, forks, language. Uses GitHub REST API (rate-limited to ~10/min without token).
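For orientation, the sketch below shows what a repository search against the public GitHub REST API looks like, and how its response items could map onto the fields listed above. This is not the server's actual code: the endpoint and response keys are GitHub's, but the field mapping (for example `title` from `full_name`) and both helper names are assumptions.

```python
from __future__ import annotations

from urllib.parse import urlencode
from urllib.request import Request

GITHUB_SEARCH = "https://api.github.com/search/repositories"


def build_search_request(query: str, max_results: int = 5,
                         token: str | None = None) -> Request:
    """Prepare (but do not send) a GitHub repository search request."""
    params = urlencode({"q": query, "per_page": max_results})
    headers = {"Accept": "application/vnd.github+json"}
    if token:  # e.g. the value of WEB_GITHUB_TOKEN; raises the rate limit
        headers["Authorization"] = f"Bearer {token}"
    return Request(f"{GITHUB_SEARCH}?{params}", headers=headers)


def to_result(item: dict) -> dict:
    """Map one GitHub API repository item to the fields listed above.

    The mapping is an assumption made for illustration.
    """
    return {
        "title": item["full_name"],
        "url": item["html_url"],
        "stars": item["stargazers_count"],
        "forks": item["forks_count"],
        "language": item.get("language"),
    }
```

Without a token the request carries no `Authorization` header and falls under GitHub's unauthenticated search limit.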

web_search_pypi

queries: str | list[str]
max_results: int = 5

Returns name, version, summary, author, license, requires_python, dependencies.


Fetch & Download Tools

web_fetch

urls: str | list[str]
fetch_div: str = ""          # CSS selector (e.g. "article")
save_file: str = ""         # Absolute path to save
use_session: bool = False   # Reuse cookies

Returns Markdown body. Rejects binary content — use web_download for files.

web_fetch_browser

urls: str | list[str]
fetch_div: str = ""
save_file: str = ""
headless: bool = True
wait_seconds: int = 0
use_session: bool = False

Same output as web_fetch but renders JS. Use for SPAs, login walls, bot-blocked sites.

web_download

urls: str | list[str]
save_files: str | list[str]  # Required target path(s)
use_session: bool = False

Returns saved_to, bytes, content_type.


API Tool

web_request

queries: dict | list[dict]

Spec fields per query:

  • type: "rest" | "graphql"
  • method: "GET" | "POST" | ...
  • url: target URL
  • headers: optional dict
  • requests: repeat count (default 1)
  • concurrency: async workers (default 1)
  • time: duration in seconds (0 = fixed count)

REST fields: body (dict | list | str).

GraphQL fields: query (str), variables (dict), operationName (str).

Auth: if a query's headers do not include Authorization, WEB_REQUEST_TOKEN is used as the default Authorization value.

Output: status_counts, http_status_counts, and latency_ms percentiles. Small runs (3 requests or fewer) also include response samples.
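The spec fields above can be assembled as plain dicts. The sketch below builds a single GET, a small load test, and a GraphQL query that could be passed together as `queries`. The helper functions and the `api.example.com` URLs are hypothetical, and using POST for GraphQL is an assumption; only the field names come from the list above.

```python
from __future__ import annotations


def rest_spec(method: str, url: str, *, body=None, headers: dict | None = None,
              requests: int = 1, concurrency: int = 1,
              time: int | None = None) -> dict:
    """Build one REST query spec using the documented field names."""
    spec = {"type": "rest", "method": method, "url": url,
            "requests": requests, "concurrency": concurrency}
    if time is not None:  # duration in seconds; 0 means a fixed request count
        spec["time"] = time
    if headers:
        spec["headers"] = headers
    if body is not None:
        spec["body"] = body
    return spec


def graphql_spec(url: str, query: str, *, variables: dict | None = None,
                 operation_name: str | None = None) -> dict:
    """Build one GraphQL query spec; POST is assumed as the method."""
    spec = {"type": "graphql", "method": "POST", "url": url, "query": query}
    if variables:
        spec["variables"] = variables
    if operation_name:
        spec["operationName"] = operation_name
    return spec


# A single GET, a 100-request load test with 10 workers, and a GraphQL query.
queries = [
    rest_spec("GET", "https://api.example.com/health"),
    rest_spec("POST", "https://api.example.com/items", body={"name": "x"},
              requests=100, concurrency=10),
    graphql_spec("https://api.example.com/graphql",
                 "query Q($id: ID!) { item(id: $id) { name } }",
                 variables={"id": "1"}, operation_name="Q"),
]
```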


Diagnostic Tools

web_mcp_info

Server configuration, tool descriptions, environment variables.

web_mcp_status

Real-time diagnostics:

  • uptime_seconds, pid, python_version
  • throttle: last request, interval
  • sessions: total, with browser, stale
  • connections: niquests version, pool size
  • resources: FD limit, memory RSS, event loop tasks
  • counters: requests, errors, timeouts
  • config: transport, timeouts
  • health: DDGS, Playwright availability

Configuration

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| WEB_TIMEOUT_TOTAL | 30 | Total timeout (sec) |
| WEB_TIMEOUT_CONNECT | 5 | Connect timeout (sec) |
| WEB_TIMEOUT_READ | 25 | Read timeout (sec) |
| WEB_MAX_RESULTS | 5 | Default search results |
| WEB_REQUEST_LIMIT | 50 | Max concurrent requests |
| WEB_MIN_INTERVAL | 1.0 | Throttle gap (sec) |
| WEB_RETRIES | 2 | Retry attempts |
| WEB_MAX_FETCH_CHARS | 200000 | Max fetch body length |
| WEB_MAX_DOWNLOAD_MB | 50 | Max download size |
| WEB_DEBUG | false | Debug logging |
| WEB_LOG_FORMAT | text | text or json |
| WEB_SESSION_ENABLED | false | Persistent cookies |
| WEB_TRANSPORT | stdio | stdio or streamable-http |
| WEB_HTTP_HOST | 127.0.0.1 | HTTP bind host |
| WEB_HTTP_PORT | 8000 | HTTP bind port |
| WEB_MCP_IDLE_LIFETIME | 300 | stdio idle timeout (sec), 0 to disable |
| WEB_DNS_RESOLVER | google | google, cloudflare, yandex, quad9, system |
| WEB_DNS_STRATEGY | | only_ipv4, only_ipv6, or dual-stack |
| WEB_PROXY | | HTTP proxy URL |
| WEB_SSL_VERIFY | true | TLS verification |
| WEB_SSL_PATH | | Extra CA certs |
| WEB_USER_AGENT | Chrome 135 | Custom UA |
| WEB_UA_ROTATION | false | Rotate UA pool |
| WEB_UA_LIST | | Custom UA list (||| separated) |
| WEB_GITHUB_TOKEN | | GitHub API token |
| WEB_REQUEST_TOKEN | | Default auth token |
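A few of these values need parsing rather than a plain string read: booleans, floats, and the `|||`-separated `WEB_UA_LIST`. The sketch below shows one way such parsing could work. It is not the server's loader; the helper names and the set of accepted truthy strings are assumptions.

```python
from __future__ import annotations

import os
from typing import Mapping

# Assumed truthy spellings; the README itself only shows "true"/"false" and "1".
_TRUTHY = {"1", "true", "yes", "on"}


def env_bool(name: str, default: bool = False,
             environ: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean flag such as WEB_DEBUG or WEB_SESSION_ENABLED."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


def env_float(name: str, default: float,
              environ: Mapping[str, str] = os.environ) -> float:
    """Read a numeric setting such as WEB_MIN_INTERVAL."""
    return float(environ.get(name, default))


def env_ua_list(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Split WEB_UA_LIST on the '|||' separator, dropping empty entries."""
    raw = environ.get("WEB_UA_LIST", "")
    return [ua.strip() for ua in raw.split("|||") if ua.strip()]
```

Passing `environ` explicitly keeps the helpers easy to try out with a plain dict instead of the real process environment.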

Session Persistence

When use_session=True or WEB_SESSION_ENABLED=1:

  • web_fetch / web_download: reuse per-session niquests.AsyncSession
  • web_fetch_browser: reuse per-session Playwright BrowserContext

Sessions are scoped to the MCP session (FastMCP Context), not global.


HTTP Mode

export WEB_TRANSPORT=streamable-http
export WEB_HTTP_HOST=127.0.0.1
export WEB_HTTP_PORT=8000
www-search-mcp

Operational endpoints:

  • GET /healthz: returns { "status": "ok" }
  • GET /readyz — readiness probe (200/503)
  • GET /status — same JSON as web_mcp_status
  • GET /error-types — error hierarchy

MCP endpoint:

  • POST /mcp — streamable-http
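With the server running in HTTP mode, the operational endpoints can be polled from a script. The following is a small standard-library sketch; `probe` is a hypothetical helper, and the base URL assumes the default host and port shown above.

```python
from __future__ import annotations

import json
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def probe(base_url: str, path: str, timeout: float = 5.0, opener=urlopen):
    """GET base_url + path and return (status, parsed JSON or raw text).

    Returns (None, None) when the server cannot be reached.
    """
    url = f"{base_url.rstrip('/')}{path}"
    try:
        with opener(url, timeout=timeout) as resp:
            status, raw = resp.status, resp.read()
    except HTTPError as exc:  # e.g. 503 from /readyz while not ready
        status, raw = exc.code, exc.read()
    except URLError:
        return None, None
    text = raw.decode("utf-8", errors="replace")
    try:
        return status, json.loads(text)
    except json.JSONDecodeError:
        return status, text


if __name__ == "__main__":
    for path in ("/healthz", "/readyz", "/status"):
        print(path, probe("http://127.0.0.1:8000", path))
```

`HTTPError` is caught before `URLError` because it is a subclass; this lets a 503 from the readiness probe be reported as a status instead of a connection failure.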

Idle Timeout (stdio)

In stdio mode, the server process terminates itself after WEB_MCP_IDLE_LIFETIME seconds of inactivity (default 300, i.e. 5 minutes). Set the variable to 0 to disable the timeout.
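The behaviour described here can be pictured as a small watchdog that tracks the time of the last request. The asyncio sketch below illustrates the idea only; it is not the server's implementation, and `IdleWatchdog` is a hypothetical name.

```python
import asyncio
import time


class IdleWatchdog:
    """Mark `expired` once no activity has been seen for `lifetime` seconds."""

    def __init__(self, lifetime: float) -> None:
        self.lifetime = lifetime
        self.expired = False
        self._last = time.monotonic()

    def touch(self) -> None:
        """Record activity, e.g. call this on every incoming request."""
        self._last = time.monotonic()

    async def run(self) -> None:
        if self.lifetime <= 0:  # 0 disables the timeout
            return
        while True:
            remaining = self.lifetime - (time.monotonic() - self._last)
            if remaining <= 0:
                self.expired = True  # a real server would shut down here
                return
            # Sleep only until the earliest moment the timeout could fire.
            await asyncio.sleep(remaining)
```

Each call to `touch` pushes the deadline back, so the loop re-checks the remaining time after every sleep instead of firing on a fixed schedule.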


Development

# Setup
git clone https://github.com/naifs/www-search-mcp.git
cd www-search-mcp
uv sync --all-groups

# Format + lint
uv run ruff format src/ tests/
uv run ruff check src/ tests/

# Type check
uv run ty check src/

# Tests
uv run pytest tests/ -q          # parallel (default)
uv run pytest tests/ -q -n0      # sequential

# Security
uv run bandit -r src/
uv run pip-audit

# Build + install
rm -rf dist/
uv build
uv tool install --force dist/*.whl

Troubleshooting

| Problem | Fix |
| --- | --- |
| uv not found | Install from astral.sh |
| Browser not found | uv run python -m playwright install chromium |
| GitHub rate limit | Set WEB_GITHUB_TOKEN |
| Binary in web_fetch | Use web_download instead |
| Tools not showing | Check config JSON, reload client, enable WEB_DEBUG |

License

MIT
