Open-source MCP Server for web search, extract, and crawl with embedded SearXNG

WET - Web ExTract MCP Server

Open-source MCP Server for web scraping & multimodal extraction.

Features

  • Web Search - Search via embedded SearXNG (metasearch: Google, Bing, DuckDuckGo, Brave)
  • Content Extract - Extract clean content (Markdown/Text)
  • Deep Crawl - Crawl multiple pages from a root URL with depth control
  • Site Map - Discover website URL structure
  • Media - List and download images, videos, audio files
  • Anti-bot - Stealth mode for sites with bot detection (Cloudflare, Medium, LinkedIn, Twitter)

Quick Start

Prerequisites

  • Python 3.13+ (or use uvx)

Add to mcp.json

uvx (Recommended)

{
  "mcpServers": {
    "wet": {
      "command": "uvx",
      "args": ["wet-mcp@latest"],
      "env": {
        "API_KEYS": "GOOGLE_API_KEY:AIza..."
      }
    }
  }
}

That's it! On first run, wet-mcp:

  1. Automatically installs SearXNG from GitHub
  2. Automatically installs Playwright chromium + system dependencies
  3. Starts embedded SearXNG subprocess
  4. Runs the MCP server
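
If mcp.json already lists other servers, the "wet" entry can be merged in rather than pasted by hand. The following is a minimal sketch, not part of wet-mcp: the helper name add_wet_server and the file location are assumptions, and the API key string is passed through unchanged.

```python
import json
from pathlib import Path


def add_wet_server(config_path: Path, api_keys: str) -> dict:
    """Merge a "wet" entry into an MCP client config, keeping other servers.

    Hypothetical helper for illustration; wet-mcp does not ship it.
    """
    # Start from the existing config if there is one, else an empty document.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as the uvx example above.
    servers["wet"] = {
        "command": "uvx",
        "args": ["wet-mcp@latest"],
        "env": {"API_KEYS": api_keys},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling add_wet_server(Path("mcp.json"), "GOOGLE_API_KEY:AIza...") rewrites the file in place and leaves any other entries under mcpServers untouched.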

Docker

{
  "mcpServers": {
    "wet": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "API_KEYS", "n24q02m/wet-mcp:latest"],
      "env": {
        "API_KEYS": "GOOGLE_API_KEY:AIza..."
      }
    }
  }
}

Without uvx

pip install wet-mcp
wet-mcp

Tools

Tool    Actions                        Description
web     search, extract, crawl, map    Web operations
media   list, download, analyze        Media discovery & download
help    -                              Full documentation

Usage Examples

web tool:

{"action": "search", "query": "python web scraping", "max_results": 10}
{"action": "extract", "urls": ["https://example.com"]}
{"action": "crawl", "urls": ["https://docs.python.org"], "depth": 2}
{"action": "map", "urls": ["https://example.com"]}

media tool:

{"action": "list", "url": "https://github.com/python/cpython"}
{"action": "download", "media_urls": ["https://example.com/image.png"]}
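
Each payload above belongs to one tool, and each tool accepts only the actions listed in the Tools table. A small client-side check can catch a mismatched pair before the call is sent. This is a sketch under that assumption: build_arguments and TOOL_ACTIONS are hypothetical names, and parameters are passed through without validation.

```python
import json

# Tool -> allowed actions, copied from the Tools table.
TOOL_ACTIONS = {
    "web": {"search", "extract", "crawl", "map"},
    "media": {"list", "download", "analyze"},
}


def build_arguments(tool: str, action: str, **params) -> str:
    """Return a JSON argument payload after checking the tool/action pair."""
    allowed = TOOL_ACTIONS.get(tool)
    if allowed is None:
        raise ValueError(f"unknown tool: {tool!r}")
    if action not in allowed:
        raise ValueError(f"{tool!r} has no action {action!r}")
    # Parameters are not validated here; the server decides what it accepts.
    return json.dumps({"action": action, **params})


print(build_arguments("web", "crawl", urls=["https://docs.python.org"], depth=2))
```

The print call emits the same payload as the crawl example above, while build_arguments("media", "crawl") raises ValueError.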

Configuration

Variable            Default                  Description
WET_AUTO_SEARXNG    true                     Auto-start embedded SearXNG subprocess
WET_SEARXNG_PORT    8080                     SearXNG port
SEARXNG_URL         http://localhost:8080    External SearXNG URL (when auto disabled)
API_KEYS            -                        LLM API keys for media analysis
LOG_LEVEL           INFO                     Logging level
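
The three SearXNG variables interact: the port applies to the embedded instance, while SEARXNG_URL applies only when auto-start is disabled. The sketch below shows one plausible way to resolve them. It is not the server's actual code: the function name searxng_url, the accepted truthy spellings, and the assumption that the embedded instance listens on localhost are all illustrative.

```python
import os
from typing import Mapping


def searxng_url(env: Mapping[str, str] = os.environ) -> str:
    """Pick the SearXNG base URL from the documented variables."""
    # Assumption: "true", "1", and "yes" (any case) count as enabled.
    flag = env.get("WET_AUTO_SEARXNG", "true").strip().lower()
    if flag in {"true", "1", "yes"}:
        # Embedded instance: assumed to listen on localhost at WET_SEARXNG_PORT.
        port = int(env.get("WET_SEARXNG_PORT", "8080"))
        return f"http://localhost:{port}"
    # Auto-start disabled: use the external instance.
    return env.get("SEARXNG_URL", "http://localhost:8080")
```

With no variables set, both branches resolve to the documented default, http://localhost:8080.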

LLM Configuration (Optional)

For media analysis (images, videos, audio), configure API keys:

API_KEYS=GOOGLE_API_KEY:AIza...
LLM_MODELS=gemini/gemini-3-flash-preview
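
The API_KEYS value pairs a provider variable name with its key, separated by a colon. A sketch of how such a value could be split is shown below. The README shows only a single entry, so the comma separator for multiple entries is an assumption, as is the function name parse_api_keys.

```python
def parse_api_keys(raw: str) -> dict[str, str]:
    """Split "NAME:value" entries into a dict.

    Assumption: multiple entries are comma-separated; only the
    single-entry form appears in the documentation.
    """
    keys: dict[str, str] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first colon only, so the key itself may contain colons.
        name, sep, value = entry.partition(":")
        if not sep or not name.strip() or not value.strip():
            raise ValueError(f"expected NAME:value, got {entry!r}")
        keys[name.strip()] = value.strip()
    return keys
```

For the example above, the result maps GOOGLE_API_KEY to the key string; an entry without a colon raises ValueError.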

Architecture

┌─────────────────────────────────────────────────────────┐
│                    MCP Client                           │
│            (Claude, Cursor, Windsurf)                   │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol
                      ▼
┌─────────────────────────────────────────────────────────┐
│                   WET MCP Server                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────────┐   │
│  │   web    │  │  media   │  │        help          │   │
│  │ (search, │  │ (list,   │  │  (full documentation)│   │
│  │ extract, │  │ download,│  └──────────────────────┘   │
│  │ crawl,   │  │ analyze) │                             │
│  │ map)     │  └────┬─────┘                             │
│  └────┬─────┘       │                                   │
│       │             │                                   │
│       ▼             ▼                                   │
│  ┌──────────┐  ┌────────────┐                           │
│  │ SearXNG  │  │  Crawl4AI  │                           │
│  │(embedded)│  │(Playwright)│                           │
│  └──────────┘  └────────────┘                           │
└─────────────────────────────────────────────────────────┘

Build from Source

git clone https://github.com/n24q02m/wet-mcp
cd wet-mcp

# Setup (requires mise: https://mise.jdx.dev/)
mise run setup

# Run
uv run wet-mcp

Docker Build

docker build -t n24q02m/wet-mcp:latest .

Requirements: Python 3.13+


Contributing

See CONTRIBUTING.md

License

MIT - See LICENSE
