Open-source MCP Server alternative to Tavily - Web search, extract, crawl with SearXNG

WET - Web ExTract MCP Server

Open-source MCP Server for web scraping & multimodal extraction.

Features

  • Web Search - Search via SearXNG (metasearch: Google, Bing, DuckDuckGo, Brave)
  • Content Extract - Extract clean content (Markdown/Text)
  • Deep Crawl - Crawl multiple pages from a root URL with depth control
  • Site Map - Discover website URL structure
  • Media - List and download images, videos, audio files
  • Anti-bot - Stealth mode bypasses Cloudflare, Medium, LinkedIn, Twitter

Quick Start

Prerequisites

  • Docker running (for SearXNG auto-management)
  • Python 3.13+ (or use uvx)

Add to mcp.json

{
  "mcpServers": {
    "wet": {
      "command": "uvx",
      "args": ["wet-mcp"]
    }
  }
}
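
If an mcp.json already lists other servers, the entry can be merged in rather than pasted by hand. The following is a minimal Python sketch, assuming the file is plain JSON with a top-level "mcpServers" object; the helper name add_wet_server is illustrative and not part of wet-mcp.

```python
import json


def add_wet_server(config: dict) -> dict:
    """Return a copy of an mcp.json config with the "wet" server entry added.

    Hypothetical helper for illustration; it does not ship with wet-mcp.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # Same entry as in the snippet above: launch wet-mcp through uvx.
    servers["wet"] = {"command": "uvx", "args": ["wet-mcp"]}
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    print(json.dumps(add_wet_server(existing), indent=2))
```

The function copies the input instead of mutating it, so the original config can still be written back unchanged if the merge is abandoned.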

That's it! On first run, the server automatically:

  1. Installs Playwright Chromium
  2. Pulls the SearXNG Docker image
  3. Starts the wet-searxng container
  4. Runs the MCP server

Without uvx

pip install wet-mcp
wet-mcp

Tools

Tool    Actions                        Description
web     search, extract, crawl, map    Web operations
media   list, download, analyze        Media discovery & download
help    -                              Full documentation
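
A client can check a tool/action pair against the table above before sending a request. This is a small Python sketch of such a check; the names TOOL_ACTIONS and is_valid_call are hypothetical, and the server may validate calls differently.

```python
from typing import Optional

# Tool -> allowed actions, copied from the Tools table above.
TOOL_ACTIONS = {
    "web": {"search", "extract", "crawl", "map"},
    "media": {"list", "download", "analyze"},
    "help": set(),  # the table lists no actions for help
}


def is_valid_call(tool: str, action: Optional[str] = None) -> bool:
    """Return True if the tool exists and the action is listed for it."""
    if tool not in TOOL_ACTIONS:
        return False
    allowed = TOOL_ACTIONS[tool]
    if not allowed:
        # Tools without actions (help) are valid only when no action is given.
        return action is None
    return action in allowed
```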

Usage Examples

{"action": "search", "query": "python web scraping", "max_results": 10}
{"action": "extract", "urls": ["https://example.com"]}
{"action": "crawl", "urls": ["https://docs.python.org"], "depth": 2}
{"action": "map", "urls": ["https://example.com"]}
{"action": "list", "url": "https://github.com/python/cpython"}
{"action": "download", "media_urls": ["https://example.com/image.png"]}
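
MCP clients send tool calls as JSON-RPC 2.0 requests with the method "tools/call". The sketch below wraps one of the payloads above in such a request. It assumes the payloads are passed as the "arguments" of the web and media tools; the function name tools_call_request is illustrative.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Assumption: each usage example above is the tool's arguments object.
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(tools_call_request(
        1, "web", {"action": "search", "query": "python web scraping", "max_results": 10}
    ))
```

In practice an MCP client library handles this framing; the sketch only shows how the tool name and the action payload fit together.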

Configuration

Variable            Default                  Description
WET_AUTO_DOCKER     true                     Auto-manage SearXNG container
WET_SEARXNG_PORT    8080                     SearXNG container port
SEARXNG_URL         http://localhost:8080    External SearXNG URL
API_KEYS            -                        LLM API keys for media analysis
LOG_LEVEL           INFO                     Logging level
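
To illustrate how these variables and defaults fit together, here is a minimal Python sketch that reads them from the environment. The WetSettings class and load_settings function are hypothetical, and the rule for parsing the boolean is an assumption; wet-mcp's own settings code may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WetSettings:
    auto_docker: bool
    searxng_port: int
    searxng_url: str
    log_level: str


def load_settings(env=None) -> WetSettings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: "1", "true" and "yes" (any case) count as true.
    auto_docker = env.get("WET_AUTO_DOCKER", "true").strip().lower() in {"1", "true", "yes"}
    return WetSettings(
        auto_docker=auto_docker,
        searxng_port=int(env.get("WET_SEARXNG_PORT", "8080")),
        searxng_url=env.get("SEARXNG_URL", "http://localhost:8080"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dict as env makes the loader easy to exercise without touching the real environment.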

LLM Configuration (Optional)

For media analysis (images, videos, audio), configure API keys:

API_KEYS=GOOGLE_API_KEY:AIza...
LLM_MODELS=gemini/gemini-3-flash-preview
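
The API_KEYS value pairs an environment variable name with a key, separated by a colon. The Python sketch below parses that format. It assumes multiple pairs are comma-separated, which the example above does not confirm, and the function name parse_api_keys is illustrative.

```python
def parse_api_keys(raw: str) -> dict:
    """Parse "NAME:value" pairs into a dict.

    Assumption: several pairs may be joined with commas, e.g. "A:1,B:2".
    """
    keys = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first colon only, so values may themselves contain colons.
        name, sep, value = item.partition(":")
        if not sep or not name.strip() or not value.strip():
            raise ValueError(f"expected NAME:value, got {item!r}")
        keys[name.strip()] = value.strip()
    return keys
```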

Architecture

┌─────────────────────────────────────────────────────────┐
│                    MCP Client                           │
│            (Claude, Cursor, Windsurf)                   │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol
                      ▼
┌─────────────────────────────────────────────────────────┐
│                   WET MCP Server                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────────┐   │
│  │   web    │  │  media   │  │        help          │   │
│  │ (search, │  │ (list,   │  │  (full documentation)│   │
│  │ extract, │  │ download,│  └──────────────────────┘   │
│  │ crawl,   │  │ analyze) │                             │
│  │ map)     │  └────┬─────┘                             │
│  └────┬─────┘       │                                   │
│       │             │                                   │
│       ▼             ▼                                   │
│  ┌──────────┐  ┌──────────┐                             │
│  │ SearXNG  │  │ Crawl4AI │                             │
│  │ (Docker) │  │(Playwright)│                           │
│  └──────────┘  └──────────┘                             │
└─────────────────────────────────────────────────────────┘
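
As a rough illustration of the search path in the diagram, a query to a SearXNG instance can be expressed as a GET request to its /search endpoint with format=json. The sketch below only builds that URL. Whether wet-mcp queries SearXNG exactly this way is an assumption, and JSON output may need to be enabled in the instance's settings.

```python
from urllib.parse import urlencode


def searxng_search_url(base_url: str, query: str) -> str:
    """Build a SearXNG search URL that requests JSON results."""
    params = urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


if __name__ == "__main__":
    # Default SEARXNG_URL from the Configuration section.
    print(searxng_search_url("http://localhost:8080", "python web scraping"))
```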

Container Management

# View SearXNG logs
docker logs wet-searxng

# Stop SearXNG
docker stop wet-searxng

# Remove container (will be recreated on next run)
docker rm wet-searxng

# Reset auto-setup (forces re-install Playwright)
rm ~/.wet-mcp/.setup-complete
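
Removing ~/.wet-mcp/.setup-complete forces setup to run again, which suggests a marker-file pattern: setup runs only while the file is absent. A minimal Python sketch of that pattern follows, with hypothetical function names; it is not wet-mcp's actual setup code.

```python
from pathlib import Path

# Marker path taken from the reset command above.
DEFAULT_MARKER = Path.home() / ".wet-mcp" / ".setup-complete"


def needs_setup(marker: Path = DEFAULT_MARKER) -> bool:
    """Setup is needed whenever the marker file does not exist."""
    return not marker.exists()


def mark_setup_complete(marker: Path = DEFAULT_MARKER) -> None:
    """Create the marker file (and its directory) after a successful setup."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
```

Deleting the marker makes needs_setup return True again, which matches the reset behavior described above.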

Build from Source

git clone https://github.com/n24q02m/wet-mcp
cd wet-mcp

# Setup (requires mise: https://mise.jdx.dev/)
mise run setup

# Run
uv run wet-mcp

Requirements: Python 3.13+, Docker


Contributing

See CONTRIBUTING.md

License

MIT - See LICENSE
