Open-source MCP Server alternative to Tavily - web search, extract, crawl with SearXNG
WET - Web ExTract MCP Server
Open-source MCP Server alternative to Tavily for web scraping & multimodal extraction
Zero-install experience: just run uvx wet-mcp, which automatically sets up and manages the SearXNG container.
Features
| Feature | Description |
|---|---|
| Web Search | Search via SearXNG (metasearch: Google, Bing, DuckDuckGo, Brave) |
| Content Extract | Extract clean content (Markdown/Text/HTML) |
| Deep Crawl | Crawl multiple subpages from a root URL with depth control |
| Site Map | Discover a website's URL structure |
| Media | List and download images, videos, and audio files |
| Anti-bot | Stealth mode to bypass Cloudflare, Medium, LinkedIn, Twitter |
Quick Start
Prerequisites
- Docker daemon running (for SearXNG)
- Python 3.13+ (or use uvx)
MCP Client Configuration
Claude Desktop / Cursor / Windsurf / Antigravity:
{
"mcpServers": {
"wet": {
"command": "uvx",
"args": ["wet-mcp"]
}
}
}
That's all! When the MCP client calls wet-mcp for the first time, it:
- Automatically installs Playwright Chromium
- Automatically pulls the SearXNG Docker image
- Starts the wet-searxng container
- Runs the MCP server
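Most MCP clients already have a config file listing other servers, so the entry above usually needs to be merged in rather than pasted over the file. The following is a minimal sketch of that merge. The function name add_wet_server and the "other" server in the usage example are hypothetical; only the "wet" entry itself comes from the configuration shown above.

```python
import json


def add_wet_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the "wet" entry added.

    Existing servers under "mcpServers" are preserved; an existing "wet"
    entry is overwritten.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["wet"] = {"command": "uvx", "args": ["wet-mcp"]}
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server.
    existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_wet_server(existing), indent=2))
```

The location of the config file depends on the client, so reading and writing it is left out here.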
Without uvx
pip install wet-mcp
wet-mcp
Tools
| Tool | Actions | Description |
|---|---|---|
| web | search, extract, crawl, map | Web operations |
| media | list, download | Media discovery & download |
| help | - | Full documentation |
Examples
# Search
{"action": "search", "query": "python web scraping", "max_results": 10}
# Extract content
{"action": "extract", "urls": ["https://example.com"]}
# Crawl with depth
{"action": "crawl", "urls": ["https://docs.python.org"], "depth": 2}
# Map site structure
{"action": "map", "urls": ["https://example.com"]}
# List media
{"action": "list", "url": "https://github.com/python/cpython"}
# Download media
{"action": "download", "media_urls": ["https://example.com/image.png"]}
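When these payloads are generated programmatically, it can help to check the tool/action pair before sending anything. The sketch below builds the JSON arguments shown above and rejects pairs that do not appear in the Tools table. The helper name build_arguments is hypothetical and not part of wet-mcp, and parameter names are passed through unchecked because the full parameter schema is not given here.

```python
import json

# Tool -> allowed actions, copied from the Tools table.
# The help tool takes no action, so it is omitted.
TOOL_ACTIONS = {
    "web": {"search", "extract", "crawl", "map"},
    "media": {"list", "download"},
}


def build_arguments(tool: str, action: str, **params) -> str:
    """Serialize tool-call arguments, rejecting unknown tool/action pairs."""
    allowed = TOOL_ACTIONS.get(tool)
    if allowed is None:
        raise ValueError(f"unknown tool: {tool!r}")
    if action not in allowed:
        raise ValueError(f"{tool!r} does not support action {action!r}")
    return json.dumps({"action": action, **params})


if __name__ == "__main__":
    print(build_arguments("web", "crawl", urls=["https://docs.python.org"], depth=2))
    # prints {"action": "crawl", "urls": ["https://docs.python.org"], "depth": 2}
```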
Tech Stack
| Component | Technology |
|---|---|
| Language | Python 3.13 |
| MCP Framework | FastMCP |
| Web Search | SearXNG (auto-managed Docker) |
| Web Crawling | Crawl4AI |
| Docker Management | python-on-whales |
How It Works
┌─────────────────────────────────────────────────────────┐
│                       MCP Client                        │
│               (Claude, Cursor, Windsurf)                │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol
                      ▼
┌─────────────────────────────────────────────────────────┐
│                     WET MCP Server                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────────┐   │
│  │   web    │  │  media   │  │         help         │   │
│  │ (search, │  │  (list,  │  │ (full documentation) │   │
│  │ extract, │  │ download)│  └──────────────────────┘   │
│  │ crawl,   │  └────┬─────┘                             │
│  │ map)     │       │                                   │
│  └────┬─────┘       │                                   │
│       │             │                                   │
│       ▼             ▼                                   │
│  ┌──────────┐  ┌────────────┐                           │
│  │ SearXNG  │  │  Crawl4AI  │                           │
│  │ (Docker) │  │(Playwright)│                           │
│  └──────────┘  └────────────┘                           │
└─────────────────────────────────────────────────────────┘
Configuration
Environment variables:
| Variable | Default | Description |
|---|---|---|
| WET_AUTO_DOCKER | true | Auto-manage SearXNG container |
| WET_SEARXNG_PORT | 8080 | SearXNG container port |
| SEARXNG_URL | http://localhost:8080 | External SearXNG URL |
| LOG_LEVEL | INFO | Logging level |
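To make the defaults concrete, here is a small sketch of how these variables could be read in Python. It is not taken from the wet-mcp source: the Settings class, the load_settings function, and the rule for parsing booleans ("1", "true", "yes") are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    auto_docker: bool
    searxng_port: int
    searxng_url: str
    log_level: str


def load_settings(env=None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: these spellings count as true; wet-mcp itself may parse differently.
    truthy = {"1", "true", "yes"}
    return Settings(
        auto_docker=env.get("WET_AUTO_DOCKER", "true").strip().lower() in truthy,
        searxng_port=int(env.get("WET_SEARXNG_PORT", "8080")),
        searxng_url=env.get("SEARXNG_URL", "http://localhost:8080"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of os.environ, for example load_settings({"WET_AUTO_DOCKER": "false"}), makes it easy to try different settings without touching the shell environment.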
Container Management
# View SearXNG logs
docker logs wet-searxng
# Stop SearXNG
docker stop wet-searxng
# Remove container (will be recreated on next run)
docker rm wet-searxng
# Reset auto-setup (forces re-install Playwright)
rm ~/.wet-mcp/.setup-complete
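For scripted checks, the state of the container can also be queried from Python through the Docker CLI. The sketch below is an illustration rather than part of wet-mcp: the function name container_status and its three return values are hypothetical, and it relies only on the standard docker inspect command with a --format template.

```python
import subprocess

CONTAINER = "wet-searxng"


def container_status(run=subprocess.run) -> str:
    """Return "running", "stopped", or "missing" for the SearXNG container.

    "missing" is also returned when docker inspect fails for another reason,
    such as an unreachable Docker daemon. If the docker binary itself is not
    installed, subprocess.run raises FileNotFoundError.
    """
    result = run(
        ["docker", "inspect", "--format", "{{.State.Running}}", CONTAINER],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return "missing"
    return "running" if result.stdout.strip() == "true" else "stopped"


if __name__ == "__main__":
    print(container_status())
```

The run parameter exists only so the function can be exercised without Docker by passing a stand-in for subprocess.run.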
License
MIT License