Open-source MCP Server alternative to Tavily: web search, extract, and crawl with SearXNG

WET - Web ExTract MCP Server

Open-source MCP Server alternative to Tavily for web scraping & multimodal extraction

Zero-install experience: just run uvx wet-mcp, which automatically sets up and manages the SearXNG container.

Features

| Feature | Description |
| --- | --- |
| Web Search | Search via SearXNG (metasearch: Google, Bing, DuckDuckGo, Brave) |
| Content Extract | Extract clean content (Markdown/Text/HTML) |
| Deep Crawl | Traverse multiple subpages from a root URL with depth control |
| Site Map | Discover a website's URL structure |
| Media | List and download images, videos, and audio files |
| Anti-bot | Stealth mode to bypass Cloudflare, Medium, LinkedIn, Twitter |

Quick Start

Prerequisites

  • Docker daemon running (for SearXNG)
  • Python 3.13+ (or use uvx)

MCP Client Configuration

Claude Desktop / Cursor / Windsurf / Antigravity:

{
  "mcpServers": {
    "wet": {
      "command": "uvx",
      "args": ["wet-mcp"]
    }
  }
}

That's all! When the MCP client calls wet-mcp for the first time, it:

  1. Automatically installs Playwright Chromium
  2. Automatically pulls the SearXNG Docker image
  3. Starts the wet-searxng container
  4. Runs the MCP server
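
The run-once behaviour of the setup steps above can be sketched with a marker file. This is only an illustration, not the package's actual code: the marker path ~/.wet-mcp/.setup-complete appears in the Container Management section, but the function name `run_first_time_setup` and the way steps are passed in are hypothetical.

```python
from pathlib import Path
from typing import Callable, Iterable


def run_first_time_setup(marker: Path, steps: Iterable[Callable[[], None]]) -> bool:
    """Run the setup steps once; return True if they were executed.

    Hypothetical helper: `marker` plays the role of a file such as
    ~/.wet-mcp/.setup-complete, and each step is any zero-argument callable.
    """
    if marker.exists():
        return False  # an earlier run already completed setup
    for step in steps:
        step()  # an exception here leaves the marker unwritten
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()  # record success so later runs skip the steps
    return True
```

Deleting the marker file makes the next call run the steps again, which matches the "Reset auto-setup" command shown under Container Management.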

Without uvx

pip install wet-mcp
wet-mcp

Tools

| Tool | Actions | Description |
| --- | --- | --- |
| web | search, extract, crawl, map | Web operations |
| media | list, download | Media discovery & download |
| help | - | Full documentation |

Examples

# Search
{"action": "search", "query": "python web scraping", "max_results": 10}

# Extract content
{"action": "extract", "urls": ["https://example.com"]}

# Crawl with depth
{"action": "crawl", "urls": ["https://docs.python.org"], "depth": 2}

# Map site structure
{"action": "map", "urls": ["https://example.com"]}

# List media
{"action": "list", "url": "https://github.com/python/cpython"}

# Download media
{"action": "download", "media_urls": ["https://example.com/image.png"]}
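
The payloads above each belong to one tool. As a small client-side sketch, the tool/action pairs from the Tools table can be used to pick the right tool and reject unknown actions before a call is made. The helper names `tool_for_action` and `build_payload` are hypothetical, and the server's own validation may differ.

```python
# Tool/action pairs copied from the Tools table (the help tool has no actions).
TOOL_ACTIONS = {
    "web": {"search", "extract", "crawl", "map"},
    "media": {"list", "download"},
}


def tool_for_action(action: str) -> str:
    """Return the tool name that exposes `action`, or raise ValueError."""
    for tool, actions in TOOL_ACTIONS.items():
        if action in actions:
            return tool
    raise ValueError(f"unknown action: {action!r}")


def build_payload(action: str, **params) -> tuple[str, dict]:
    """Return (tool name, arguments dict) shaped like the examples above."""
    return tool_for_action(action), {"action": action, **params}
```

For instance, `build_payload("crawl", urls=["https://docs.python.org"], depth=2)` yields the web tool together with the same arguments dict as the crawl example.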

Tech Stack

| Component | Technology |
| --- | --- |
| Language | Python 3.13 |
| MCP Framework | FastMCP |
| Web Search | SearXNG (auto-managed Docker) |
| Web Crawling | Crawl4AI |
| Docker Management | python-on-whales |

How It Works

┌─────────────────────────────────────────────────────────┐
│                    MCP Client                           │
│            (Claude, Cursor, Windsurf)                   │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol
                      ▼
┌─────────────────────────────────────────────────────────┐
│                   WET MCP Server                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────────┐   │
│  │   web    │  │  media   │  │        help          │   │
│  │ (search, │  │ (list,   │  │  (full documentation)│   │
│  │ extract, │  │ download)│  └──────────────────────┘   │
│  │ crawl,   │  └────┬─────┘                             │
│  │ map)     │       │                                   │
│  └────┬─────┘       │                                   │
│       │             │                                   │
│       ▼             ▼                                   │
│  ┌──────────┐  ┌────────────┐                           │
│  │ SearXNG  │  │  Crawl4AI  │                           │
│  │ (Docker) │  │(Playwright)│                           │
│  └──────────┘  └────────────┘                           │
└─────────────────────────────────────────────────────────┘
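
In the diagram, search requests end up at the SearXNG container. As a rough sketch of what such a request can look like, the snippet below builds a SearXNG search URL. It assumes the instance has JSON output enabled in its settings; the function name `searxng_search_url` is hypothetical, and the source does not show the exact request WET sends.

```python
from urllib.parse import urlencode


def searxng_search_url(base_url: str, query: str, page: int = 1) -> str:
    """Build a SearXNG /search URL that requests JSON results.

    Assumption: the target instance allows format=json; instances that
    only serve HTML will reject this request.
    """
    params = {"q": query, "format": "json", "pageno": page}
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"
```

With the default SEARXNG_URL from the Configuration section, `searxng_search_url("http://localhost:8080", "python web scraping")` produces a URL that can be fetched with any HTTP client.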

Configuration

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| WET_AUTO_DOCKER | true | Auto-manage SearXNG container |
| WET_SEARXNG_PORT | 8080 | SearXNG container port |
| SEARXNG_URL | http://localhost:8080 | External SearXNG URL |
| LOG_LEVEL | INFO | Logging level |
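
A minimal sketch of reading these variables with the documented defaults follows. The `Settings` class, the `load_settings` function, and the set of strings accepted as "true" are assumptions for illustration; the server's real parsing is not shown in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    auto_docker: bool
    searxng_port: int
    searxng_url: str
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the four variables from `env` (os.environ when omitted)."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}  # assumed spellings of "true"
    return Settings(
        auto_docker=env.get("WET_AUTO_DOCKER", "true").strip().lower() in truthy,
        searxng_port=int(env.get("WET_SEARXNG_PORT", "8080")),
        searxng_url=env.get("SEARXNG_URL", "http://localhost:8080"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dict instead of the process environment keeps the function easy to test, for example `load_settings({"WET_AUTO_DOCKER": "false"})`.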

Container Management

# View SearXNG logs
docker logs wet-searxng

# Stop SearXNG
docker stop wet-searxng

# Remove container (will be recreated on next run)
docker rm wet-searxng

# Reset auto-setup (forces re-install Playwright)
rm ~/.wet-mcp/.setup-complete

License

MIT License

This version

1.0.0
