Open-source MCP Server for web search, extract, crawl, academic research, and library docs with embedded SearXNG

WET - Web Extended Toolkit MCP Server

mcp-name: io.github.n24q02m/wet-mcp

5-strategy web search + extract + media MCP server, web-core ScrapingAgent backend.

| Phase | Status | Scope |
| --- | --- | --- |
| Phase 1 | Current (v1.x.y) | web-core ScrapingAgent migration, smart chunks output, search polish, media slim |
| Phase 2 | Planned | Context7-level docs search (library index, version-aware queries, project context isolation) |
| Phase 3 | Planned (BREAKING) | extract.agent multi-step research, extract.interact click/fill/submit, media.analyze removal |

Sister projects from n24q02m

| Project | Tagline | Tag |
| --- | --- | --- |
| better-code-review-graph | Knowledge graph for token-efficient code reviews -- fixed search, configurabl... | MCP |
| better-email-mcp | IMAP/SMTP email server for AI agents -- 6 composite tools with multi-account ... | MCP |
| better-godot-mcp | Composite MCP server for Godot Engine -- 17 mega-tools for AI-assisted game d... | MCP |
| better-notion-mcp | Markdown-first Notion API server for AI agents -- 10 composite tools replacin... | MCP |
| better-telegram-mcp | MCP server for Telegram with dual-mode support: Bot API (httpx) for quick bot... | MCP |
| claude-plugins | Full documentation: mcp.n24q02m.com -- unified docs for all 8 servers + the mc... | Marketplace |
| imagine-mcp | Production-grade MCP server for image and video understanding + generation ac... | MCP |
| jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling |
| mcp-core | Unified MCP Streamable HTTP 2025-11-25 transport, OAuth 2.1 Authorization Ser... | MCP |
| mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP |
| qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library |
| skret | Secrets without the server. | CLI |
| web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library |
| wet-mcp | Open-source MCP Server for web search, content extraction, library docs & mul... | MCP |

Features

  • Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with query expansion, TTL cache (1 h general / 5 min time-sensitive), standardized citation format, and 200-token snippet cap
  • Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
  • Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
  • Content Extract -- 5-strategy escalation chain via n24q02m-web-core ScrapingAgent (basic_http -> tls_spoof -> headless Crawl4AI), markitdown bridge for low-tier HTML/MD fallback, smart chunks structured output (clean text + markdown + JSON-LD + code blocks + metadata), batch processing (up to 50 URLs), deep crawling, site mapping
  • Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
  • Media -- List and download images, videos, and audio files. analyze is deprecated as of v<auto>+; use imagine-mcp.understand for vision/audio inference
  • Anti-bot -- Stealth strategies bypass Cloudflare, Medium, LinkedIn, Twitter
  • Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere) for higher-quality vectors
  • Sync -- Cross-machine sync of indexed docs via Google Drive (OAuth Device Code, no browser redirect)
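
The two-tier TTL cache mentioned under Web Search can be pictured with a small sketch. Only the two durations (1 h and 5 min) come from the feature list; the `SearchCache` class, its method names, and the `time_sensitive` flag are hypothetical and are not wet-mcp's actual API.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# TTL values from the feature list; everything else here is illustrative.
GENERAL_TTL_S = 60 * 60         # 1 h for general queries
TIME_SENSITIVE_TTL_S = 5 * 60   # 5 min for time-sensitive queries


class SearchCache:
    """In-memory cache whose entries expire after a per-entry TTL (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, query: str, results: Any, time_sensitive: bool = False) -> None:
        ttl = TIME_SENSITIVE_TTL_S if time_sensitive else GENERAL_TTL_S
        self._entries[query] = (self._clock() + ttl, results)

    def get(self, query: str) -> Optional[Any]:
        entry = self._entries.get(query)
        if entry is None:
            return None
        expires_at, results = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller re-runs the search.
            del self._entries[query]
            return None
        return results
```

A caller would check `get()` first and fall back to a live search on `None`. How wet-mcp decides that a query is time-sensitive is not described here, so the sketch leaves that decision to the caller.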

Quick install

# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@n24q02m-plugins

# Method 1 (CLI): direct uvx invocation
claude mcp add wet -- uvx wet-mcp

# Method 3 (recommended for HTTP / multi-device / OAuth)
docker run -d --name wet-mcp-http -p 8084:8084 \
  -v wet-data:/data -e MCP_TRANSPORT=http \
  -e PUBLIC_URL=https://wet.example.com \
  n24q02m/wet-mcp:latest

Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/wet-mcp/setup/ and the paste-to-agent snippets at claude-plugins/plugins/wet-mcp/setup-with-agent.md (per Spec F single source of truth).
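
For clients configured through an mcp.json file, the uvx command above usually maps onto a server entry like the one this sketch prints. The `mcpServers` layout is an assumption based on the convention many MCP clients share, and `stdio_server_entry` is a hypothetical helper; the setup docs remain the authoritative source for each client.

```python
import json


def stdio_server_entry(name: str = "wet") -> dict:
    """Build a stdio server entry equivalent to `uvx wet-mcp`.

    Assumption: the client reads the common "mcpServers" layout.
    """
    return {"mcpServers": {name: {"command": "uvx", "args": ["wet-mcp"]}}}


if __name__ == "__main__":
    # Print the fragment so it can be pasted into a client's mcp.json.
    print(json.dumps(stdio_server_entry(), indent=2))
```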

Status

2026-05-02 -- Architecture stabilization update

Recent months saw significant churn in credential handling and the daemon-bridge auto-spawn pattern, which caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. As of v<auto>, the architecture is stable: two clean modes (stdio and HTTP), no daemon-bridge layer, and no auto-spawn from stdio.

Apologies for the instability period. If you encountered issues with prior versions, please update to v<auto>+ and follow the current setup docs -- most prior workarounds are no longer needed.

Related plugins from the same author share this architecture (this spec), so the setup pattern you learn with one plugin transfers to the others.

Documentation

Full docs at mcp.n24q02m.com/servers/wet-mcp/:

  • Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
  • Modes overview -- stdio / local-relay / remote-relay / remote-oauth
  • Multi-user setup -- per-JWT-sub credential model

In-repo references (Spec F single source of truth: setup docs live in claude-plugins/plugins/wet-mcp/):

  • docs/ARCHITECTURE.md -- web-core ScrapingAgent integration, strategy chain, storage layout, LLM provider dispatch
  • docs/BENCHMARKS.md -- v1.x baseline coverage / latency placeholders + tier-1 fixture metrics

Install with AI agent -- paste this to your AI coding agent:

Install MCP server wet-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/wet-mcp/setup-with-agent.md

Tools

5 MCP tools (3 domain + config + help). The legacy setup tool has been merged into the config tool's action dispatch.

| Tool | Description |
| --- | --- |
| search | Web (SearXNG metasearch), news, images, academic research (Scholar / arXiv / PubMed / CrossRef / Semantic Scholar / BASE), library docs (HyDE + FTS5), find similar pages |
| extract | URL -> smart chunks dict (clean_text + markdown + structured_data + code_blocks + metadata) via web-core 5-strategy chain. Batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion (PDF/DOCX/XLSX/PPTX/EPUB), structured extraction (JSON Schema) |
| media | list (discover URLs from gallery pages), download (SSRF-safe). analyze deprecated v<auto>+ -- forwards to imagine-mcp.understand |
| config | status, set, cache_clear, docs_reindex, warmup, setup_open_relay, setup_status, setup_skip, setup_reset, setup_complete, setup_sync |
| help | Per-tool documentation: search, extract, media, config |

Media boundary: For vision / audio understanding (image captioning, OCR, audio transcription, video summarization), use imagine-mcp. media.analyze in wet has been deprecated since v<auto> and will be removed in wet v2.0.0 (Phase 3).
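
The escalation chain behind extract can be sketched as "try the cheapest strategy first, move to the next on failure". The three tier names below are the ones the feature list gives (basic_http, tls_spoof, headless); the remaining strategies are not named in this README, so they are left out. The fetcher functions are hypothetical stand-ins, not web-core's ScrapingAgent API.

```python
from typing import Callable, List, Tuple

# A fetcher takes a URL and returns HTML, or raises on failure.
Fetcher = Callable[[str], str]


class AllStrategiesFailed(Exception):
    """Raised when every strategy in the chain has failed."""


def extract_with_escalation(url: str, chain: List[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each (name, fetcher) in order; return (name, html) from the first success."""
    errors: List[str] = []
    for name, fetch in chain:
        try:
            html = fetch(url)
        except Exception as exc:  # any failure escalates to the next tier
            errors.append(f"{name}: {exc}")
            continue
        if html.strip():
            return name, html
        errors.append(f"{name}: empty response")
    raise AllStrategiesFailed("; ".join(errors))


# Hypothetical stand-ins that simulate a site blocking the cheaper tiers.
def basic_http(url: str) -> str:
    raise ConnectionError("403 Forbidden")


def tls_spoof(url: str) -> str:
    return ""  # simulates a challenge page that yields no usable content


def headless(url: str) -> str:
    return "<html><body>rendered</body></html>"


CHAIN = [("basic_http", basic_http), ("tls_spoof", tls_spoof), ("headless", headless)]
```

Calling `extract_with_escalation("https://example.com", CHAIN)` falls through the first two tiers and returns the headless result. Ordering the chain from cheapest to most expensive keeps the headless browser for pages that actually need it.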

Comparison

How wet-mcp stacks up against direct competitors in each pillar:

| Capability | wet-mcp | Brave Search | Tavily | Firecrawl | Context7 |
| --- | --- | --- | --- | --- | --- |
| Web search | Yes (SearXNG aggregation) | Yes | Yes | No | No |
| Extract URL | Yes (5-strategy chain) | No | Yes (basic) | Yes | No |
| Media list / download | Yes | No | No | No | No |
| Library docs search | Phase 2 | No | No | No | Yes |
| Academic research | Yes (6 providers) | No | No | No | No |
| Self-hostable | Yes | No | No | No | Yes |
| Free tier | Yes (open source) | Limited | Limited | Limited | Yes |

Security

  • SSRF prevention -- URL validation on crawl targets
  • Graceful fallbacks -- Cloud → Local embedding, multi-tier crawling
  • Error sanitization -- No credentials in error messages
  • File conversion sandboxing -- Optional CONVERT_ALLOWED_DIRS restriction
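
As an illustration of what URL validation for SSRF prevention typically involves, the sketch below rejects non-HTTP schemes and any host that resolves to a non-public address. It is a generic standard-library pattern under stated assumptions, not wet-mcp's actual validator; `is_safe_url` and `is_public_address` are hypothetical names.

```python
import ipaddress
import socket
from typing import Callable
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_public_address(ip_text: str) -> bool:
    """True only for globally routable addresses (not loopback, private, link-local)."""
    try:
        return ipaddress.ip_address(ip_text).is_global
    except ValueError:
        return False


def is_safe_url(url: str, resolve: Callable[..., list] = socket.getaddrinfo) -> bool:
    """Return True if `url` is http(s) and every resolved address is public."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    default_port = 443 if parts.scheme == "https" else 80
    try:
        infos = resolve(parts.hostname, port or default_port, proto=socket.IPPROTO_TCP)
    except (OSError, UnicodeError):
        return False
    # Reject if the host resolves to nothing or to any non-public address.
    return bool(infos) and all(is_public_address(info[4][0]) for info in infos)
```

A check like this only covers the moment of validation. A production crawler would also need to pin the validated address for the actual request and re-check redirects, since DNS answers can change between validation and connection.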

Build from Source

git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp

Trust Model

This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core/docs/TRUST-MODEL.md for full classification.

| Mode | Storage | Encryption | Who can read your data? |
| --- | --- | --- | --- |
| stdio (default) | ~/.wet-mcp/config.json | AES-GCM, machine-bound key | Only your OS user (file perm 0600) |
| HTTP self-host | Same as stdio | Same | Only you (admin = user) |
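
The "file perm 0600" guarantee in the table can be illustrated with a short POSIX-only sketch that creates a file readable and writable by its owner alone. It deliberately skips the AES-GCM step: the payload is assumed to be already-encrypted bytes, and `write_private_file` and `is_owner_only` are hypothetical helpers, not functions from wet-mcp or mcp-core.

```python
import os
import stat
from pathlib import Path


def write_private_file(path: Path, data: bytes) -> None:
    """Create or overwrite `path` so that only the owning user can read or write it."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Passing 0o600 to os.open applies the mode at creation time, so a new
    # file is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        # Also tighten the mode when the file already existed (Unix only).
        os.fchmod(handle.fileno(), 0o600)
        handle.write(data)


def is_owner_only(path: Path) -> bool:
    """True if the permission bits are exactly rw------- (0600)."""
    return stat.S_IMODE(path.stat().st_mode) == 0o600
```

Setting the mode in the `os.open` call rather than with a later `chmod` avoids a window in which the file exists with default permissions.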

License

MIT -- See LICENSE.
