Open-source MCP Server for web search, extract, crawl, academic research, and library docs with embedded SearXNG
WET - Web Extended Toolkit MCP Server
mcp-name: io.github.n24q02m/wet-mcp
5-strategy web search + extract + media MCP server, web-core ScrapingAgent backend.
| Phase | Status | Scope |
|---|---|---|
| Phase 1 | Shipped | web-core ScrapingAgent migration, smart chunks output, search polish, media slim |
| Phase 2 | Shipped | Context7-level docs search: library index (Tier 1 + Tier 2), version-aware queries with token cap, project lock (Cabinets) |
| Phase 3 | Current (BREAKING v2.0.0) | extract.agent multi-step research with cited synthesis, extract.interact click/fill/submit via patchright (optional session persistence), docs_004_chunk_summaries migration, media.analyze removed |
BREAKING in v2.0.0 -- media(action="analyze") was removed entirely after the 2-minor-version deprecation grace period started in Phase 1. Use imagine-mcp's understand action for vision/audio/video analysis. See docs/migration.md for the upgrade recipe.
Sister projects from n24q02m
| Project | Tagline | Tag |
|---|---|---|
| better-code-review-graph | Knowledge graph for token-efficient code reviews -- fixed search, configurabl... | MCP |
| better-email-mcp | IMAP/SMTP email server for AI agents -- 6 composite tools with multi-account ... | MCP |
| better-godot-mcp | Composite MCP server for Godot Engine -- 17 mega-tools for AI-assisted game d... | MCP |
| better-notion-mcp | Markdown-first Notion API server for AI agents -- 10 composite tools replacin... | MCP |
| better-telegram-mcp | MCP server for Telegram with dual-mode support: Bot API (httpx) for quick bot... | MCP |
| claude-plugins | Full documentation: mcp.n24q02m.com — unified docs for all 8 servers + the mc... | Marketplace |
| imagine-mcp | Production-grade MCP server for image and video understanding + generation ac... | MCP |
| jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling |
| mcp-core | Unified MCP Streamable HTTP 2025-11-25 transport, OAuth 2.1 Authorization Ser... | MCP |
| mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP |
| qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library |
| skret | Secrets without the server. | CLI |
| web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library |
| wet-mcp | Open-source MCP Server for web search, content extraction, library docs & mul... | MCP |
Table of contents
- Features
- Quick install
- Status
- Documentation
- Tools
- Comparison
- Security
- Build from Source
- Trust Model
- License
Features
- Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with query expansion, TTL cache (1 h general / 5 min time-sensitive), standardized citation format, and 200-token snippet cap
- Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
- Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
- Content Extract -- 5-strategy escalation chain via n24q02m-web-core ScrapingAgent (basic_http -> tls_spoof -> headless Crawl4AI), markitdown bridge for low-tier HTML/MD fallback, smart chunks structured output (clean text + markdown + JSON-LD + code blocks + metadata), batch processing (up to 50 URLs), deep crawling, site mapping
- Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
- Media -- List + download images / videos / audio files. analyze deprecated v<auto>+ -- use imagine-mcp.understand for vision/audio inference
- Anti-bot -- Stealth strategies bypass Cloudflare, Medium, LinkedIn, Twitter
- Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere) for higher-quality vectors
- Sync -- Cross-machine sync of indexed docs via Google Drive (OAuth Device Code, no browser redirect)
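The two-tier TTL cache mentioned under Web Search can be pictured with a small in-memory sketch. This is not wet-mcp's implementation: the `TTLCache` class, the `ttl_for` helper, and the keyword heuristic for spotting time-sensitive queries are all assumptions made for illustration. Only the two TTL values (1 h and 5 min) come from the feature list.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# TTLs from the feature list: 1 h for general queries, 5 min for time-sensitive ones.
GENERAL_TTL_S = 60 * 60
TIME_SENSITIVE_TTL_S = 5 * 60

# Hypothetical heuristic: these keywords are an assumption, not wet-mcp's rule.
TIME_SENSITIVE_HINTS = ("today", "latest", "breaking", "news", "price")


def ttl_for(query: str) -> int:
    """Pick a TTL based on whether the query looks time-sensitive."""
    words = query.lower().split()
    if any(hint in words for hint in TIME_SENSITIVE_HINTS):
        return TIME_SENSITIVE_TTL_S
    return GENERAL_TTL_S


class TTLCache:
    """Minimal in-memory cache where each entry carries its own expiry time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, query: str, results: Any) -> None:
        """Store results with an expiry derived from the query's TTL."""
        self._entries[query] = (self._clock() + ttl_for(query), results)

    def get(self, query: str) -> Optional[Any]:
        """Return cached results, or None if missing or expired."""
        entry = self._entries.get(query)
        if entry is None:
            return None
        expires_at, results = entry
        if self._clock() >= expires_at:
            del self._entries[query]  # drop the stale entry on read
            return None
        return results
```

The clock is injectable so that expiry can be exercised without waiting; a real cache would also need size bounds and persistence, which this sketch leaves out.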
Quick install
# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@n24q02m-plugins
# Method 2 (CLI): direct uvx invocation
claude mcp add wet -- uvx wet-mcp
# Method 3 (recommended for HTTP / multi-device / OAuth)
docker run -d --name wet-mcp-http -p 8084:8084 \
  -v wet-data:/data -e MCP_TRANSPORT=http \
  -e PUBLIC_URL=https://wet.example.com \
  n24q02m/wet-mcp:latest
Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/wet-mcp/setup/ and the paste-to-agent snippets at claude-plugins/plugins/wet-mcp/setup-with-agent.md (per Spec F single source of truth).
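For clients that read an mcp.json-style file instead of using the `claude mcp add` CLI, the same uvx invocation can be written as a server entry. The sketch below generates one. The `mcpServers` layout is the convention used by several MCP clients, but the file location and supported keys vary by client, so treat the structure as an assumption and check the setup docs for your client. `build_mcp_config` is a hypothetical helper name.

```python
import json


def build_mcp_config(server_name: str = "wet") -> dict:
    """Return an mcp.json-style entry that launches wet-mcp through uvx."""
    return {
        "mcpServers": {
            server_name: {
                "command": "uvx",
                "args": ["wet-mcp"],
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the client's config file.
    print(json.dumps(build_mcp_config(), indent=2))
```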
Status
2026-05-02 -- Architecture stabilization update
Past months saw significant churn around credential handling and the daemon-bridge auto-spawn pattern. This caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. As of v<auto>, the architecture is stable: 2 clean modes (stdio + HTTP), no daemon-bridge layer, no auto-spawn from stdio.
Apologies for the instability period. If you encountered issues with prior versions, please update to v<auto>+ and follow the current setup docs -- most prior workarounds are no longer needed.
Related plugins from the same author:
- wet-mcp -- Web search + content extraction
- mnemo-mcp -- Persistent AI memory
- imagine-mcp -- Image/video understanding + generation
- better-notion-mcp -- Notion API
- better-email-mcp -- Email management
- better-telegram-mcp -- Telegram
- better-godot-mcp -- Godot Engine
- better-code-review-graph -- Code review knowledge graph
All plugins share the same architecture (this spec) -- install one, and the setup pattern transfers to the rest.
Documentation
Full docs at mcp.n24q02m.com/servers/wet-mcp/:
- Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
- Modes overview -- stdio / local-relay / remote-relay / remote-oauth
- Multi-user setup -- per-JWT-sub credential model
In-repo references (Spec F single source of truth: setup docs live in claude-plugins/plugins/wet-mcp/):
- docs/ARCHITECTURE.md -- web-core ScrapingAgent integration, strategy chain, storage layout, LLM provider dispatch
- docs/BENCHMARKS.md -- v1.x baseline coverage / latency placeholders + tier-1 fixture metrics
Install with AI agent -- paste this to your AI coding agent:
Install MCP server wet-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/wet-mcp/setup-with-agent.md
Tools
5 MCP tools (3 domain + config + help). The legacy setup tool was merged into the config tool's action dispatch.
| Tool | Description |
|---|---|
| search | Web (SearXNG metasearch), news, images, academic research (Scholar / arXiv / PubMed / CrossRef / Semantic Scholar / BASE), library docs (HyDE + FTS5), find similar pages. Phase 2 adds docs_resolve (library name -> ranked id), docs_query (version-aware + topic + 5000-token cap), docs_lock_project (Cabinets project pin via pyproject / package.json / go.mod / Cargo.toml manifest detection). |
| extract | URL -> smart chunks dict (clean_text + markdown + structured_data + code_blocks + metadata) via web-core 5-strategy chain. Batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion (PDF/DOCX/XLSX/PPTX/EPUB), structured extraction (JSON Schema) |
| media | list (discover URLs from gallery pages), download (SSRF-safe). analyze deprecated v<auto>+ -- forwards to imagine-mcp.understand |
| config | status, set, cache_clear, docs_reindex, warmup, setup_open_relay, setup_status, setup_skip, setup_reset, setup_complete, setup_sync |
| help | Per-tool documentation: search, extract, media, config |
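The strategy chain behind extract follows a general escalation pattern: try the cheapest fetcher first and move to a heavier one only when it fails or returns too little content. The sketch below shows that pattern in isolation. It does not use web-core's API. `extract_with_escalation`, the `Strategy` callable type, and the `min_chars` threshold are hypothetical names and values chosen for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A strategy takes a URL and returns page text, or raises on failure.
Strategy = Callable[[str], str]


def extract_with_escalation(
    url: str,
    strategies: Sequence[Tuple[str, Strategy]],
    min_chars: int = 200,
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Try strategies cheapest-first; return (strategy_name, text, errors)."""
    errors: List[str] = []
    for name, fetch in strategies:
        try:
            text = fetch(url)
        except Exception as exc:  # any failure escalates to the next strategy
            errors.append(f"{name}: {exc}")
            continue
        if len(text.strip()) >= min_chars:
            return name, text, errors
        # Thin content often means a block page or an empty JS shell.
        errors.append(f"{name}: only {len(text.strip())} chars, escalating")
    return None, None, errors
```

A caller would pass an ordered list such as `[("basic_http", fetch_plain), ("tls_spoof", fetch_spoofed), ("headless", fetch_browser)]`, where the three fetchers are placeholders for real implementations. The collected `errors` list records why each cheaper tier was skipped.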
Media boundary: For vision / audio understanding (image captioning, OCR, audio transcription, video summarization), use imagine-mcp. media.analyze in wet was deprecated in v<auto> and removed in wet v2.0.0 (Phase 3).
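The manifest detection that docs_lock_project relies on (see the search row in the table above) can be approximated by walking up from a working directory until one of the listed manifest files appears. This is a sketch under assumptions: `detect_manifest` and the ecosystem labels are hypothetical, and `pyproject` in the tool description is taken to mean `pyproject.toml`. How wet-mcp actually resolves and pins a project is not shown here.

```python
from pathlib import Path
from typing import Optional, Tuple

# Manifest file name -> ecosystem label. File names follow the tool
# description; the labels are illustrative assumptions.
MANIFESTS = {
    "pyproject.toml": "python",
    "package.json": "javascript",
    "go.mod": "go",
    "Cargo.toml": "rust",
}


def detect_manifest(start: Path) -> Optional[Tuple[Path, str]]:
    """Walk up from `start` and return the nearest manifest and its ecosystem."""
    start = start.resolve()
    for directory in (start, *start.parents):
        for filename, ecosystem in MANIFESTS.items():
            candidate = directory / filename
            if candidate.is_file():
                return candidate, ecosystem
    return None
```

Calling `detect_manifest(Path.cwd())` from anywhere inside a repository returns the closest manifest, which is usually what a project pin should key on in a monorepo.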
Comparison
How wet-mcp stacks up against direct competitors in each pillar:
| Capability | wet-mcp | Brave Search | Tavily | Firecrawl | Context7 |
|---|---|---|---|---|---|
| Web search | Yes (SearXNG aggregation) | Yes | Yes | No | No |
| Extract URL | Yes (5-strategy chain) | No | Yes (basic) | Yes | No |
| Media list / download | Yes | No | No | No | No |
| Library docs search | Yes (Tier 1 curated + Tier 2 on-demand, version-aware, Cabinets) | No | No | No | Yes |
| Academic research | Yes (6 providers) | No | No | No | No |
| Self-hostable | Yes | No | No | No | Yes |
| Free tier | Yes (open source) | Limited | Limited | Limited | Yes |
Security
- SSRF prevention -- URL validation on crawl targets
- Graceful fallbacks -- Cloud → Local embedding, multi-tier crawling
- Error sanitization -- No credentials in error messages
- File conversion sandboxing -- Optional CONVERT_ALLOWED_DIRS restriction
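As an illustration of what URL validation for SSRF prevention typically involves, the sketch below rejects non-HTTP(S) schemes and any host that resolves to a private, loopback, link-local, or otherwise non-public address. It is a generic standard-library example, not wet-mcp's validator. `is_safe_url` and `is_public_ip` are hypothetical names, and the check does not cover DNS rebinding between validation and the actual request.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_public_ip(address: str) -> bool:
    """True only for addresses outside private, loopback, and similar ranges."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 zone id
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def is_safe_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Reject non-HTTP(S) URLs and hosts that resolve to internal addresses."""
    try:
        parts = urlsplit(url)
        port = parts.port or (443 if parts.scheme == "https" else 80)
    except ValueError:  # malformed URL or port
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except (OSError, UnicodeError):  # resolution failed
        return False
    # Every resolved address must be public; one internal record rejects the URL.
    return bool(infos) and all(is_public_ip(info[4][0]) for info in infos)
```

The resolver is a parameter so the check can be tested without network access. In production the fetch should connect to the address that was validated, rather than resolving the host name a second time.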
Build from Source
git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp
Trust Model
This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core/docs/TRUST-MODEL.md for full classification.
| Mode | Storage | Encryption | Who can read your data? |
|---|---|---|---|
| stdio (default) | ~/.wet-mcp/config.json | AES-GCM, machine-bound key | Only your OS user (file perm 0600) |
| HTTP self-host | Same as stdio | Same | Only you (admin = user) |
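The "file perm 0600" guarantee in the table can be illustrated with a short POSIX-only sketch that creates a config file readable and writable by the owning user alone. It covers only the permission part; the AES-GCM encryption and machine-bound key derivation are deliberately left out. `write_private_json` and `file_mode` are hypothetical helpers, not functions from wet-mcp.

```python
import json
import os
import stat
from pathlib import Path


def write_private_json(path: Path, data: dict) -> None:
    """Write JSON so that only the owning OS user can read or write it."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create with mode 0600 up front so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Tighten permissions on a pre-existing file too (POSIX only).
        os.fchmod(handle.fileno(), 0o600)
        json.dump(data, handle)


def file_mode(path: Path) -> int:
    """Return the permission bits of `path`, for example 0o600."""
    return stat.S_IMODE(path.stat().st_mode)
```

After `write_private_json(Path.home() / ".example-app" / "config.json", {...})`, `file_mode` on that path should report `0o600` on Linux and macOS. The `.example-app` directory is a placeholder, chosen so the example does not overwrite a real wet-mcp config.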
License
MIT -- See LICENSE.