Open-source MCP Server for web search, extract, crawl, academic research, and library docs with embedded SearXNG

WET - Web Extended Toolkit MCP Server

mcp-name: io.github.n24q02m/wet-mcp

5-strategy web search + extract + media MCP server, web-core ScrapingAgent backend.

Phase | Status | Scope
Phase 1 | Shipped | web-core ScrapingAgent migration, smart chunks output, search polish, media slim
Phase 2 | Shipped | Context7-level docs search: library index (Tier 1 + Tier 2), version-aware queries with token cap, project lock (Cabinets)
Phase 3 | Current (BREAKING v2.0.0) | extract.agent multi-step research with cited synthesis, extract.interact click/fill/submit via patchright (optional session persistence), docs_004_chunk_summaries migration, media.analyze removed

BREAKING in v2.0.0 -- media(action="analyze") was removed entirely. Use imagine-mcp's understand action for vision/audio/video analysis. See docs/migration.md for the upgrade recipe.


Sister projects from n24q02m

Project | Tagline | Tag
better-code-review-graph | Knowledge graph for token-efficient code reviews -- fixed search, configurabl... | MCP
better-email-mcp | IMAP/SMTP email server for AI agents -- 6 composite tools with multi-account ... | MCP
better-godot-mcp | Composite MCP server for Godot Engine -- 17 mega-tools for AI-assisted game d... | MCP
better-notion-mcp | Markdown-first Notion API server for AI agents -- 10 composite tools replacin... | MCP
better-telegram-mcp | MCP server for Telegram with dual-mode support: Bot API (httpx) for quick bot... | MCP
claude-plugins | Full documentation: mcp.n24q02m.com -- unified docs for all 8 servers + the mc... | Marketplace
imagine-mcp | Production-grade MCP server for image and video understanding + generation ac... | MCP
jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling
mcp-core | Unified MCP Streamable HTTP 2025-11-25 transport, OAuth 2.1 Authorization Ser... | MCP
mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP
qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library
skret | Secrets without the server. | CLI
web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library
wet-mcp | Open-source MCP Server for web search, content extraction, library docs & mul... | MCP

Features

  • Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with query expansion, TTL cache (1 h general / 5 min time-sensitive), standardized citation format, and 200-token snippet cap
  • Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
  • Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
  • Content Extract -- 5-strategy escalation chain via n24q02m-web-core ScrapingAgent (basic_http -> tls_spoof -> headless Crawl4AI), markitdown bridge for low-tier HTML/MD fallback, smart chunks structured output (clean text + markdown + JSON-LD + code blocks + metadata), batch processing (up to 50 URLs), deep crawling, site mapping
  • Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
  • Media -- List + download images / videos / audio files. analyze was removed in v2.0.0 -- use imagine-mcp.understand for vision/audio inference
  • Anti-bot -- Stealth strategies bypass Cloudflare, Medium, LinkedIn, Twitter
  • Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere) for higher-quality vectors
  • Sync -- Cross-machine sync of indexed docs via Google Drive (OAuth Device Code, no browser redirect)
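
The two-tier TTL cache mentioned under Web Search can be pictured with a minimal sketch like the one below. It is not wet-mcp's actual cache: the class name `SearchCache`, the `time_sensitive` flag, and the query normalization are assumptions made for illustration. Only the two TTL values (1 h and 5 min) come from the feature list.

```python
import time
from typing import Any, Callable, Optional

# TTLs from the feature list: 1 hour for general queries,
# 5 minutes for time-sensitive ones.
GENERAL_TTL_S = 60 * 60
TIME_SENSITIVE_TTL_S = 5 * 60


class SearchCache:
    """Tiny in-memory TTL cache keyed by normalized query (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, Any]] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Assumed normalization: lowercase and collapse whitespace.
        return " ".join(query.lower().split())

    def put(self, query: str, results: Any, time_sensitive: bool = False) -> None:
        ttl = TIME_SENSITIVE_TTL_S if time_sensitive else GENERAL_TTL_S
        self._entries[self._key(query)] = (self._clock() + ttl, results)

    def get(self, query: str) -> Optional[Any]:
        key = self._key(query)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, results = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry on read
            return None
        return results
```

A caller would check `get()` before querying SearXNG and call `put()` with `time_sensitive=True` for news-style queries, so those results expire after 5 minutes instead of an hour.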

Quick install

# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@n24q02m-plugins

# Method 2 (CLI): direct uvx invocation
claude mcp add wet -- uvx wet-mcp

# Method 3 (recommended for HTTP / multi-device / OAuth)
docker run -d --name wet-mcp-http -p 8084:8084 \
  -v wet-data:/data -e MCP_TRANSPORT=http \
  -e PUBLIC_URL=https://wet.example.com \
  n24q02m/wet-mcp:latest

Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/wet-mcp/setup/ and the paste-to-agent snippets at claude-plugins/plugins/wet-mcp/setup-with-agent.md (per Spec F single source of truth).

Status

2026-05-02 -- Architecture stabilization update

Past months saw significant churn around credential handling and the daemon-bridge auto-spawn pattern. This caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. As of v<auto>, the architecture is stable: 2 clean modes (stdio + HTTP), no daemon-bridge layer, no auto-spawn from stdio.

Apologies for the instability period. If you encountered issues with prior versions, please update to v<auto>+ and follow the current setup docs -- most prior workarounds are no longer needed.

Related plugins from the same author all share the same architecture (this spec): install one, and the pattern you learn transfers to the rest.

Documentation

Full docs at mcp.n24q02m.com/servers/wet-mcp/:

  • Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
  • Modes overview -- stdio / local-relay / remote-relay / remote-oauth
  • Multi-user setup -- per-JWT-sub credential model

In-repo references (Spec F single source of truth: setup docs live in claude-plugins/plugins/wet-mcp/):

  • docs/ARCHITECTURE.md -- web-core ScrapingAgent integration, strategy chain, storage layout, LLM provider dispatch
  • docs/BENCHMARKS.md -- v1.x baseline coverage / latency placeholders + tier-1 fixture metrics

Install with AI agent -- paste this to your AI coding agent:

Install MCP server wet-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/wet-mcp/setup-with-agent.md

Tools

5 MCP tools (3 domain + config + help). The legacy setup tool was merged into config action dispatch.

Tool | Description
search | Web (SearXNG metasearch), news, images, academic research (Scholar / arXiv / PubMed / CrossRef / Semantic Scholar / BASE), library docs (HyDE + FTS5), find similar pages. Includes docs_resolve (library name -> ranked id), docs_query (version-aware + topic + 5000-token cap), docs_lock_project (Cabinets project pin via pyproject / package.json / go.mod / Cargo.toml manifest detection).
extract | URL -> smart chunks dict (clean_text + markdown + structured_data + code_blocks + metadata) via web-core 5-strategy chain. Batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion (PDF/DOCX/XLSX/PPTX/EPUB), structured extraction (JSON Schema)
media | list (discover URLs from gallery pages), download (SSRF-safe). analyze was removed in v2.0.0 -- use imagine-mcp.understand
config | status, set, cache_clear, docs_reindex, warmup, setup_open_relay, setup_status, setup_skip, setup_reset, setup_complete, setup_sync
help | Per-tool documentation: search, extract, media, config
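
The escalation idea behind extract can be sketched as "try the cheapest strategy first and move up only on failure". The code below is an illustration, not the web-core ScrapingAgent API: `fetch_with_escalation`, `FetchResult`, and the stub strategies are hypothetical, and only the three strategy names that this README spells out (basic_http, tls_spoof, headless) are used.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A strategy takes a URL and returns HTML, or raises when blocked or failing.
Strategy = Callable[[str], str]


@dataclass
class FetchResult:
    strategy: str  # name of the strategy that succeeded
    html: str


def fetch_with_escalation(
    url: str, strategies: Sequence[tuple[str, Strategy]]
) -> FetchResult:
    """Try each strategy in order, cheapest first; return the first success."""
    errors: list[str] = []
    for name, strategy in strategies:
        try:
            html = strategy(url)
        except Exception as exc:  # a real chain would catch narrower errors
            errors.append(f"{name}: {exc}")
            continue
        if html.strip():
            return FetchResult(strategy=name, html=html)
        errors.append(f"{name}: empty response")
    raise RuntimeError(f"all strategies failed for {url}: " + "; ".join(errors))


# Stub strategies standing in for real fetchers (assumptions for the demo).
def basic_http(url: str) -> str:
    raise ConnectionError("403 Forbidden")


def tls_spoof(url: str) -> str:
    return "<html>ok</html>"


def headless(url: str) -> str:
    return "<html>rendered</html>"


result = fetch_with_escalation(
    "https://example.com",
    [("basic_http", basic_http), ("tls_spoof", tls_spoof), ("headless", headless)],
)
```

Ordering the list from cheapest to most expensive means the headless browser is launched only when the lighter strategies have already failed.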

Media boundary: For vision / audio understanding (image captioning, OCR, audio transcription, video summarization), use imagine-mcp. media.analyze was removed in wet v2.0.0 -- use imagine-mcp.understand instead.

Comparison

How wet-mcp stacks up against direct competitors in each pillar:

Capability | wet-mcp | Brave Search | Tavily | Firecrawl | Context7
Web search | Yes (SearXNG aggregation) | Yes | Yes | No | No
Extract URL | Yes (5-strategy chain) | No | Yes (basic) | Yes | No
Media list / download | Yes | No | No | No | No
Library docs search | Yes (Tier 1 curated + Tier 2 on-demand, version-aware, Cabinets) | No | No | No | Yes
Academic research | Yes (6 providers) | No | No | No | No
Self-hostable | Yes | No | No | No | Yes
Free tier | Yes (open source) | Limited | Limited | Limited | Yes

Security

  • SSRF prevention -- URL validation on crawl targets
  • Graceful fallbacks -- Cloud → Local embedding, multi-tier crawling
  • Error sanitization -- No credentials in error messages
  • File conversion sandboxing -- Optional CONVERT_ALLOWED_DIRS restriction
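
To make "URL validation on crawl targets" concrete, here is a minimal sketch of one common SSRF check: resolve the hostname and reject the URL if any resolved address is not globally routable. It is an illustration using only the Python standard library, not wet-mcp's validator. The function name `is_safe_crawl_target` and the injectable `resolve` parameter are assumptions.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlsplit

Resolver = Callable[[str], Iterable[str]]


def _system_resolver(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses using the OS resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_crawl_target(url: str, resolve: Resolver = _system_resolver) -> bool:
    """Return True only for http(s) URLs whose host resolves to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = list(resolve(parts.hostname))
    except OSError:  # DNS failure: treat as unsafe rather than guessing
        return False
    if not addresses:
        return False
    for raw in addresses:
        try:
            ip = ipaddress.ip_address(raw)
        except ValueError:
            return False
        # is_global is False for loopback, private, link-local and reserved ranges.
        if not ip.is_global:
            return False
    return True
```

A check like this is only one layer. A production validator would also need to connect to the address it validated, so that a second DNS lookup cannot return a different, internal address.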

Build from Source

git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp

Trust Model

This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core/docs/TRUST-MODEL.md for full classification.

Mode | Storage | Encryption | Who can read your data?
stdio (default) | ~/.wet-mcp/config.json | AES-GCM, machine-bound key | Only your OS user (file perm 0600)
HTTP self-host | Same as stdio | Same | Only you (admin = user)
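
The "file perm 0600" guarantee in the table can be illustrated with a short POSIX-only sketch that creates a file readable and writable by its owner alone. It does not show the AES-GCM encryption step. The payload is assumed to be encrypted already, and `write_private_file` is a hypothetical helper, not a function from wet-mcp or mcp-core.

```python
import os
import stat
import tempfile
from pathlib import Path


def write_private_file(path: Path, payload: bytes) -> None:
    """Write bytes so that only the owning OS user can read or write the file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Creating with mode 0o600 avoids a window where the file is world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        # Enforce the mode even if the file already existed with looser bits.
        os.fchmod(fh.fileno(), 0o600)
        fh.write(payload)


# Demo in a temporary directory instead of the real ~/.wet-mcp/config.json.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.json"
    write_private_file(target, b"<ciphertext placeholder>")
    mode = stat.S_IMODE(target.stat().st_mode)
```

After the demo runs, `mode` holds the permission bits of the written file, which can be compared against `0o600`.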

License

MIT -- See LICENSE.


