Open-source MCP Server for web search, extract, crawl, academic research, and library docs with embedded SearXNG

WET - Web Extended Toolkit MCP Server

mcp-name: io.github.n24q02m/wet-mcp

Web search + 5-strategy extract + media MCP server, backed by the web-core ScrapingAgent.

| Phase | Status | Scope |
| --- | --- | --- |
| Phase 1 | Shipped | web-core ScrapingAgent migration, smart chunks output, search polish, media slim |
| Phase 2 | Shipped | Context7-level docs search: library index (Tier 1 + Tier 2), version-aware queries with token cap, project lock (Cabinets) |
| Phase 3 | Current (BREAKING v2.0.0) | extract.agent multi-step research with cited synthesis, extract.interact click/fill/submit via patchright (optional session persistence), docs_004_chunk_summaries migration, media.analyze removed |

BREAKING in v2.0.0 -- media(action="analyze") was removed entirely after the 2-minor-version deprecation grace period started in Phase 1. Use imagine-mcp's understand action for vision/audio/video analysis. See docs/migration.md for the upgrade recipe.

Sister projects from n24q02m

| Project | Tagline | Tag |
| --- | --- | --- |
| better-code-review-graph | Knowledge graph for token-efficient code reviews -- fixed search, configurabl... | MCP |
| better-email-mcp | IMAP/SMTP email server for AI agents -- 6 composite tools with multi-account ... | MCP |
| better-godot-mcp | Composite MCP server for Godot Engine -- 17 mega-tools for AI-assisted game d... | MCP |
| better-notion-mcp | Markdown-first Notion API server for AI agents -- 10 composite tools replacin... | MCP |
| better-telegram-mcp | MCP server for Telegram with dual-mode support: Bot API (httpx) for quick bot... | MCP |
| claude-plugins | Full documentation: mcp.n24q02m.com -- unified docs for all 8 servers + the mc... | Marketplace |
| imagine-mcp | Production-grade MCP server for image and video understanding + generation ac... | MCP |
| jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling |
| mcp-core | Unified MCP Streamable HTTP 2025-11-25 transport, OAuth 2.1 Authorization Ser... | MCP |
| mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP |
| qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library |
| skret | Secrets without the server. | CLI |
| web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library |
| wet-mcp | Open-source MCP Server for web search, content extraction, library docs & mul... | MCP |

Features

  • Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with query expansion, TTL cache (1 h general / 5 min time-sensitive), standardized citation format, and 200-token snippet cap
  • Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
  • Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
  • Content Extract -- 5-strategy escalation chain via n24q02m-web-core ScrapingAgent (basic_http -> tls_spoof -> headless Crawl4AI), markitdown bridge for low-tier HTML/MD fallback, smart chunks structured output (clean text + markdown + JSON-LD + code blocks + metadata), batch processing (up to 50 URLs), deep crawling, site mapping
  • Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
  • Media -- List + download images / videos / audio files. The analyze action was removed in v2.0.0 -- use imagine-mcp's understand action for vision/audio inference
  • Anti-bot -- Stealth strategies bypass Cloudflare, Medium, LinkedIn, Twitter
  • Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere) for higher-quality vectors
  • Sync -- Cross-machine sync of indexed docs via Google Drive (OAuth Device Code, no browser redirect)
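
The Content Extract bullet describes an escalation chain: try the cheapest strategy first and move to a heavier one only when it fails. The following is a minimal sketch of that pattern, not web-core's actual API. The `escalate` helper, the `FetchResult` type, and the three stub strategies are hypothetical; only the tier names (basic_http, tls_spoof, headless) come from the list above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FetchResult:
    strategy: str  # name of the tier that succeeded
    html: str      # raw page content returned by that tier


# A strategy takes a URL and returns HTML, or None when it is blocked or fails.
Strategy = Callable[[str], Optional[str]]


def escalate(url: str, chain: list[tuple[str, Strategy]]) -> Optional[FetchResult]:
    """Try each strategy in order and stop at the first one that returns content."""
    for name, fetch in chain:
        try:
            html = fetch(url)
        except Exception:
            html = None  # treat any error as "escalate to the next tier"
        if html:
            return FetchResult(strategy=name, html=html)
    return None  # every tier failed


# Hypothetical stand-ins for the real tiers (assumptions, for illustration only).
def basic_http(url: str) -> Optional[str]:
    return None  # pretend the plain request was blocked


def tls_spoof(url: str) -> Optional[str]:
    raise ConnectionError("challenge page")  # pretend this tier errored


def headless(url: str) -> Optional[str]:
    return "<html><body>rendered</body></html>"


CHAIN = [("basic_http", basic_http), ("tls_spoof", tls_spoof), ("headless", headless)]
result = escalate("https://example.com/article", CHAIN)
```

Ordering the chain from cheapest to most expensive means most pages never pay the cost of a headless browser.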

Quick install

# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@n24q02m-plugins

# Method 2 (CLI): direct uvx invocation
claude mcp add wet -- uvx wet-mcp

# Method 3 (recommended for HTTP / multi-device / OAuth)
docker run -d --name wet-mcp-http -p 8084:8084 \
  -v wet-data:/data -e MCP_TRANSPORT=http \
  -e PUBLIC_URL=https://wet.example.com \
  n24q02m/wet-mcp:latest

Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/wet-mcp/setup/ and the paste-to-agent snippets at claude-plugins/plugins/wet-mcp/setup-with-agent.md (per Spec F single source of truth).

Status

2026-05-02 -- Architecture stabilization update

Recent months saw significant churn around credential handling and the daemon-bridge auto-spawn pattern, which caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. As of v<auto>, the architecture is stable: two clean modes (stdio + HTTP), no daemon-bridge layer, and no auto-spawn from stdio.

Apologies for the instability period. If you encountered issues with prior versions, please update to v<auto>+ and follow the current setup docs -- most prior workarounds are no longer needed.

Related plugins from the same author all share this architecture (this spec), so the setup pattern you learn for one transfers to the others.

Documentation

Full docs at mcp.n24q02m.com/servers/wet-mcp/:

  • Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
  • Modes overview -- stdio / local-relay / remote-relay / remote-oauth
  • Multi-user setup -- per-JWT-sub credential model

In-repo references (Spec F single source of truth: setup docs live in claude-plugins/plugins/wet-mcp/):

  • docs/ARCHITECTURE.md -- web-core ScrapingAgent integration, strategy chain, storage layout, LLM provider dispatch
  • docs/BENCHMARKS.md -- v1.x baseline coverage / latency placeholders + tier-1 fixture metrics

Install with AI agent -- paste this to your AI coding agent:

Install MCP server wet-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/wet-mcp/setup-with-agent.md

Tools

5 MCP tools (3 domain + config + help). The legacy setup tool has been merged into the config tool's action dispatch.

| Tool | Description |
| --- | --- |
| search | Web (SearXNG metasearch), news, images, academic research (Scholar / arXiv / PubMed / CrossRef / Semantic Scholar / BASE), library docs (HyDE + FTS5), find similar pages. Phase 2 adds docs_resolve (library name -> ranked id), docs_query (version-aware + topic + 5000-token cap), docs_lock_project (Cabinets project pin via pyproject / package.json / go.mod / Cargo.toml manifest detection). |
| extract | URL -> smart chunks dict (clean_text + markdown + structured_data + code_blocks + metadata) via web-core 5-strategy chain. Batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion (PDF/DOCX/XLSX/PPTX/EPUB), structured extraction (JSON Schema) |
| media | list (discover URLs from gallery pages), download (SSRF-safe). The analyze action was removed in v2.0.0 -- use imagine-mcp's understand action instead |
| config | status, set, cache_clear, docs_reindex, warmup, setup_open_relay, setup_status, setup_skip, setup_reset, setup_complete, setup_sync |
| help | Per-tool documentation: search, extract, media, config |

Media boundary: For vision / audio understanding (image captioning, OCR, audio transcription, video summarization), use imagine-mcp. media.analyze was deprecated in Phase 1 and removed in wet v2.0.0 (Phase 3).
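
Each tool listed above dispatches on an action argument. As a rough sketch, this is the JSON-RPC `tools/call` request an MCP client would send for one of them. The `tool_call` helper is hypothetical, and `url` is an assumed argument name; the `help` tool is the authoritative source for each tool's real parameters.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for one tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# The `list` action comes from the media tool's description;
# `url` is an assumed argument name used only for illustration.
request = tool_call("media", action="list", url="https://example.com/gallery")
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this envelope for you; the sketch only shows what travels over the transport.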

Comparison

How wet-mcp stacks up against direct competitors in each pillar:

| Capability | wet-mcp | Brave Search | Tavily | Firecrawl | Context7 |
| --- | --- | --- | --- | --- | --- |
| Web search | Yes (SearXNG aggregation) | Yes | Yes | No | No |
| Extract URL | Yes (5-strategy chain) | No | Yes (basic) | Yes | No |
| Media list / download | Yes | No | No | No | No |
| Library docs search | Yes (Tier 1 curated + Tier 2 on-demand, version-aware, Cabinets) | No | No | No | Yes |
| Academic research | Yes (6 providers) | No | No | No | No |
| Self-hostable | Yes | No | No | No | Yes |
| Free tier | Yes (open source) | Limited | Limited | Limited | Yes |

Security

  • SSRF prevention -- URL validation on crawl targets
  • Graceful fallbacks -- Cloud → Local embedding, multi-tier crawling
  • Error sanitization -- No credentials in error messages
  • File conversion sandboxing -- Optional CONVERT_ALLOWED_DIRS restriction
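
The SSRF prevention bullet refers to validating crawl targets before fetching them. Here is a minimal sketch of that kind of check, assuming the goal is to reject non-HTTP schemes and any host that resolves to a non-public address. `is_safe_url` and `resolve_host` are hypothetical names, not wet-mcp functions, and the project's real validation may differ.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def is_safe_url(url: str, resolver=resolve_host) -> bool:
    """Return True only for http(s) URLs whose host resolves to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        addresses = resolver(parsed.hostname)
    except OSError:
        return False  # unresolvable hosts are rejected
    if not addresses:
        return False
    for addr in addresses:
        ip = ipaddress.ip_address(addr.split("%")[0])  # strip any IPv6 zone id
        if not ip.is_global:
            return False  # loopback, private, link-local, reserved, ...
    return True
```

A check like this is only effective if the fetch then connects to the same address that was validated; otherwise a DNS answer that changes between the check and the request can bypass it.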

Build from Source

git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp

Trust Model

This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core/docs/TRUST-MODEL.md for full classification.

| Mode | Storage | Encryption | Who can read your data? |
| --- | --- | --- | --- |
| stdio (default) | ~/.wet-mcp/config.json | AES-GCM, machine-bound key | Only your OS user (file perm 0600) |
| HTTP self-host | Same as stdio | Same | Only you (admin = user) |
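
The "file perm 0600" guarantee above can be illustrated with a short sketch of writing a config file that only the owning OS user can read, on a POSIX system. `write_private_json` is a hypothetical helper, not wet-mcp's code, and the AES-GCM encryption step is deliberately left out; the payload here is a placeholder.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_private_json(path: Path, payload: dict) -> None:
    """Write JSON so that only the owning OS user can read or write it (0600)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create with 0600 from the start so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.chmod(path, 0o600)  # also tighten a pre-existing file with looser bits


# Demo against a temporary directory rather than the real ~/.wet-mcp path.
demo_path = Path(tempfile.mkdtemp()) / "config.json"
write_private_json(demo_path, {"example": "ciphertext-goes-here"})
mode = stat.S_IMODE(demo_path.stat().st_mode)
```

Passing the mode to `os.open` rather than calling `chmod` after a normal `open` avoids a window in which the file exists with default permissions.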

License

MIT -- See LICENSE.
