WET - Web Extended Toolkit MCP Server
mcp-name: io.github.n24q02m/wet-mcp
Open-source MCP Server for web search, content extraction, library docs & multimodal analysis.
Features
- Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with filters, semantic reranking, query expansion, and snippet enrichment
- Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
- Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
- Content Extract -- Clean content extraction (Markdown/Text), structured data extraction (LLM + JSON Schema), batch processing (up to 50 URLs), deep crawling, site mapping
- Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
- Media -- List, download, and analyze images, videos, audio files
- Anti-bot -- Stealth mode bypasses Cloudflare, Medium, LinkedIn, Twitter
- Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere)
- Sync -- Cross-machine sync of indexed docs via rclone (Google Drive, S3, Dropbox)
Quick Start
Claude Code Plugin (Recommended)
claude plugin add n24q02m/wet-mcp
MCP Server
Python 3.13 required -- Python 3.14+ is not supported due to SearXNG incompatibility. You must specify --python 3.13 when using uvx.
On first run, the server automatically installs SearXNG, Playwright chromium, and starts the embedded search engine.
Option 1: uvx
{
"mcpServers": {
"wet": {
"command": "uvx",
"args": ["--python", "3.13", "wet-mcp@latest"],
"env": {
// -- optional: cloud embedding + reranking (Jina AI recommended)
"API_KEYS": "JINA_AI_API_KEY:jina_...",
// -- or: "API_KEYS": "GOOGLE_API_KEY:AIza...,COHERE_API_KEY:co-...",
// -- without API_KEYS, uses built-in local Qwen3 ONNX models (CPU, ~570MB first download)
// -- optional: LiteLLM Proxy (production, self-hosted gateway)
// "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
// "LITELLM_PROXY_KEY": "sk-your-virtual-key",
// -- optional: higher rate limits for docs discovery (60 -> 5000 req/hr)
"GITHUB_TOKEN": "ghp_...",
// -- optional: restrict local file conversion to specific directories
// "CONVERT_ALLOWED_DIRS": "/home/user/docs,/tmp/uploads",
// -- optional: sync indexed docs across machines via rclone
"SYNC_ENABLED": "true", // default: false
"SYNC_INTERVAL": "300" // auto-sync every 5min (0 = manual only)
}
}
}
}
Option 2: Docker
{
"mcpServers": {
"wet": {
"command": "docker",
"args": [
"run", "-i", "--rm",
"--name", "mcp-wet",
"-v", "wet-data:/data",
"-e", "API_KEYS",
"-e", "GITHUB_TOKEN",
"-e", "SYNC_ENABLED",
"-e", "SYNC_INTERVAL",
"n24q02m/wet-mcp:latest"
],
"env": {
"API_KEYS": "JINA_AI_API_KEY:jina_...",
"GITHUB_TOKEN": "ghp_...",
"SYNC_ENABLED": "true",
"SYNC_INTERVAL": "300"
}
}
}
}
Pre-install (optional)
Use the setup MCP tool to warm up models and install dependencies:
# Via MCP tool call (recommended):
setup(action="warmup")
# With cloud embedding configured, warmup validates API keys
# and skips local model download if cloud models are available.
The warmup action pre-downloads SearXNG, Playwright, and the embedding/reranker models (~1.1 GB total) so the first real connection does not time out.
Sync setup
Sync is fully automatic. Just set SYNC_ENABLED=true and the server handles everything:
- First sync: rclone is auto-downloaded and a browser opens for OAuth authentication
- Token saved: the OAuth token is stored locally at ~/.wet-mcp/tokens/ (600 permissions)
- Subsequent runs: the token is loaded automatically -- no manual steps needed
For non-Google Drive providers, set SYNC_PROVIDER and SYNC_REMOTE:
{
"SYNC_ENABLED": "true",
"SYNC_PROVIDER": "dropbox",
"SYNC_REMOTE": "dropbox"
}
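Under the hood, syncing a docs directory to a remote boils down to an rclone invocation. The sketch below shows how such a command could be assembled from SYNC_REMOTE and SYNC_FOLDER; the helper name, default local path, and omitted flags are illustrative, not the server's actual implementation.

```python
def build_sync_command(remote: str, folder: str,
                       local_dir: str = "~/.wet-mcp/docs") -> str:
    """Assemble an rclone sync command (illustrative sketch).

    rclone syncs a local directory to <remote>:<folder>; the real
    server may pass additional flags (filters, bandwidth limits, etc.).
    """
    return " ".join(["rclone", "sync", local_dir, f"{remote}:{folder}"])

# Using the Dropbox settings from the config above:
print(build_sync_command("dropbox", "wet-mcp"))
```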
Tools
| Tool | Actions | Description |
|---|---|---|
| search | search, research, docs, similar | Web search (with filters, reranking, expand/enrich), academic research, library docs (HyDE), find similar |
| extract | extract, batch, crawl, map, convert, extract_structured | Content extraction, batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion, structured data extraction (JSON Schema) |
| media | list, download, analyze | Media discovery, download, and analysis |
| config | status, set, cache_clear, docs_reindex | Server configuration and cache management |
| setup | warmup, setup_sync | Pre-download models, configure cloud sync |
| help | -- | Full documentation for any tool |
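Each tool is dispatched by an action argument, as in the setup(action="warmup") call shown earlier. A minimal sketch of validating such calls against the table above (the mapping comes from the table; the helper itself is hypothetical, not the server's dispatcher):

```python
# Tool -> allowed actions, taken directly from the table above.
TOOL_ACTIONS = {
    "search": {"search", "research", "docs", "similar"},
    "extract": {"extract", "batch", "crawl", "map", "convert", "extract_structured"},
    "media": {"list", "download", "analyze"},
    "config": {"status", "set", "cache_clear", "docs_reindex"},
    "setup": {"warmup", "setup_sync"},
}

def is_valid_call(tool: str, action: str) -> bool:
    """Return True if the (tool, action) pair appears in the table."""
    return action in TOOL_ACTIONS.get(tool, set())
```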
Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| API_KEYS | No | -- | LLM API keys for SDK mode (format: ENV_VAR:key,...). Enables cloud embedding + reranking |
| LITELLM_PROXY_URL | No | -- | LiteLLM Proxy URL. Enables proxy mode |
| LITELLM_PROXY_KEY | No | -- | LiteLLM Proxy virtual key |
| GITHUB_TOKEN | No | auto-detect | GitHub token for docs discovery (60 -> 5000 req/hr). Auto-detected from gh auth token |
| EMBEDDING_BACKEND | No | auto-detect | litellm (cloud) or local (Qwen3). Auto: API_KEYS -> litellm, else local |
| EMBEDDING_MODEL | No | auto-detect | LiteLLM embedding model name |
| EMBEDDING_DIMS | No | 0 (auto=768) | Embedding dimensions |
| RERANK_ENABLED | No | true | Enable reranking after search |
| RERANK_BACKEND | No | auto-detect | litellm or local. Auto: Cohere/Jina key -> litellm, else local |
| RERANK_MODEL | No | auto-detect | LiteLLM rerank model name |
| RERANK_TOP_N | No | 10 | Return top N results after reranking |
| LLM_MODELS | No | gemini/gemini-3-flash-preview | LiteLLM model for media analysis |
| WET_AUTO_SEARXNG | No | true | Auto-start embedded SearXNG subprocess |
| WET_SEARXNG_PORT | No | 41592 | SearXNG port |
| SEARXNG_URL | No | http://localhost:41592 | External SearXNG URL (when auto-start is disabled) |
| SEARXNG_TIMEOUT | No | 30 | SearXNG request timeout in seconds |
| CONVERT_MAX_FILE_SIZE | No | 104857600 | Max file size for local conversion in bytes (100 MB) |
| CONVERT_ALLOWED_DIRS | No | -- | Comma-separated paths to restrict local file conversion |
| CACHE_DIR | No | ~/.wet-mcp | Data directory for cache, docs, downloads |
| DOCS_DB_PATH | No | ~/.wet-mcp/docs.db | Docs database location |
| DOWNLOAD_DIR | No | ~/.wet-mcp/downloads | Media download directory |
| TOOL_TIMEOUT | No | 120 | Tool execution timeout in seconds (0 = no timeout) |
| WET_CACHE | No | true | Enable/disable web cache |
| SYNC_ENABLED | No | false | Enable rclone sync |
| SYNC_PROVIDER | No | drive | rclone provider type (drive, dropbox, s3, etc.) |
| SYNC_REMOTE | No | gdrive | rclone remote name |
| SYNC_FOLDER | No | wet-mcp | Remote folder name |
| SYNC_INTERVAL | No | 300 | Auto-sync interval in seconds (0 = manual) |
| LOG_LEVEL | No | INFO | Logging level |
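The API_KEYS value packs several provider keys into one variable using ENV_VAR:key pairs separated by commas. A minimal sketch of parsing that format (illustrative only; the server's actual parser may handle more edge cases):

```python
def parse_api_keys(value: str) -> dict:
    """Parse 'ENV_VAR:key,ENV_VAR:key,...' into a name -> key mapping.

    partition() splits on the first ':' only, so keys containing
    colons (e.g. some provider tokens) survive intact.
    """
    pairs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        name, _, key = item.partition(":")
        pairs[name] = key
    return pairs
```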
Embedding & Reranking
Both embedding and reranking are always available -- local models are built-in and require no configuration.
- Jina AI (recommended): a single JINA_AI_API_KEY enables both embedding and reranking
- Embedding priority: Jina AI > Gemini > OpenAI > Cohere. Local Qwen3 fallback always available
- Reranking priority: Jina AI > Cohere. Local Qwen3 fallback always available
- GPU auto-detection: CUDA/DirectML auto-detected, uses GGUF models for better performance
- All embeddings stored at 768 dims. Switching providers never breaks the vector table
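Conceptually, reranking scores every candidate document against the query and keeps the best RERANK_TOP_N. The sketch below illustrates that idea with cosine similarity over toy 2-dim vectors (the real server uses 768-dim embeddings and dedicated reranker models, so this is a simplification, not its algorithm):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rerank(query_vec, doc_vecs, top_n=10):
    """Return indices of the top_n documents most similar to the query,
    mirroring what RERANK_TOP_N controls."""
    scored = sorted(enumerate(doc_vecs),
                    key=lambda p: cosine(query_vec, p[1]),
                    reverse=True)
    return [i for i, _ in scored[:top_n]]
```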
LLM Configuration (3-Mode Architecture)
| Priority | Mode | Config | Use case |
|---|---|---|---|
| 1 | Proxy | LITELLM_PROXY_URL + LITELLM_PROXY_KEY | Production (self-hosted gateway) |
| 2 | SDK | API_KEYS | Dev/local with direct API access |
| 3 | Local | Nothing needed | Offline; embedding/rerank only (no LLM) |
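The priority order above can be expressed as a simple cascade over the environment. This is a sketch of the selection logic implied by the table, not the server's actual code:

```python
import os

def select_llm_mode(env=None) -> str:
    """Pick the LLM mode by the table's priority: proxy > SDK > local."""
    env = os.environ if env is None else env
    if env.get("LITELLM_PROXY_URL") and env.get("LITELLM_PROXY_KEY"):
        return "proxy"   # highest priority: self-hosted gateway
    if env.get("API_KEYS"):
        return "sdk"     # direct API access via LiteLLM SDK
    return "local"       # offline: embedding/rerank only, no LLM
```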
SearXNG Configuration (2-Mode)
| Mode | Config | Description |
|---|---|---|
| Embedded (default) | WET_AUTO_SEARXNG=true | Auto-installs and manages SearXNG as a subprocess |
| External | WET_AUTO_SEARXNG=false + SEARXNG_URL=http://host:port | Connects to a pre-existing SearXNG instance |
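Resolving the effective SearXNG endpoint from these variables could look like the sketch below; the helper is hypothetical, but the defaults match the configuration table (port 41592, embedded mode on):

```python
def searxng_url(env) -> str:
    """Resolve the SearXNG base URL from the two-mode config above."""
    if env.get("WET_AUTO_SEARXNG", "true").lower() != "false":
        # Embedded mode: subprocess listens on the local port.
        port = env.get("WET_SEARXNG_PORT", "41592")
        return f"http://localhost:{port}"
    # External mode: point at a pre-existing instance.
    return env.get("SEARXNG_URL", "http://localhost:41592")
```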
Build from Source
git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp
Also by n24q02m
| Server | Description |
|---|---|
| mnemo-mcp | Persistent AI memory with hybrid search and cross-machine sync |
| better-notion-mcp | Markdown-first Notion API with 9 composite tools |
| better-email-mcp | Email (IMAP/SMTP) with multi-account and auto-discovery |
| better-godot-mcp | Godot Engine 4.x with 18 tools for scenes, scripts, and shaders |
| better-telegram-mcp | Telegram dual-mode (Bot API + MTProto) with 6 composite tools |
| better-code-review-graph | Knowledge graph for token-efficient code reviews |
Contributing
See CONTRIBUTING.md.
License
MIT -- See LICENSE.