Comprehensive MCP server for web research with 13 tools, 4 resources, and 5 prompts: search, crawl, package info, GitHub stats, error translation, API docs discovery, and more.

Maintainers

elad12390

Project description

Web Research Assistant MCP Server

A Model Context Protocol (MCP) server that provides web research and discovery capabilities. It includes 13 tools, 4 resources, and 5 prompts for searching, crawling, and analyzing web content, powered by your local Docker SearXNG instance, the crawl4ai project, and the Pixabay API. The tools are:

web_search — federated search across multiple engines via SearXNG
search_examples — find code examples, tutorials, and articles (defaults to recent content)
search_images — find high-quality stock photos, illustrations, and vectors via Pixabay
crawl_url — full page content extraction with advanced crawling
package_info — detailed package metadata from npm, PyPI, crates.io, and Go modules
package_search — discover packages by keywords and functionality
github_repo — repository health metrics and development activity
translate_error — find solutions for error messages and stack traces from Stack Overflow (auto-detects CORS, fetch, and web errors)
api_docs — auto-discover and crawl official API documentation with examples (works for any API - no hardcoded URLs)
extract_data — extract structured data (tables, lists, fields, JSON-LD) from web pages with automatic detection
compare_tech — compare technologies side-by-side with NPM downloads, GitHub stars, and aspect analysis (React vs Vue, PostgreSQL vs MongoDB, etc.)
get_changelog — NEW! Get release notes and changelogs with breaking change detection (upgrade safely from version X to Y)
check_service_status — NEW! Instant health checks for 25+ services (Stripe, AWS, GitHub, OpenAI, etc.) - "Is it down or just me?"

All tools feature comprehensive error handling, response size limits, usage tracking, and clear documentation for optimal AI agent integration.

MCP Resources (Direct Data Lookups)

package://{registry}/{name} - Package info from npm, PyPI, crates.io, or Go modules
github://{owner}/{repo} - Repository information and health metrics
status://{service} - Service health status for any service covered by check_service_status
changelog://{registry}/{package} - Package release notes and changelogs

MCP Prompts (Reusable Workflows)

research_package - Comprehensive package evaluation
debug_error - Structured error debugging with solutions
compare_technologies - Side-by-side technology comparison
evaluate_repository - GitHub repository health assessment
check_service_health - Multi-service status monitoring

Quick Start

Set up SearXNG (5 minutes):
```
# Using Docker (recommended)
docker run -d -p 2288:8080 searxng/searxng:latest
```
Then configure search engines - see SEARXNG_SETUP.md for optimized settings.

Install the MCP server:

uvx web-research-assistant  # or: pip install web-research-assistant

Configure Claude Desktop - add to claude_desktop_config.json:

{
  "mcpServers": {
    "web-research-assistant": {
      "command": "uvx",
      "args": ["web-research-assistant"]
    }
  }
}

Restart Claude Desktop and start researching!

⚠️ For best results: Configure SearXNG with GitHub, Stack Overflow, and other code-focused search engines. See SEARXNG_SETUP.md for the recommended configuration.

Prerequisites

Required

Python 3.10+
A running SearXNG instance on http://localhost:2288
- 📖 See SEARXNG_SETUP.md for complete Docker setup guide
- ⚠️ IMPORTANT: For best results, enable these search engines in SearXNG:
  - GitHub, Stack Overflow, GitLab (for code search - critical!)
  - DuckDuckGo, Brave (for web search)
  - MDN, Wikipedia (for documentation)
  - Reddit, HackerNews (for tutorials and discussions)
  - See SEARXNG_SETUP.md for the full optimized configuration
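
Before wiring the server into a client, it can help to confirm that SearXNG answers at the expected address. The following is a minimal sketch, not code from this package: it builds a query URL for the `/search` endpoint and fetches it with the standard library. It assumes your instance has JSON output enabled (`format=json`); if it does not, SearXNG typically answers with a 403. The helper names `build_search_url` and `probe` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Default endpoint from this README; adjust if your instance lives elsewhere.
SEARCH_ENDPOINT = "http://localhost:2288/search"


def build_search_url(query: str, category: str = "general",
                     endpoint: str = SEARCH_ENDPOINT) -> str:
    """Return a SearXNG search URL that asks for JSON output."""
    params = {"q": query, "categories": category, "format": "json"}
    return f"{endpoint}?{urllib.parse.urlencode(params)}"


def probe(query: str = "python", timeout: float = 5.0) -> int:
    """Run one search and return the number of results (needs a live instance)."""
    with urllib.request.urlopen(build_search_url(query), timeout=timeout) as resp:
        payload = json.load(resp)
    return len(payload.get("results", []))


if __name__ == "__main__":
    # Only prints the URL; call probe() yourself once SearXNG is running.
    print(build_search_url("model context protocol"))
```

If `probe()` returns zero results for a common query, the search engines listed above are probably not enabled yet.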

Optional

Pixabay API key for image search (a free key is available at pixabay.com/api/docs)
Playwright browsers for advanced crawling (auto-installed with crawl4ai-setup)

Developer Setup (if running from source)

pip install uv       # if you do not already have uv
uv sync              # creates the virtual environment
uv run crawl4ai-setup  # installs Chromium for crawl4ai

You can also use pip install -r requirements.txt if you prefer pip over uv.

Installation

Option 1: Using uvx (Recommended - No installation needed!)

uvx web-research-assistant

This runs the server directly from PyPI without installing it globally.

Option 2: Install with pip

pip install web-research-assistant
web-research-assistant

Option 3: Install with uv

uv tool install web-research-assistant
web-research-assistant

By default the server communicates over stdio, which makes it easy to wire into Claude Desktop or any other MCP host.
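
To make the stdio transport concrete: MCP clients and servers exchange JSON-RPC 2.0 messages, one JSON object per line. The sketch below frames the `initialize` request a client sends first. It illustrates the wire format only and is not taken from this project; the protocol version string and the client name are assumptions you should replace with whatever your client actually supports.

```python
import json


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the output stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def initialize_request(request_id: int = 1) -> dict:
    """Build the first request an MCP client sends after starting the server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; use what your client supports
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},  # hypothetical
        },
    }


if __name__ == "__main__":
    print(frame(initialize_request()).decode("utf-8"), end="")
```

In practice an MCP host such as Claude Desktop does this for you; the sketch is mainly useful when debugging the server by hand.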

MCP Client Configuration

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (the macOS location):

Option 1: Using uvx (Recommended - No installation needed!)

{
  "mcpServers": {
    "web-research-assistant": {
      "command": "uvx",
      "args": ["web-research-assistant"]
    }
  }
}

Option 2: Using installed package

{
  "mcpServers": {
    "web-research-assistant": {
      "command": "web-research-assistant"
    }
  }
}
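
If you already have other servers in `claude_desktop_config.json`, editing the file by hand risks clobbering them. The sketch below shows one way to merge the entry programmatically while leaving existing entries alone. The function names are hypothetical and nothing here ships with the package; it simply reproduces the two JSON shapes shown above.

```python
import json
from pathlib import Path
from typing import Optional

SERVER_NAME = "web-research-assistant"


def add_server(config: dict, command: str = "uvx",
               args: Optional[list] = None) -> dict:
    """Return a copy of `config` with the server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    entry: dict = {"command": command}
    if args is None:
        # uvx needs the package name as its argument; an installed script needs none.
        args = [SERVER_NAME] if command == "uvx" else []
    if args:
        entry["args"] = list(args)
    servers[SERVER_NAME] = entry
    merged["mcpServers"] = servers
    return merged


def update_file(path: Path) -> None:
    """Read the config file (if any), merge the entry, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_server(current), indent=2) + "\n")
```

Calling `add_server(config, command="web-research-assistant")` yields the "installed package" variant instead of the uvx one.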

OpenCode

Add to ~/.config/opencode/opencode.json:

Using uvx (Recommended)

{
  "mcp": {
    "web-research-assistant": {
      "type": "local",
      "command": ["uvx", "web-research-assistant"],
      "enabled": true
    }
  }
}

Using installed package

{
  "mcp": {
    "web-research-assistant": {
      "type": "local",
      "command": ["web-research-assistant"],
      "enabled": true
    }
  }
}

Development (Running from source)

For Claude Desktop:

{
  "mcpServers": {
    "web-research-assistant": {
      "command": "uv",
      "args": [
        "--directory",
        "/ABSOLUTE/PATH/TO/web-research-assistant",
        "run",
        "web-research-assistant"
      ]
    }
  }
}

For OpenCode:

{
  "mcp": {
    "web-research-assistant": {
      "type": "local",
      "command": [
        "uv",
        "--directory",
        "/ABSOLUTE/PATH/TO/web-research-assistant",
        "run",
        "web-research-assistant"
      ],
      "enabled": true
    }
  }
}
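
The two development configurations above differ only in shape: Claude Desktop splits the command from its arguments, while OpenCode takes the whole command as one list. As an illustration, the following sketch derives both entries from a single checkout path. The helper names are hypothetical, and the absolute-path check assumes POSIX-style paths.

```python
from pathlib import PurePosixPath

SERVER_NAME = "web-research-assistant"


def dev_command(project_dir: str) -> list:
    """Command line that runs the server from a source checkout via uv."""
    if not PurePosixPath(project_dir).is_absolute():
        raise ValueError("project_dir must be an absolute path")
    return ["uv", "--directory", project_dir, "run", SERVER_NAME]


def claude_desktop_entry(project_dir: str) -> dict:
    """Entry for the `mcpServers` object: command and args are separate."""
    command, *args = dev_command(project_dir)
    return {"command": command, "args": args}


def opencode_entry(project_dir: str) -> dict:
    """Entry for the `mcp` object: the whole command is one list."""
    return {"type": "local", "command": dev_command(project_dir), "enabled": True}
```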

Restart your MCP client afterwards. The MCP tools will be available immediately.

Tool behavior

Tool	When to use	Arguments
`web_search`	Use first to gather recent information and URLs from SearXNG. Returns 1–10 ranked snippets with clickable URLs.	`query` (required), `reasoning` (required), optional `category` (defaults to `general`), and `max_results` (defaults to 5).
`search_examples`	Find code examples, tutorials, and technical articles. Optimized for technical content with optional time filtering. Perfect for learning APIs or finding usage patterns.	`query` (required, e.g., "Python async examples"), `reasoning` (required), `content_type` (code/articles/both, defaults to both), `time_range` (day/week/month/year/all, defaults to all), optional `max_results` (defaults to 5).
`search_images`	Find high-quality royalty-free stock images from Pixabay. Returns photos, illustrations, or vectors. Requires `PIXABAY_API_KEY` environment variable.	`query` (required, e.g., "mountain landscape"), `reasoning` (required), `image_type` (all/photo/illustration/vector, defaults to all), `orientation` (all/horizontal/vertical, defaults to all), optional `max_results` (defaults to 10).
`crawl_url`	Call immediately after search when you need the actual article body for quoting, summarizing, or extracting data.	`url` (required), `reasoning` (required), optional `max_chars` (defaults to 8000 characters).
`package_info`	Look up specific npm, PyPI, crates.io, or Go package metadata including version, downloads, license, and dependencies. Use when you know the package name.	`name` (required package name), `reasoning` (required), `registry` (npm/pypi/crates/go, defaults to npm).
`package_search`	Search for packages by keywords or functionality (e.g., "web framework", "json parser"). Use when you need to find packages that solve a specific problem.	`query` (required search terms), `reasoning` (required), `registry` (npm/pypi/crates/go, defaults to npm), optional `max_results` (defaults to 5).
`github_repo`	Get GitHub repository health metrics including stars, forks, issues, recent commits, and project details. Use when evaluating open source projects.	`repo` (required, owner/repo or full URL), `reasoning` (required), optional `include_commits` (defaults to true).
`translate_error`	Find Stack Overflow solutions for error messages and stack traces. Auto-detects language/framework, extracts key terms (CORS, map, undefined, etc.), filters irrelevant results, and prioritizes Stack Overflow solutions. Handles web-specific errors (CORS, fetch).	`error_message` (required stack trace or error text), `reasoning` (required), optional `language` (auto-detected), optional `framework` (auto-detected), optional `max_results` (defaults to 5).
`api_docs`	Auto-discover and crawl official API documentation. Dynamically finds docs URLs using patterns (docs.{api}.com, {api}.com/docs, etc.), searches for specific topics, crawls pages, and extracts overview, parameters, examples, and related links. Works for ANY API - no hardcoded URLs. Perfect for API integration and learning.	`api_name` (required, e.g., "stripe", "react"), `topic` (required, e.g., "create customer", "hooks"), `reasoning` (required), optional `max_results` (defaults to 2 pages).
`extract_data`	Extract structured data from HTML pages. Supports tables, lists, fields (via CSS selectors), JSON-LD, and auto-detection. Returns clean JSON output. More efficient than parsing full page text. Perfect for scraping pricing tables, package specs, release notes, or any structured content.	`url` (required), `reasoning` (required), `extract_type` (table/list/fields/json-ld/auto, defaults to auto), optional `selectors` (CSS selectors for fields mode), optional `max_items` (defaults to 100).
`compare_tech`	Compare 2-5 technologies side-by-side. Auto-detects category (framework/database/language) and gathers data from NPM, GitHub, and web search. Returns structured comparison with popularity metrics (downloads, stars), performance insights, and best-use summaries. Fast parallel processing (3-4s).	`technologies` (required list of 2-5 names), `reasoning` (required), optional `category` (auto-detects if not provided), optional `aspects` (auto-selected by category), optional `max_results_per_tech` (defaults to 3).
`get_changelog`	NEW! Get release notes and changelogs for package upgrades. Fetches GitHub releases, highlights breaking changes, and provides upgrade recommendations. Answers "What changed in version X → Y?" and "Are there breaking changes?" Perfect for planning dependency updates.	`package` (required name), `reasoning` (required), optional `registry` (npm/pypi/auto, defaults to auto), optional `max_releases` (defaults to 5).
`check_service_status`	NEW! Instantly check if external services are experiencing issues. Covers 25+ popular services (Stripe, AWS, GitHub, OpenAI, Vercel, etc.). Returns operational status, current incidents, and component health. Critical for production debugging - know immediately if the issue is external. Response time < 2s.	`service` (required name, e.g., "stripe", "aws"), `reasoning` (required).
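
For readers wiring up their own client, here is a rough sketch of what a `web_search` invocation looks like as a JSON-RPC `tools/call` request. The argument names come from the table above; the client-side validation and clamping are assumptions added for illustration, and the server may enforce its limits differently.

```python
MAX_RESULTS_CAP = 10  # mirrors the documented SEARXNG_MAX_RESULTS default


def web_search_call(query: str, reasoning: str, category: str = "general",
                    max_results: int = 5, request_id: int = 1) -> dict:
    """Build a JSON-RPC `tools/call` request for the `web_search` tool."""
    if not query.strip() or not reasoning.strip():
        raise ValueError("query and reasoning are both required")
    # Clamp client-side to the 1-10 range of snippets the tool returns.
    max_results = max(1, min(max_results, MAX_RESULTS_CAP))
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "web_search",
            "arguments": {
                "query": query,
                "reasoning": reasoning,
                "category": category,
                "max_results": max_results,
            },
        },
    }
```

The same envelope works for the other tools: change `name` and supply that tool's arguments, always including `reasoning`.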

Results are automatically trimmed to 8,000 characters by default (configurable through `MCP_MAX_RESPONSE_CHARS`) so they stay well within MCP response expectations. If truncation happens, the text ends with a note reminding the model that more detail is available on request.
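
The trimming behavior can be pictured with a small sketch. This is not the server's actual implementation: the wording of the note and the function name `clamp_text` are hypothetical, and only the 8000-character default is taken from the documented `MCP_MAX_RESPONSE_CHARS` setting.

```python
DEFAULT_LIMIT = 8000  # documented default of MCP_MAX_RESPONSE_CHARS
NOTE = "\n\n[Truncated: more detail is available on request.]"  # hypothetical wording


def clamp_text(text: str, limit: int = DEFAULT_LIMIT) -> str:
    """Trim `text` to at most `limit` characters, ending with a note if cut."""
    if len(text) <= limit:
        return text
    if limit <= len(NOTE):
        return text[:limit]  # budget too small to fit the note at all
    keep = limit - len(NOTE)
    # Reserve room for the note so the final string never exceeds the limit.
    return text[:keep].rstrip() + NOTE
```

Reserving space for the note inside the budget, rather than appending it afterwards, keeps the reply within the limit even when truncation occurs.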

Resources

MCP Resources provide direct data access via URI templates - perfect for quick lookups without tool calls.

Resource URI	Description	Example
`package://{registry}/{name}`	Package metadata (version, downloads, license, dependencies)	`package://npm/express`
`github://{owner}/{repo}`	Repository info (stars, forks, issues, activity)	`github://facebook/react`
`status://{service}`	Service health status	`status://stripe`
`changelog://{registry}/{package}`	Release notes and changelogs	`changelog://npm/typescript`
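
To show how the URI templates above decompose, here is a minimal parsing sketch. It is an illustration under assumptions rather than the server's own routing code: the `ResourceRef` type and `parse_resource_uri` function are hypothetical, and whether the server accepts names containing further slashes (such as scoped npm packages) is not stated in this README.

```python
from typing import NamedTuple

# Number of segments each scheme expects, per the templates in the table:
# package/changelog: registry + name; github: owner + repo; status: service.
SCHEMES = {"package": 2, "github": 2, "status": 1, "changelog": 2}


class ResourceRef(NamedTuple):
    scheme: str
    parts: tuple


def parse_resource_uri(uri: str) -> ResourceRef:
    """Split a resource URI such as `package://npm/express` into its parts."""
    scheme, sep, rest = uri.partition("://")
    if not sep or scheme not in SCHEMES:
        raise ValueError(f"unsupported resource URI: {uri!r}")
    # maxsplit keeps any further slashes inside the last segment.
    parts = tuple(rest.split("/", SCHEMES[scheme] - 1))
    if len(parts) != SCHEMES[scheme] or not all(parts):
        raise ValueError(f"malformed resource URI: {uri!r}")
    return ResourceRef(scheme, parts)
```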

Prompts

MCP Prompts are reusable message templates that guide AI assistants through common workflows.

Prompt	Arguments	Use Case
`research_package`	`package_name`, `registry`	Evaluate a package before adding it as a dependency
`debug_error`	`error_message`, `language` (optional), `framework` (optional)	Debug an error with context and solutions
`compare_technologies`	`tech1`, `tech2`, `tech3` (optional), `tech4` (optional), `tech5` (optional)	Compare frameworks, databases, or languages
`evaluate_repository`	`owner`, `repo`	Assess a GitHub project's health and activity
`check_service_health`	`services` (comma-separated)	Monitor multiple services at once
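
Because `compare_technologies` takes numbered arguments rather than a list, a client usually has to map its own list onto `tech1` through `tech5`. The sketch below shows one way to do that and to wrap the result in a JSON-RPC `prompts/get` request. Both helper names are hypothetical, and the 2 to 5 bound is inferred from the two required and three optional arguments listed above.

```python
def compare_arguments(technologies: list) -> dict:
    """Map a list of 2-5 names onto the tech1..tech5 prompt arguments."""
    names = [name.strip() for name in technologies if name.strip()]
    if not 2 <= len(names) <= 5:
        raise ValueError("compare_technologies takes between 2 and 5 technologies")
    return {f"tech{i}": name for i, name in enumerate(names, start=1)}


def prompt_request(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC `prompts/get` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "prompts/get",
        "params": {"name": name, "arguments": arguments},
    }
```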

Configuration

Environment variables let you adapt the server without touching code:

Variable	Default	Description
`SEARXNG_BASE_URL`	`http://localhost:2288/search`	Endpoint queried by `web_search`.
`SEARXNG_DEFAULT_CATEGORY`	`general`	Category used when none is provided.
`SEARXNG_DEFAULT_RESULTS`	`5`	Default number of search hits.
`SEARXNG_MAX_RESULTS`	`10`	Hard cap on hits per request.
`SEARXNG_CRAWL_MAX_CHARS`	`8000`	Default character budget for `crawl_url`.
`MCP_MAX_RESPONSE_CHARS`	`8000`	Overall response limit applied to every tool reply.
`SEARXNG_MCP_USER_AGENT`	`web-research-assistant/0.1`	User-Agent header for outbound HTTP calls.
`PIXABAY_API_KEY`	(empty)	API key for Pixabay image search. Get free key at pixabay.com/api/docs.
`MCP_USAGE_LOG`	`~/.config/web-research-assistant/usage.json`	Location for usage analytics data.
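
As an illustration of how these variables might be consumed, the sketch below reads a subset of them into a frozen dataclass with the defaults from the table. It is not the project's `config.py`; the `Settings` class is hypothetical, and capping the default result count at the hard limit is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    base_url: str
    default_category: str
    default_results: int
    max_results: int
    crawl_max_chars: int
    max_response_chars: int
    pixabay_api_key: str

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        """Read the documented variables, falling back to the documented defaults."""
        max_results = int(env.get("SEARXNG_MAX_RESULTS", "10"))
        default_results = int(env.get("SEARXNG_DEFAULT_RESULTS", "5"))
        return cls(
            base_url=env.get("SEARXNG_BASE_URL", "http://localhost:2288/search"),
            default_category=env.get("SEARXNG_DEFAULT_CATEGORY", "general"),
            # Assumption: the default should never exceed the hard cap.
            default_results=min(default_results, max_results),
            max_results=max_results,
            crawl_max_chars=int(env.get("SEARXNG_CRAWL_MAX_CHARS", "8000")),
            max_response_chars=int(env.get("MCP_MAX_RESPONSE_CHARS", "8000")),
            pixabay_api_key=env.get("PIXABAY_API_KEY", ""),
        )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `Settings.from_env({"SEARXNG_MAX_RESULTS": "3"})`.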

Development

The codebase is intentionally modular and organized:

web-research-assistant/
├── src/searxng_mcp/     # Source code
│   ├── config.py        # Configuration and environment
│   ├── search.py        # SearXNG integration
│   ├── crawler.py       # Crawl4AI wrapper
│   ├── images.py        # Pixabay client
│   ├── registry.py      # Package registries (npm, PyPI, crates, Go)
│   ├── github.py        # GitHub API client
│   ├── errors.py        # Error parser (language/framework detection)
│   ├── api_docs.py      # API docs discovery (NO hardcoded URLs)
│   ├── tracking.py      # Usage analytics
│   └── server.py        # MCP server + 13 tools
├── docs/                # Documentation (27 files)
└── [config files]

Each module is well under 400 lines, making the codebase easy to understand and extend.

Usage Analytics

All tools automatically track usage metrics including:

Tool invocation counts and success rates
Response times and performance trends
Common use case patterns (via the reasoning parameter)
Error frequencies and types

Analytics data is stored in ~/.config/web-research-assistant/usage.json and can be analyzed to optimize tool usage and identify patterns. Each tool requires a reasoning parameter that helps categorize why tools are being used, enabling better analytics and insights.

Note: As of the latest update, the reasoning parameter is required for all tools (previously optional with defaults). This ensures meaningful analytics data collection.
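
The layout of `usage.json` is not described here, so the following is only a sketch of the kind of aggregation the metrics above imply. It assumes each record is a dict with hypothetical `tool`, `success`, and `duration_ms` keys; adapt the field names to whatever the file actually contains.

```python
from collections import defaultdict


def summarize(records: list) -> dict:
    """Aggregate call count, success rate, and mean latency per tool."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["tool"]].append(record)  # assumed key: "tool"
    summary = {}
    for tool, calls in grouped.items():
        successes = sum(1 for call in calls if call["success"])  # assumed key
        total_ms = sum(call["duration_ms"] for call in calls)    # assumed key
        summary[tool] = {
            "calls": len(calls),
            "success_rate": successes / len(calls),
            "mean_ms": total_ms / len(calls),
        }
    return summary
```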

Documentation

Comprehensive documentation is available in the docs/ directory:

Project Status - Current status, metrics, roadmap
API Docs Implementation - NEW tool documentation
Error Translator Design - Error translator details
Tool Ideas Ranked - Prioritization and progress
SearXNG Configuration - Recommended setup
Quick Start Examples - Usage examples

See the docs README for a complete index.

Release history

0.3.0 (this version) - Dec 3, 2025
0.2.0 - Dec 3, 2025
0.1.0 - Nov 16, 2025

Download files

Source Distribution

web_research_assistant-0.3.0.tar.gz (64.4 kB), uploaded Dec 3, 2025

Built Distribution

web_research_assistant-0.3.0-py3-none-any.whl (60.8 kB), uploaded Dec 3, 2025, Python 3

File details

Details for the file web_research_assistant-0.3.0.tar.gz.

File metadata

Download URL: web_research_assistant-0.3.0.tar.gz
Upload date: Dec 3, 2025
Size: 64.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for web_research_assistant-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`7461c2088f7cd9ec41b0ba36bf713a780b7c01955a4988fcb3b74a584b2f4dda`
MD5	`3a07d86d079a746ec37833fec0427140`
BLAKE2b-256	`f0aec5a91386f6881b806e81d6c2241c230e2dab0efa0c713c7be91dfbe7a57a`
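
To check a downloaded file against the published digest, a short standard-library sketch is enough. The function names are illustrative; the expected value is the SHA256 entry from the table above for the file you downloaded.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches(Path("web_research_assistant-0.3.0.tar.gz"), "7461c2088f7cd9ec41b0ba36bf713a780b7c01955a4988fcb3b74a584b2f4dda")` should hold for an intact copy of the source distribution.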

Provenance

The following attestation bundles were made for web_research_assistant-0.3.0.tar.gz:

Publisher: publish.yml on elad12390/web-research-assistant

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: web_research_assistant-0.3.0.tar.gz
- Subject digest: 7461c2088f7cd9ec41b0ba36bf713a780b7c01955a4988fcb3b74a584b2f4dda
- Sigstore transparency entry: 737489961
- Sigstore integration time: Dec 3, 2025
Source repository:
- Permalink: elad12390/web-research-assistant@36aa401f54ae0964d407f95c5504db6d1b25eb47
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/elad12390
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@36aa401f54ae0964d407f95c5504db6d1b25eb47
- Trigger Event: release

File details

Details for the file web_research_assistant-0.3.0-py3-none-any.whl.

File metadata

Download URL: web_research_assistant-0.3.0-py3-none-any.whl
Upload date: Dec 3, 2025
Size: 60.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for web_research_assistant-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2df8b50ce48c5ee1fee8cac16ae9fa68862dc0ac82444b293561293ddda861e6`
MD5	`6b4aa7cab8752f80bb23978e45d7ccb1`
BLAKE2b-256	`9f9223eea1591323ff2d1faeac2cc2175fab0fbf328ca163d328f8c59015d8e0`

Provenance

The following attestation bundles were made for web_research_assistant-0.3.0-py3-none-any.whl:

Publisher: publish.yml on elad12390/web-research-assistant

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: web_research_assistant-0.3.0-py3-none-any.whl
- Subject digest: 2df8b50ce48c5ee1fee8cac16ae9fa68862dc0ac82444b293561293ddda861e6
- Sigstore transparency entry: 737489963
- Sigstore integration time: Dec 3, 2025
Source repository:
- Permalink: elad12390/web-research-assistant@36aa401f54ae0964d407f95c5504db6d1b25eb47
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/elad12390
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@36aa401f54ae0964d407f95c5504db6d1b25eb47
- Trigger Event: release

web-research-assistant 0.3.0
